Generating Data for Testing
September 1st, 2008 4:15 pm
Whatever method is chosen to generate the data, it is important to ensure that the generated data is correctly structured and distributed. If the generated data has a normal distribution and the real data does not, any query performance tests will be useless. It is just as important that the generated data preserves the table-to-table ratios of the real data. In a banking data warehouse application, for example, there should be the right ratio of transactions to accounts and the right ratio of accounts to customers. If these ratios are wrong, the query execution plans are likely to differ from those in the live system, and any query performance testing may prove to be of no use.
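
As a rough illustration of the two points above, here is a minimal Python sketch that generates banking-style rows with fixed table-to-table ratios and a skewed (non-normal) spread of activity. Everything in it is an assumption made for the example: the function name generate_banking_data, the ratio parameters, and the Zipf-like weighting, which simply stands in for whatever distribution the live data actually has. In practice the ratios and the skew should be measured from the live system first.

```python
import random


def generate_banking_data(n_customers, accounts_per_customer,
                          transactions_per_account, skew=1.2, seed=42):
    """Generate customers, accounts and transactions with fixed ratios.

    Hypothetical helper for illustration only. Row counts follow the
    requested table-to-table ratios exactly, while ownership and activity
    are skewed so that a few customers and accounts dominate.
    """
    rng = random.Random(seed)  # fixed seed so test runs are repeatable

    n_accounts = round(n_customers * accounts_per_customer)
    n_transactions = round(n_accounts * transactions_per_account)

    customers = list(range(1, n_customers + 1))

    # Zipf-like weights (assumed shape): the item at rank r gets 1 / r**skew.
    customer_weights = [1 / (r ** skew) for r in range(1, n_customers + 1)]
    owners = rng.choices(customers, weights=customer_weights, k=n_accounts)
    # Each account row is (account_id, customer_id).
    accounts = [(acc_id, owner) for acc_id, owner in enumerate(owners, start=1)]

    account_ids = [acc_id for acc_id, _ in accounts]
    account_weights = [1 / (r ** skew) for r in range(1, n_accounts + 1)]
    used = rng.choices(account_ids, weights=account_weights, k=n_transactions)
    # Each transaction row is (transaction_id, account_id).
    transactions = [(txn_id, acc_id) for txn_id, acc_id in enumerate(used, start=1)]

    return customers, accounts, transactions
```

Calling generate_banking_data(100, 2.0, 10.0) yields 100 customers, 200 accounts and 2000 transactions, so the ratios hold exactly, while most of the transactions land on a small number of accounts. Note that with this simple weighting some customers may end up with no accounts at all; if the live system never allows that, the generator would need to assign each customer one account before distributing the rest.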
