The data transfer process is, of course, fraught with dangers. Data can be delayed because of late processing on the source system. This late processing can be due to system problems, increased data loads on peak days, or a host of other reasons. Data can also be delayed because of breaks or faults on the network. Whatever the reason, delayed data is a serious risk to the data warehouse. If a key piece of daily data does not reach the data warehouse until the next day, how is this to be handled? If some data is missing, should the data that did arrive be made visible to the users? Does that data get rolled into the aggregations, or do you wait until the rest of the data is available before updating the aggregations? These are crucial and difficult questions to answer. There are no stock answers, because the effect of missing data will depend on the business and the purpose of the data warehouse.
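One way to make these questions concrete is to write the chosen answer down as an explicit policy that the load job consults before it touches the aggregations. The sketch below is only an illustration under stated assumptions: the names `LatePolicy`, `LoadDecision`, and `decide_load`, and the feed names in the usage lines, are hypothetical and do not come from any particular warehouse product. It compares the feeds expected for a day with the feeds that actually arrived, then either holds the aggregation or lets it run and marks the result as incomplete.

```python
from dataclasses import dataclass, field
from enum import Enum


class LatePolicy(Enum):
    """Hypothetical policies for a day on which some source data is late."""
    WAIT = "wait"        # hold the aggregations until every feed has arrived
    PARTIAL = "partial"  # aggregate what is available and flag it as incomplete


@dataclass
class LoadDecision:
    aggregate: bool      # should the aggregations be updated now?
    complete: bool       # did every expected feed arrive?
    missing: list = field(default_factory=list)  # feeds still outstanding


def decide_load(expected_feeds, arrived_feeds, policy):
    """Decide whether to update aggregations given which feeds have arrived."""
    missing = sorted(set(expected_feeds) - set(arrived_feeds))
    if not missing:
        # Everything is here, so the policy does not matter.
        return LoadDecision(aggregate=True, complete=True, missing=[])
    if policy is LatePolicy.WAIT:
        # Leave the aggregations untouched until the late data turns up.
        return LoadDecision(aggregate=False, complete=False, missing=missing)
    # PARTIAL: update now, but record that the figures are incomplete.
    return LoadDecision(aggregate=True, complete=False, missing=missing)


# Example: the (hypothetical) "stock" feed is late on a given day.
held = decide_load(["sales", "stock"], ["sales"], LatePolicy.WAIT)
partial = decide_load(["sales", "stock"], ["sales"], LatePolicy.PARTIAL)
```

Which policy is right remains a business decision. The point of the sketch is only that the decision is made once, stated explicitly, and reported alongside the data, so that users can tell when the figures they see are incomplete.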
Dangers Involved in a Data Transfer Process
August 27th, 2008 9:25 pm
