Dumpster diving for data

The other day across the street at the gas station there was a bear in the dumpster. This bear was definitely dumpster diving in the hopes of finding a few nuggets of great stuff to eat prior to the "big sleep." Everyone was clapping to try to get the bear to leave, and eventually the bear left with a bag of what he hoped was great food.

Dumpster diving for data means the data management team (i.e., the data quality team) dives into the data and looks for records that could have one of these issues:

  1. A child record that has no parent. This usually happens because referential integrity is turned off and foreign keys are not checked when a new record is inserted, which seems to be a common practice in some data warehousing environments. The parent-to-child matching then becomes the burden of the ETL tool. Ask yourself this: is all the ETL you have in your libraries correct? If not, then TAKE A DIVE into your data and see how you stack up.
  2. Are all the not-null attributes really not null? The other day I found some quantities that had been set to 0 in some fact tables. Unless the front end excludes them from averages, these zeros are counted and pull the averages down. If your data could possibly have this issue, then TAKE A DIVE into the data with your profiling tool and see if this is something you want to monitor AUTOMATICALLY in the future.
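If you want a feel for what these two dives look like before you reach for a profiling tool, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for a warehouse. The tables `customer` and `sales_fact`, their columns, and the helper functions `find_orphans` and `zero_share` are all hypothetical names made up for illustration; they are not taken from any particular tool or schema.

```python
import sqlite3


def find_orphans(conn, child, fk, parent, pk):
    """Return child rows whose foreign key value matches no parent row.

    Table and column names are interpolated into the SQL, so they must
    come from trusted code, never from user input.
    """
    sql = (
        f"SELECT c.* FROM {child} AS c "
        f"LEFT JOIN {parent} AS p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL"
    )
    return conn.execute(sql).fetchall()


def zero_share(conn, table, column):
    """Return (zero_count, total_rows) for a numeric column."""
    sql = f"SELECT COALESCE(SUM({column} = 0), 0), COUNT(*) FROM {table}"
    return tuple(conn.execute(sql).fetchone())


# Hypothetical demo data: no foreign key is declared, so nothing stops
# the load from inserting a sale for a customer that does not exist.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE sales_fact (
        sale_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        quantity INTEGER NOT NULL
    );
    INSERT INTO customer VALUES (1), (2);
    INSERT INTO sales_fact VALUES (1, 1, 5), (2, 2, 0), (3, 99, 3), (4, 1, 0);
    """
)

# Dive 1: children without a parent.
orphans = find_orphans(conn, "sales_fact", "customer_id", "customer", "customer_id")

# Dive 2: NOT NULL quantities that are really zeros.
zeros, total = zero_share(conn, "sales_fact", "quantity")

# Compare the average with and without the zero rows.
avg_all, avg_nonzero = conn.execute(
    "SELECT AVG(quantity), AVG(NULLIF(quantity, 0)) FROM sales_fact"
).fetchone()

print("orphan rows:", orphans)
print(f"zero quantities: {zeros} of {total}")
print("average with zeros:", avg_all, "| without zeros:", avg_nonzero)
```

Whether a zero is a real measurement or a placeholder is a business question the query cannot answer; the sketch only shows how many there are and how much they move the average. Checks like these could be scheduled after each load if you decide the issue is worth monitoring.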

Dumpster diving into your own data may not be optional. BE THE BEAR!

About Author

Joyce Norris-Montanari

President of DBTech Solutions, Inc

Joyce Norris-Montanari, CBIP-CDMP, is president of DBTech Solutions, Inc. Joyce advises clients on all aspects of architectural integration, business intelligence and data management. Joyce advises clients about technology, including tools like ETL, profiling, database, quality and metadata. Joyce speaks frequently at data warehouse conferences and is a contributor to several trade publications. She co-authored Data Warehousing and E-Business (Wiley & Sons) with William H. Inmon and others. Joyce has managed and implemented data integrations, data warehouses and operational data stores in industries like education, pharmaceutical, restaurants, telecommunications, government, health care, financial, oil and gas, insurance, research and development and retail. She can be reached at jmontanari@earthlink.net.
