The First Law of Data Quality explained the importance of understanding your data usage, which is essential to the proper preparation required before launching your data quality initiative.
The Second Law of Data Quality explained the need for maintaining your data quality inertia, which means a successful data quality initiative requires a program – and not a one-time project.
The Third Law of Data Quality explained that a fundamental root cause of data defects is assuming data quality is someone else’s responsibility, which is why data quality is everyone’s responsibility.
The Fourth Law of Data Quality explained that data quality standards must cover both objective data quality and subjective information quality.
The Fifth Law of Data Quality explained that a solid data quality foundation enables all enterprise information initiatives to deliver data-driven solutions to business problems.
The Sixth Law of Data Quality explained that data quality metrics must meaningfully represent tangible business relevance in order for high-quality data to be viewed as a corporate asset.
The Seventh Law of Data Quality
Whether you define data quality as real-world alignment or fitness for the purpose of use, once you identify a data quality issue the next logical step would seem to be taking corrective action. Whether you define that step as defect prevention or data cleansing, two questions are commonly asked:
- Should every data quality issue be corrected?
- Which data quality issue should be corrected first?
The first question often leads to philosophical debates about data perfection. The second question is more practical, since no matter what your objective is, you have to get started somewhere. However, both of these questions share an all too commonly unasked prerequisite question.
For example, let’s imagine I’m a customer of a financial institution. I hold one of its credit cards, which is still active because I continue to pay the annual fees, although I have no outstanding balance and haven’t used the card in more than two years. Then I make a $50 online purchase with the credit card, but when I receive my bill via postal mail I notice my account was charged $500. So I contact customer service, and during the investigation it’s discovered that although my billing name and postal address are accurate, my contact phone number and email address are out of date.
This example has three data quality issues. Should all three be corrected? Which one first?
I doubt that anyone would dispute that the incorrect transaction amount is the most important issue to both me and the credit card company. But what about the two outdated master data attributes? I doubt that anyone would dispute that billing attributes are more important than contact attributes. Since my billing attributes are accurate, neither I nor the credit card company noticed or cared about my inaccurate contact attributes, which likely existed long before my customer service call brought attention to them.
My point is that not all data quality issues are created equal. Business impact must be used to prioritize data quality issues. This doesn’t mean that lower priority data quality issues should necessarily be ignored. It means that data quality issues with a higher business impact should be corrected first, and also that some data quality issues may have such a low priority that they are never corrected.
Therefore, the Seventh Law of Data Quality is:
“Determine the business impact of data quality issues BEFORE taking any corrective action in order to properly prioritize data quality improvement efforts.”
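To make the idea concrete, here is a minimal sketch of impact-based prioritization using the credit card example. Everything in it is an assumption for illustration: the `DataQualityIssue` class, the `prioritize` function, the 0 to 10 impact scale, the scores themselves, and the `min_impact` cutoff are all hypothetical. In practice, the business would have to assess and assign the impact of each issue.

```python
from dataclasses import dataclass


@dataclass
class DataQualityIssue:
    name: str
    business_impact: int  # hypothetical 0-10 score; assumed to come from the business


def prioritize(issues, min_impact):
    """Split issues into those to correct (highest impact first) and those deferred."""
    to_correct = sorted(
        (issue for issue in issues if issue.business_impact >= min_impact),
        key=lambda issue: issue.business_impact,
        reverse=True,
    )
    # Issues below the cutoff are not ignored, just not scheduled for correction yet.
    deferred = [issue for issue in issues if issue.business_impact < min_impact]
    return to_correct, deferred


# Illustrative scores only; they are not measurements of real business impact.
issues = [
    DataQualityIssue("outdated email address", 2),
    DataQualityIssue("incorrect transaction amount", 10),
    DataQualityIssue("outdated phone number", 2),
]

to_correct, deferred = prioritize(issues, min_impact=3)
print([issue.name for issue in to_correct])
print([issue.name for issue in deferred])
```

The important design choice is the order of operations: the impact score is determined first, and only then is any corrective action scheduled. The cutoff simply reflects that some issues may have such a low priority that they are never corrected.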