“You don't talk about data quality.”
No, wait—that's The First Rule of Poor Quality Data.
The First Law of Data Quality:
“Data is either being used or waiting to be used—or wasting storage and support.”
Although understanding your data is essential to using it effectively and improving its quality, as Thomas Redman explains, “it is a waste of effort to improve the quality of data no one ever uses.”
Therefore, investigate your data usage by asking the following six questions:
1. Where did the data originate?
Data is like tribbles. The trouble with tribbles is that before you know what happened, you have far too many to handle. Every enterprise system seems to hold more data than any human (or even a Vulcan) would have thought possible, and your data volumes continue to grow at alarming rates.
Your Kobayashi Maru of Data Usage begins with establishing data lineage. Did the data originate from an internal or external source? How many copies of the data exist? Is there a single system of record or a preferred source system for the data?
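One way to make those three lineage questions concrete is to record the answers per data set and flag whatever is still unknown. The following is only a minimal sketch: the `LineageRecord` class, its field names, and the `lineage_gaps` helper are hypothetical illustrations, not part of any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LineageRecord:
    """Hypothetical record of where one data set came from."""
    dataset: str
    origin: str  # expected to be "internal" or "external"
    copies: List[str] = field(default_factory=list)  # systems holding a copy
    system_of_record: Optional[str] = None  # preferred source, if known


def lineage_gaps(record: LineageRecord) -> List[str]:
    """Return the lineage questions that are still unanswered."""
    gaps = []
    if record.origin not in ("internal", "external"):
        gaps.append("origin unknown")
    if not record.copies:
        gaps.append("copies not inventoried")
    if record.system_of_record is None:
        gaps.append("no system of record")
    return gaps


# Example: two known copies, but nobody has named a system of record yet.
customer = LineageRecord("customer", "internal", copies=["crm", "warehouse"])
print(lineage_gaps(customer))
```

An empty list from `lineage_gaps` would mean all three questions have at least a recorded answer; it says nothing about whether that answer is correct.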
2. Why was the data received?
With external data, it is often easier to both identify the source and understand its intended purpose. For example, reference files are typically purchased to either enrich or validate master data attributes.
With internal data, this can be more challenging. Data warehouses and master data management hubs might be staging all operational and legacy data sources for their subject areas, even though some of this data isn't actually being used.
For example, the reasons why financial transaction data was received are perhaps more obvious than the reasons for other types of data. However, it's always important to determine exactly why you are receiving data.
3. When is the data applicable?
Similar to radioactive elements, all data has a limited shelf life. All data decays, but not necessarily at the same rate.
There are many different dates associated with data. Knowing accurate creation, update, effective, expiration, and other available dates can help estimate the timeframe during which data will be applicable for its intended usage.
Just because storage has become less expensive doesn't mean your organization should keep data forever. Knowing the data's shelf life helps indicate when it should be archived or possibly even deleted.
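As a rough illustration of turning those dates into a shelf-life decision, the sketch below classifies a record by its effective and expiration dates. The function name, the status labels, and the 365-day retention window are all assumptions made for the example; real retention rules depend on your organization's policies and regulations.

```python
from datetime import date, timedelta
from typing import Optional


def shelf_life_status(effective: date,
                      expiration: Optional[date],
                      as_of: date,
                      retention_days: int = 365) -> str:
    """Classify a record by where `as_of` falls relative to its dates.

    The 365-day default retention window is an illustrative assumption.
    """
    if as_of < effective:
        return "not yet applicable"
    if expiration is None or as_of <= expiration:
        return "applicable"
    if as_of <= expiration + timedelta(days=retention_days):
        return "archive"
    return "review for deletion"


# Example: a record that was effective for calendar year 2020.
print(shelf_life_status(date(2020, 1, 1), date(2020, 12, 31), date(2021, 3, 1)))
```

A record with no expiration date is treated as still applicable here, which is itself a design choice worth questioning for data that decays quickly.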
4. Who is the data describing?
As Peter Benson of the ECCMA explains, “data is intrinsically simple and can be divided into data that identifies and describes things, master data, and data that describes events, transaction data.”
Who are the “things” and “events” your master and transaction data are describing?
5. What does the data mean in business terms?
Business meaning is not entirely limited to the company's bottom line. However, the costs, risks, and revenue associated (directly or indirectly) with your data are the minimum requirements for this assessment. Additional aspects could include a matrix of business units and business processes associated with the data.
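The matrix of business units and business processes can start out as something very simple. The sketch below builds one from a list of (business unit, business process) pairs for a single data set; the `usage_matrix` function and the sample unit and process names are hypothetical, chosen only to show the shape of the result.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def usage_matrix(usages: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group the business processes that use a data set by business unit."""
    matrix = defaultdict(set)
    for unit, process in usages:
        matrix[unit].add(process)
    # Sort the processes so the output is stable and easy to compare.
    return {unit: sorted(processes) for unit, processes in matrix.items()}


# Example: hypothetical usages of a customer master data set.
customer_usages = [
    ("Sales", "Order entry"),
    ("Finance", "Billing"),
    ("Sales", "Forecasting"),
]
print(usage_matrix(customer_usages))
```

A business unit that never appears in the matrix for a given data set is a useful prompt to ask whether that unit actually uses the data.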
6. How can the data be used to make business decisions?
It can be argued that this is the fundamental question behind all data usage.
Ultimately, the success of an organization is measured by the results of its actions, which were based on its decisions, which were based on the information derived from its data. Therefore, the true purpose of data is to serve as a solid foundation for sound business decisions.
How data is being used is more important than the business processes that create it and the technical processes that manage it.
This is especially true when evaluating the potential ROI of data quality improvements. Data quality is so important to your organization because the data being used to make critical business decisions must be reliable and accurate.
Do you understand the where, why, when, who, what, and how of your data usage?