You have to be able to trust the data you work with, whether you are processing it or analyzing it. That trust is closely tied to data quality. But can data quality be determined without monitoring mechanisms? A platform for defining and monitoring data is a must-have for practically any organization, and especially for a data-driven organization. Before data quality improvement software can do its job, data quality indicators and monitoring mechanisms have to be defined, implemented and managed. One could even argue that this should take place before any initiative associated with data-driven decision making.
SAS Data Management solutions provide comprehensive support not only for data integration processes, but also for those associated with data quality monitoring and improvement. SAS’s mature data quality monitoring methodology involves six steps that have to be taken in order to implement such an environment.
Figure 1. Data quality monitoring methodology
Step 1. Defining
Before tackling the data quality issue, one has to answer the question: “What will I need the data for?” The answer, that is, the actual need, will be crucial in determining the starting point for our initiative. We do not have to start with indicators for all systems and all data. Just begin with a single business initiative and use it to define what data will be required and what that data should look like. During this process, it is useful to identify the following roles:
- the business owner – the person (or persons) who needs the data to achieve business goals;
- the data expert – the person (or persons) who helps identify the systems that hold the data needed for the business owner’s goals, and who supports the business owner in the defining process.
Once the project goal is clearly identified and we know what data will be required and where it will be used, we then have to define and document that data. The SAS Business Data Network will come in handy here.
Figure 2. The SAS Business Data Network application
SAS Business Data Network enables us to define business terms and establish relations between them. In our case, these could include descriptions and definitions of data quality indicators and ratings (integrity, correctness, accuracy, availability, etc.), descriptions of systems together with their data owners, as well as field types, their definitions and verification methods. Besides serving as a single point of data entry, the tool also provides control, versioning and monitoring mechanisms. Adding, deleting or modifying terms may follow a workflow created by users. Because modifications are subject to versioning, users can preview the state of the repository at any given point in time. Subscribers can be notified of all modifications automatically, and the changes are displayed once they log in to the application through the web.
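To make the idea of versioned business terms more concrete, here is a minimal Python sketch. It is not the SAS Business Data Network API; the names `TermVersion`, `Glossary`, `upsert`, `relate` and `as_of`, as well as the sample terms and dates, are hypothetical and chosen only for illustration. It assumes that every modification is stored as a new version, which is what makes a point-in-time preview possible.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class TermVersion:
    """One immutable revision of a business term (hypothetical structure)."""
    definition: str
    owner: str
    valid_from: datetime


@dataclass
class Glossary:
    """Illustrative in-memory glossary: changes append versions, never overwrite."""
    history: dict[str, list[TermVersion]] = field(default_factory=dict)
    relations: set[tuple[str, str, str]] = field(default_factory=set)

    def upsert(self, name: str, definition: str, owner: str, when: datetime) -> None:
        # Adding or modifying a term records a new version with its start date.
        self.history.setdefault(name, []).append(TermVersion(definition, owner, when))

    def relate(self, source: str, relation: str, target: str) -> None:
        # A relation is a simple (source, relation, target) triple.
        self.relations.add((source, relation, target))

    def as_of(self, name: str, when: datetime) -> Optional[TermVersion]:
        # Return the latest version that was already valid at the given moment.
        candidates = [v for v in self.history.get(name, []) if v.valid_from <= when]
        return max(candidates, key=lambda v: v.valid_from) if candidates else None


glossary = Glossary()
glossary.upsert("Completeness", "Share of non-empty values in a field",
                "Data Steward", datetime(2024, 1, 10))
glossary.upsert("Completeness", "Share of non-empty, non-placeholder values in a field",
                "Data Steward", datetime(2024, 3, 1))
glossary.relate("Completeness", "measured_on", "CRM system")

# Preview the term as it was defined on 1 February 2024.
snapshot = glossary.as_of("Completeness", datetime(2024, 2, 1))
print(snapshot.definition)
```

Storing versions instead of overwriting definitions is the design choice that matters here: the preview is just a filter on the start date, and nothing is ever lost when a term changes.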
Another useful element is a mechanism for connecting the business world with the world of technology. When, in subsequent steps, users begin implementing monitoring or data quality improvement rules, the SAS Business Data Network will provide an option to attach technical processes to business terms. This means that users will be able to perform complex analyses starting from either a business rule or a given system. Starting from a data quality monitoring rule, it will be possible to find out which processes implement that rule, where they run and who created them. Starting from a system, we will be able to find out which rules are implemented in it, which processes make use of it and which reports were created using its data.
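The two directions of analysis described above can be sketched as a pair of lookups over the same set of links. Again, this is only an illustration under our own assumptions, not the way SAS Business Data Network stores its metadata; `Process`, `Lineage` and the sample rules, systems and authors are hypothetical, and reports are left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """Hypothetical technical job that implements one rule against one system."""
    name: str
    rule: str
    system: str
    author: str


class Lineage:
    """Illustrative index linking business rules, processes and systems."""

    def __init__(self) -> None:
        self._by_rule = defaultdict(list)
        self._by_system = defaultdict(list)

    def register(self, process: Process) -> None:
        # Index the same process from both sides so either can start an analysis.
        self._by_rule[process.rule].append(process)
        self._by_system[process.system].append(process)

    def processes_for_rule(self, rule: str) -> list:
        # From a rule: which processes implement it, where, and who created them.
        return list(self._by_rule.get(rule, []))

    def rules_in_system(self, system: str) -> list:
        # From a system: which rules are implemented in it.
        return sorted({p.rule for p in self._by_system.get(system, [])})


lineage = Lineage()
lineage.register(Process("check_email_format", "Email is valid", "CRM", "anna"))
lineage.register(Process("check_email_present", "Email is complete", "CRM", "piotr"))
lineage.register(Process("check_billing_email", "Email is valid", "Billing", "anna"))

by_rule = [(p.name, p.system, p.author) for p in lineage.processes_for_rule("Email is valid")]
crm_rules = lineage.rules_in_system("CRM")
print(by_rule)
print(crm_rules)
```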
Using tools of the SAS Business Data Network class means that once the necessary information has been identified and entered, it will be possible to move smoothly on to subsequent steps such as profiling or implementing rules. Users taking part in the initiative will have all the data in one place, accessible via a web browser from any location within the organization.