Data quality initiatives challenge organizations because the discipline encompasses so many issues, approaches and tools. Across the board, there are four main activity areas – or pillars – that underlie any successful data quality initiative. Let’s look at what each pillar means, then consider the benefits SAS Data Management brings to each.
The people pillar
- Creating awareness of data quality through training, education and community building
People often create and maintain the organization’s data incorrectly. In many cases people are not aware of the data assets they work with, so they fail to preserve, maintain or share those assets. The people pillar encourages communication, education and collaboration among different individuals and business groups as a way to introduce major cultural change. New roles like data stewards and communities like data governance committees are key elements that drive people pillar activities.
Organizations that focus on the people pillar will benefit from quick results in areas where data quality is affected by human error or lack of understanding. But they must be willing to accept changes on both the stewardship and the people side, and complete cultural change usually takes a long time to evangelize and build. The process is easily slowed down by politics and relationships.
The process pillar
- Delivering better data by improving business processes
In enterprises that rely on automated business processes, poor data quality is often a symptom of poor business process design and execution. Almost all long-standing business processes were designed to automate and speed up execution, but they were not designed with data quality in mind. Properly redesigning a business process requires identifying the root cause of its data quality issues. The payoff is that the redesign ultimately eliminates the real reason for incorrect data.
Redesigning business processes is effective at driving data correction at the source and reducing the need to use costly IT resources later in the data life cycle. But not all data quality issues can be solved with business process redesign. At large organizations, even simple process changes can be very expensive because a series of subsequent business processes must adapt to the redesigned process.
The governance pillar
- Improving data quality by defining standards, ownership and transparency
The governance pillar aims to coordinate the many activities needed to meet the organization’s data quality requirements while enabling data sharing among users, systems and departments. The focus of governance is not only to create standards, ownership and transparency; governance should also align and moderate organization-wide data management activities. Failure to understand the significance of data governance negatively affects the outcomes of the people and process pillars. That’s because incompatible data can make automated data exchange effectively impossible, and meaningless data is of no use.
The governance pillar often establishes collaboration between IT and business functions. Data governance provides a framework that helps business users communicate their data needs and makes it easier for IT to meet those needs. However, it requires significant discipline to define standards, ownership and transparency across an enterprise. This part of the data quality initiative can easily become an ivory tower, with no practical relevance for the organization.
The technology pillar
- Using tools and technology to improve data quality
To manage the rapid explosion of data, it has become mandatory to use specialized data quality technology. At the same time, the complexity and diversity of data quality issues, combined with infrastructure requirements and dependencies, call for a standard data management platform. Approaches that rely on ad hoc departmental tools will fail to provide enterprise-wide consistency. Any standard data management platform used across an enterprise should cover a range of tools and technologies, including:
- ETL.
- Data mining and analytics.
- Data quality, master data and reference data management.
- Profiling and metadata management.
Particularly for larger organizations, these tools and technologies can simplify tasks for business users who manage data quality activities. They’re also valuable when data speed or complexity demands automated data quality management. But data quality initiatives are not likely to succeed if they rely on technology alone. Improving and adopting new technology is necessary. But technology must be used in conjunction with the other pillars for organizations to achieve their data quality goals.
SAS Data Management: Supporting all four pillars
In my experience, only when organizations invest in all four pillars can they effectively address all the different types of data quality issues while reducing risk and effort.
SAS Data Management has rich, flexible capabilities that meet a wide variety of needs. It supports the data stewardship team, helps to establish a collaborative culture for better data, integrates with existing business processes and helps implement new ways of data remediation. Its capabilities encompass all four pillars – people, process, governance and technology.
The solution has a built-in glossary that allows data stewards, business and IT to collaboratively define and communicate new data quality standards. These agreed-upon data quality metrics can easily be turned into validation rules, and results of data quality monitoring can be visualized using an integrated dashboard solution. A glossary for data quality standards, metrics and monitoring then bridges the gap between IT and business – while forming the starting point for company-wide data governance.
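To make the step from agreed-upon metrics to validation rules concrete, here is a minimal sketch in plain Python. It does not use or represent any SAS Data Management API. The `GlossaryRule` structure, the two example standards and the sample records are all hypothetical, invented only to show how a glossary entry might pair a plain-language standard with an executable check, and how per-rule pass rates could feed a monitoring dashboard.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GlossaryRule:
    """One agreed-upon data quality standard (hypothetical structure)."""
    term: str                      # business term from the glossary
    description: str               # plain-language standard agreed by business and IT
    check: Callable[[dict], bool]  # executable validation rule for one record


# Hypothetical standards, invented purely for illustration.
RULES = [
    GlossaryRule(
        "customer_email",
        "Email must contain a local part, '@' and a dotted domain",
        lambda r: bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", r.get("email") or "")),
    ),
    GlossaryRule(
        "country_code",
        "Country must be a two-letter uppercase code",
        lambda r: bool(re.fullmatch(r"[A-Z]{2}", r.get("country") or "")),
    ),
]


def pass_rates(records: list[dict]) -> dict[str, float]:
    """Share of records passing each rule: the kind of figure a dashboard could plot."""
    if not records:
        return {rule.term: 0.0 for rule in RULES}
    return {
        rule.term: sum(rule.check(r) for r in records) / len(records)
        for rule in RULES
    }


records = [
    {"email": "ana@example.com", "country": "DE"},
    {"email": "not-an-email", "country": "de"},
    {"email": None, "country": "US"},
    {"email": "li@example.org", "country": "FR"},
]
print(pass_rates(records))  # {'customer_email': 0.5, 'country_code': 0.75}
```

Keeping the description and the check side by side is the point of the sketch: business users can read the standard, IT can run it, and both are looking at the same glossary entry.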
For the process pillar, SAS Data Management enables organizations to embed data quality checks and corrective actions into existing business processes. That could entail checking and correcting data in real time as it is entered, using a remediation application, or running automated, batch-type data quality tasks. The solution provides a full range of data quality features combined with flexible integration methods. And, if needed, organizations can add a master data management solution (SAS MDM) to the architecture.
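The difference between real-time checks, remediation and batch tasks can be sketched in a few lines of generic Python. This is an illustration under stated assumptions, not SAS code: the function names, the `name` and `postal_code` fields and the five-digit postal code rule are all hypothetical. The idea it shows is that one shared check can serve both the data entry path and the batch path, with failing batch records routed to a remediation queue.

```python
def check_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is clean."""
    problems = []
    if not (record.get("name") or "").strip():
        problems.append("name is missing")
    postal = (record.get("postal_code") or "").strip()
    # Hypothetical rule: postal codes are exactly five digits.
    if not (postal.isdigit() and len(postal) == 5):
        problems.append("postal_code must be five digits")
    return problems


def correct_record(record: dict) -> dict:
    """Apply a safe automatic correction (here: trimming whitespace)."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def on_data_entry(record: dict) -> dict:
    """Real-time path: correct the entry, then reject it if problems remain."""
    cleaned = correct_record(record)
    problems = check_record(cleaned)
    if problems:
        raise ValueError("; ".join(problems))
    return cleaned


def run_batch(records: list[dict]) -> tuple[list[dict], list[tuple[dict, list[str]]]]:
    """Batch path: split records into clean ones and a remediation queue."""
    clean, remediation_queue = [], []
    for record in records:
        cleaned = correct_record(record)
        problems = check_record(cleaned)
        if problems:
            remediation_queue.append((cleaned, problems))
        else:
            clean.append(cleaned)
    return clean, remediation_queue


clean, queue = run_batch([
    {"name": " Ana ", "postal_code": "12345 "},
    {"name": "", "postal_code": "12"},
])
print(clean)  # [{'name': 'Ana', 'postal_code': '12345'}]
print(queue)  # [({'name': '', 'postal_code': '12'}, ['name is missing', 'postal_code must be five digits'])]
```

Because both paths call the same `check_record`, a standard changed in one place applies to data entry and to batch processing alike, which is what keeps results consistent across integration methods.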