They say that data governance is about people, process and organization. Much of the work required in planning for data governance is in defining people’s roles and responsibilities, and then designing the organizational structure that will provide authority for decisions to be made and enforced.
The processes, however, are not new. Many individuals (possibly in different business areas) are already doing them; what is missing is a formal data governance authority to standardize decisions at a broader level. In this blog series, I discuss the processes for:
- Investigating and isolating data quality issues, aka root cause analysis
- Starting to collect complete metadata definitions
- Performing data quality analysis.
Only after your governance group has worked through each step, in order, are you likely to design the appropriate data governance policies and data standards. Let's start with the first one.
Root Cause Analysis
The process of data governance is fundamentally very simple:
- Identify the data quality issues to address
- Prioritize the portfolio of issues to isolate/tackle the most important
- Perform root cause analysis to determine the true source of the data issue
- Design the corrective action
- Formalize the correction through consideration and approval by the data governance organization
- Implement the fix
- Monitor the results
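As a rough illustration of the prioritization step, here is a minimal Python sketch. The `DataIssue` fields, the 1-to-5 impact scale, and the sample issues are all made up for the example; a real governance group would agree on its own scoring criteria.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataIssue:
    name: str
    business_impact: int   # hypothetical scale: 1 (low) to 5 (high)
    records_affected: int  # rough count of affected records


def prioritize(issues: List[DataIssue]) -> List[DataIssue]:
    """Order issues by business impact, then by records affected, highest first."""
    return sorted(
        issues,
        key=lambda issue: (issue.business_impact, issue.records_affected),
        reverse=True,
    )


# Illustrative portfolio of issues (invented for this sketch).
issues = [
    DataIssue("Missing postal codes", business_impact=2, records_affected=30000),
    DataIssue("Duplicate customer records", business_impact=4, records_affected=12000),
    DataIssue("Inconsistent product codes", business_impact=4, records_affected=500),
]

ranked = prioritize(issues)
for position, issue in enumerate(ranked, start=1):
    print(position, issue.name)
```

The point is only that the portfolio gets an explicit, agreed ordering before anyone starts root cause analysis; the scoring rule itself is a placeholder.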
Once we map out the discrete steps involved in the data governance process, it becomes clear that much of the work is already being done in informal ways throughout the organization. What some folks don’t realize is that data governance often formalizes a whole bunch of informal processes that either don’t get communicated or aren't accepted as standard across business areas.
Root cause analysis is the process of identifying probable causes of a data issue and isolating the contributing factors. To resolve any particular issue, root cause analysis involves fact-finding, drilling into details of the problem, talking to the right people, and separating out other associated (but not contributing) factors. A standard tool for organizing the detailed findings is the Ishikawa diagram, also known as a fishbone diagram.
To conduct a thorough root cause analysis, use the following checklist:
- Diagnose the problem as if you are a physician or a detective. Consider all possible sources of the symptom. Don't rule anything out yet!
- Feel free to boil the ocean—be exhaustive and creative.
- Don't start problem solving until you have collected all possible causes.
- Draw your own fishbone diagram. Is it complete? Share it with the data users.
- Practice the “5 Whys”: don’t stop asking “Why” until you have exhausted every conceivable reason.
- Rank the factors if possible. Identify the primary causes versus the secondary or associated factors.
- Rule out each possible factor one at a time. Justify why (you may need to come back to this later).
- Find all potential business process owners and data owners, and involve them in your understanding of the possible sources of the problem.
- Share the findings with everyone involved in troubleshooting. They could rule out certain factors with their knowledge.
- Test your hypotheses with actual data.
- Fix the problem and test again.
- Publish/share your findings and fixes. Communicating your findings may reveal additional factors you hadn’t considered.
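One way to keep track of the checklist above is to record the fishbone diagram as data rather than only as a drawing. The sketch below is a minimal Python illustration, not a standard tool: the `Cause` and `Fishbone` classes, the category names, and the sample address problem are all hypothetical. It captures three items from the checklist: grouping causes by category, ranking them, and ruling them out one at a time with a recorded justification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Cause:
    category: str                 # fishbone "bone", e.g. "Process" or "System"
    description: str
    rank: int                     # 1 = most likely primary cause
    ruled_out_reason: Optional[str] = None


class Fishbone:
    def __init__(self, problem: str) -> None:
        self.problem = problem
        self.causes: List[Cause] = []

    def add(self, category: str, description: str, rank: int) -> None:
        self.causes.append(Cause(category, description, rank))

    def rule_out(self, description: str, reason: str) -> None:
        # Require a justification, since you may need to revisit it later.
        if not reason.strip():
            raise ValueError("record a justification before ruling a factor out")
        for cause in self.causes:
            if cause.description == description:
                cause.ruled_out_reason = reason
                return
        raise KeyError(description)

    def open_causes(self) -> List[Cause]:
        """Causes not yet ruled out, most likely first."""
        remaining = [c for c in self.causes if c.ruled_out_reason is None]
        return sorted(remaining, key=lambda c: c.rank)

    def by_category(self) -> Dict[str, List[str]]:
        """All causes grouped by bone, including those already ruled out."""
        bones: Dict[str, List[str]] = {}
        for cause in self.causes:
            bones.setdefault(cause.category, []).append(cause.description)
        return bones


# Invented example problem and causes.
diagram = Fishbone("Customer addresses fail validation")
diagram.add("Process", "No address check at order entry", rank=1)
diagram.add("System", "Nightly load truncates long street names", rank=2)
diagram.add("People", "Call center staff skip optional fields", rank=3)
diagram.rule_out(
    "Nightly load truncates long street names",
    "Source and target column lengths match",
)
remaining = [c.description for c in diagram.open_causes()]
```

Keeping the ruled-out causes (with their reasons) in the structure, instead of deleting them, makes it easier to share the full picture with data users and to reopen a factor if new evidence turns up.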
After a thorough root cause analysis has been completed, data stewards should proceed to metadata analysis and data quality analysis. I will discuss these two techniques in my next posts.