The intersection of data governance and analytics doesn’t seem to get discussed as often as its intersection with data management, where data governance provides the guiding principles and context-specific policies that frame the processes and procedures of data management. The reason for this is not, as some may want to believe, that governing analytics is akin to herding cats. Data governance does, in fact, intersect all types of analytics, from the descriptive analytics helping the organization understand what has happened and what is happening now, to the predictive analytics that determines the probability of what will happen next, and the prescriptive analytics that focuses on finding the best course of action for predicted future scenarios.
The common denominator of all analytics is data. Analytics is often referred to as fact-based decision making, and from my perspective facts are just another way of saying well-managed and well-governed data. But, as I have previously blogged, even well-governed data has a half-life and not all data can be governed the same way.
Data sources vary with the type of analytics, which affects how much data preparation, the time-consuming bane of analytics, has to be performed. Descriptive analytics, which includes a lot of traditional business intelligence reporting, often draws from rigidly governed sources such as master data management and data warehousing, whereas predictive and prescriptive analytics often draw from more loosely governed sources, such as social media, open data and data streaming from Internet-connected sensors.
This is why one of the most important aspects of data governance for analytics is the set of policies that define, document and communicate the data preparation process. Data scientists might get all the fanfare at the top of the analytics food chain, but climbing the ladder from the dungeon of data collection and cleansing up through the workshop of data modeling and testing until you finally reach the tower of analytical insight requires a lot of people with different backgrounds and skills getting involved at various stages of data preparation. This includes those who understand the business problem analytics is trying to solve, data quality experts, data modelers, technical architects managing the analytics infrastructure, and, of course, those lauded data scientists applying their insight-generating statistical jujitsu.
Data governance policies for analytics provide an appropriate framework for this collaboration.