When someone mentions "governance," you can expect grimaces and groans to follow. But the goal of governance is not to slow down productivity or hinder creativity in data or model management. Properly implemented governance actually directs productivity and ensures creativity is focused and harnessed for the good of the organization. In this post, I'll discuss the primary aspects of data and model governance – and why they need to converge into a symbiotic discipline.

Data governance

Data governance provides the guiding principles and context-specific policies that frame the processes and procedures of data management. A principle communicates governance intent using crisp, plain-talk statements to specify the goal of a policy. A policy's business rules frame the context for the step-by-step rules of a data management procedure (which must be followed to comply with the policy). Together, these aspects of governance align everyone at all levels of the enterprise with the desired end-state of a policy’s implementation. They also provide standards and metrics for measuring policy compliance.

Key aspects of data governance include:

  • Policy monitoring – tracks governance compliance and traces when policies aren’t followed.
  • Data monitoring – notifies data stewards for remediation once data quality issues are identified.
  • Shared business glossary – manages business terms and hierarchies across lines of business. It also flags changes in metadata, giving you point-in-time snapshots that you can use during audits.
  • Data lineage – lets you view relationships and impact analysis across data models and other data elements. It also provides details across data transformation jobs and other data management processes.
  • Data security via role-based access – protects sensitive information, restricts who can make changes and controls what changes they can make.
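
To make the monitoring items above a little more concrete, here is a minimal Python sketch of data monitoring combined with a simple compliance metric. Everything in it is an assumption made for illustration: the `QualityRule` class, the `monitor` and `compliance_rate` functions, the steward names and the sample records do not come from any particular governance product. It shows one possible shape for "identify a data quality issue, then route it to a steward," not a definitive implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QualityRule:
    """A hypothetical data quality rule owned by a data steward."""
    name: str
    steward: str                    # illustrative owner to notify
    check: Callable[[dict], bool]   # returns True when a record passes


def monitor(records, rules):
    """Return one issue per (record, failed rule), addressed to the rule's steward."""
    issues = []
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                issues.append(
                    {"row": index, "rule": rule.name, "notify": rule.steward}
                )
    return issues


def compliance_rate(records, rules):
    """Share of records that pass every rule: a simple compliance metric."""
    if not records:
        return 1.0
    passed = sum(all(rule.check(r) for rule in rules) for r in records)
    return passed / len(records)


# Illustrative rules and sample data (all assumed for this sketch).
rules = [
    QualityRule("email_present", "steward_a", lambda r: bool(r.get("email"))),
    QualityRule("age_in_range", "steward_b", lambda r: 0 <= r.get("age", -1) <= 120),
]
records = [
    {"email": "a@example.com", "age": 34},
    {"email": "", "age": 41},
    {"email": "c@example.com", "age": 150},
    {"email": "d@example.com", "age": 29},
]

issues = monitor(records, rules)
rate = compliance_rate(records, rules)
```

In this sketch, each rule carries its own steward, so remediation requests go to whoever owns the rule rather than to a shared queue. A real program would likely pull that ownership from the business glossary instead of hard-coding it.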

Model governance

Model governance relates to model management in the same way that data governance relates to data management. It provides the guiding principles and context-specific policies that frame the processes and procedures of model management.

Key aspects of model governance include:

  • Complete model life cycle management – works throughout business problem-statement creation, model development, validation, assessment, comparison, selection, deployment, monitoring, auditing and retirement processes.
  • Model risk management controls and guidelines – measure and address model risk at every stage of the life cycle.
  • Centralized model repository – enables an efficient, repeatable process for registering, version-controlling, modifying, validating, scoring, retraining and reporting.
  • Model lineage – builds an audit trail to track the libraries and metadata used to build models and every object linked to models. Lineage includes documentation, model execution code, references to origin sources and data.
  • Performance benchmarking – tracks model decay and identifies when you need to retrain or retire a model.
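
As a rough illustration of the performance benchmarking item above, the following Python sketch compares recent scores against a baseline and suggests whether a model looks healthy, needs retraining or is a candidate for retirement. The function name `decay_status`, the choice of a higher-is-better metric and both threshold values are assumptions made for this example. In practice, your model risk management guidelines would set the thresholds.

```python
from statistics import mean


def decay_status(baseline, recent_scores, retrain_drop=0.05, retire_drop=0.15):
    """Classify a model by how far a higher-is-better metric has dropped.

    The default thresholds are illustrative assumptions, not recommended values.
    """
    if not recent_scores:
        raise ValueError("need at least one recent score")
    # Decay is measured as the gap between the baseline and the recent average.
    drop = baseline - mean(recent_scores)
    if drop >= retire_drop:
        return "retire"
    if drop >= retrain_drop:
        return "retrain"
    return "healthy"


# Hypothetical accuracy readings for three models benchmarked at 0.90.
stable = decay_status(0.90, [0.89, 0.88, 0.90])
drifting = decay_status(0.90, [0.84, 0.82, 0.80])
degraded = decay_status(0.90, [0.70, 0.72])
```

Averaging several recent scores, rather than reacting to a single reading, is a deliberate choice here: it keeps one noisy evaluation from triggering a retraining cycle.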

The convergence of data and model governance

In many organizations, data governance is a more mature – or at least a more formal – practice than model governance. Data, however, is a unifying thread. Data governance policies for analytics can serve as the foundation for developing a formal model governance program, or perhaps an expanded definition of data governance.

The bottom line is this: You need to trust your data and models and derive accurate business results from both. You'll need appropriate controls to provide transparency and to protect your data and model assets. To get everyone across the enterprise on the same page, data governance and model governance must converge into a symbiotic discipline. This will help align your data and model management efforts and ensure that they're accurate and worthwhile. The discipline behind symbiotic governance can also help you deploy, repurpose and secure your data and model management processes as efficiently as possible.


About Author

Jim Harris

Blogger-in-Chief at Obsessive-Compulsive Data Quality (OCDQ)

Jim Harris is a recognized data quality thought leader with 25 years of enterprise data management industry experience. Jim is an independent consultant, speaker, and freelance writer. Jim is the Blogger-in-Chief at Obsessive-Compulsive Data Quality, an independent blog offering a vendor-neutral perspective on data quality and its related disciplines, including data governance, master data management, and business intelligence.
