Data modeling for data policy management

Operationalizing data governance means putting processes and tools in place for defining, enforcing and reporting on compliance with data quality and validation standards. There is a life cycle associated with a data policy, which is typically motivated by an externally mandated business policy or expectation, such as regulatory compliance.

A data policy is effectively a contract that specifies the expectations for compliance with a set of business/data quality rules, the levels of acceptability for compliance with those rules, the process of notification when noncompliance is detected, and the escalation strategy when identified issues are not adequately addressed.

There are three operational aspects to defining and managing data policies. There is a “user-facing” facet that must represent all of the qualitative information associated with a data policy. There is a “life cycle” facet that must capture the status of policies as they are defined, reviewed, approved and moved into production. And there is what we might call an “automation” facet that would feed the automated validation of data and notifications when data does not comply with users’ defined expectations.

In essence, this defines the starting criteria for a data model for representing data policies. There is a structural component for capturing the text associated with the policy definition, including:

  • Source business policy – The authority under which the data policy is being defined. To use the example from my prior post, a device manufacturer’s policy for documenting data lineage might refer to the section of 21 CFR Part 11 that specifies that each step of a computerized system must be identified and documented.
  • Description of the policy – This provides the high-level business details of what the policy is intended to enforce.
  • Roles and responsibilities – This names the roles that participate in the policy and what the responsibilities are for each role.
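
As an illustration only, the following is a minimal sketch of how this user-facing facet might be represented. The Python dataclass and its field names (source_business_policy, description, roles_and_responsibilities) are assumptions made for the example, not a prescribed schema.

```python
from dataclasses import dataclass, field

@dataclass
class PolicyDefinition:
    """User-facing facet: the qualitative text that describes the data policy."""
    source_business_policy: str   # the authority, e.g. a citation such as a section of 21 CFR Part 11
    description: str              # high-level business details of what the policy is intended to enforce
    roles_and_responsibilities: dict[str, str] = field(default_factory=dict)  # role name -> responsibility
```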

There must be data elements to capture the state of the approval process, including:

  • Author – The name of the individual who drafted the policy.
  • Reviewers – The names of the people who are reviewing the policy.
  • Approvers – The names of the people who must approve the policy.
  • Current status – The stages through which the policy has progressed (initial draft, initial review, approved, etc.).
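
Continuing the sketch, the life-cycle facet might be captured along these lines; the PolicyStatus values simply mirror the stages named above, and the class and field names are illustrative assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum

class PolicyStatus(Enum):
    INITIAL_DRAFT = "initial draft"
    INITIAL_REVIEW = "initial review"
    APPROVED = "approved"

@dataclass
class ApprovalState:
    """Life-cycle facet: who drafted, reviews and approves the policy, and how far it has progressed."""
    author: str                                                        # individual who drafted the policy
    reviewers: list[str] = field(default_factory=list)                 # people reviewing the policy
    approvers: list[str] = field(default_factory=list)                 # people who must approve the policy
    status_history: list[PolicyStatus] = field(default_factory=list)   # stages the policy has passed through
```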

And there must be data elements that capture the automated operationalization:

  • Measures – One or more business rules defined in a way that can be consumed by a data validation service for automated implementation.
  • Levels of acceptability – The percentage of the pool of data being tested that must comply with each rule, along with the cadence at which the data sets are tested.
  • Metrics – Weights associated with each of the measures, to be aggregated if requested.
  • Data stewards – The names of the data stewards to be notified when a violation occurs.
  • Escalation chain – The sequence of individuals to be notified if an issue is not resolved.
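
A corresponding sketch of the automation facet is shown below; the names Measure, AcceptabilityLevel and AutomationSpec are illustrative assumptions, and the rule expression is left as a plain string that a validation service would be assumed to interpret.

```python
from dataclasses import dataclass, field

@dataclass
class Measure:
    """A business rule expressed so a data validation service can consume it."""
    rule_id: str
    rule_expression: str    # e.g. "lineage_step_id IS NOT NULL" (illustrative)
    weight: float = 1.0     # metric weight used if compliance scores are aggregated

@dataclass
class AcceptabilityLevel:
    """Compliance threshold and testing cadence for one measure."""
    rule_id: str
    minimum_percent_compliant: float   # e.g. 99.5
    cadence: str                       # e.g. "daily", "weekly"

@dataclass
class AutomationSpec:
    """Automation facet: what is measured, how it is scored, and who is notified on failure."""
    measures: list[Measure] = field(default_factory=list)
    acceptability_levels: list[AcceptabilityLevel] = field(default_factory=list)
    data_stewards: list[str] = field(default_factory=list)      # notified when a violation occurs
    escalation_chain: list[str] = field(default_factory=list)   # notified, in order, if an issue is not resolved
```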

The business rules would be accessed by an automation tool (such as a data profiling product) that would assess the level of quality and provide a report. A monitoring service can review the results, compare them to the levels of acceptability and generate notifications when those levels are missed. An incident reporting and tracking system can use the roles, responsibilities, notifications and escalation chain data to support tracking the status of incidents.
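
To make the comparison step concrete, here is a minimal sketch of how a monitoring service might check profiling results against the levels of acceptability and produce notification messages. It assumes the profiling output can be reduced to a per-rule percentage of compliant records; the function name and argument shapes are illustrative, not any particular product's API.

```python
def check_compliance(observed: dict[str, float],
                     thresholds: dict[str, float],
                     stewards: list[str]) -> list[str]:
    """Compare observed compliance (rule id -> % of records passing) against the
    policy's levels of acceptability and return messages for the data stewards."""
    alerts = []
    for rule_id, minimum in thresholds.items():
        pct = observed.get(rule_id)
        if pct is not None and pct < minimum:
            alerts.append(f"Rule {rule_id}: {pct:.1f}% compliant, below the "
                          f"{minimum:.1f}% level of acceptability; notify {', '.join(stewards)}")
    return alerts

# Example: one rule misses its 99.5% threshold, so one notification is produced.
print(check_compliance({"R1": 99.9, "R2": 97.2}, {"R1": 99.5, "R2": 99.5}, ["Jane Steward"]))
```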

These are just starting points. Hopefully this will motivate more in-depth discussions of what would need to be designed into a data model for managing data policies.

About Author

David Loshin

President, Knowledge Integrity, Inc.

David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of data quality, master data management and business intelligence. David is a prolific author on data management best practices, writing via the expert channel at b-eye-network.com and in numerous books, white papers and web seminars. His book, Business Intelligence: The Savvy Manager’s Guide (June 2003), has been hailed as a resource allowing readers to “gain an understanding of business intelligence, business management disciplines, data warehousing and how all of the pieces work together.” His book, Master Data Management, has been endorsed by data management industry leaders, and his valuable MDM insights can be reviewed at mdmbook.com. David is also the author of The Practitioner’s Guide to Data Quality Improvement. He can be reached at loshin@knowledge-integrity.com.

1 Comment

  1. Richard Ordiwich

    The challenge with automating business rules is that business rules are written to be interpreted by humans, not computers. Business rules governing data quality are subjective and context-specific. Even data privacy rules are difficult to automate when all the various factors such as context, norms and roles are considered.

    These same business rules are also temporal. They change with the times, norms and people. Many government dispatches contain interpretations of legislation that governs data. Keeping an automated system current with all the changes would be a significant undertaking.

    Decoding a policy or business rule into an unambiguous set of program code is itself a difficult task. Business rules and legislation are not written to be automated. Terms are not well defined, and ambiguity is frequently evident and sometimes purposeful.
