Soliciting information about enterprise reference data

The first step in establishing governance for reference data is assessing the existing reference data landscape: understanding what reference data sets are used, who is using them, and how they are being employed to support business processes. That suggests a three-pronged approach to identifying organizational business process and application dependencies on reference data domains.

Two of these prongs are empirical, involving analyses to find reference data domains (embodied via code value sets or enumerated inline inside programs) and then figuring out what they represent. The third prong involves engaging the business users to solicit their input.

  • Empirical data evidence: This might include profiling the data sets to find data elements that are populated with value sets that exhibit the characteristics of reference data, such as frequently used value sets with limited ranges or small collections of values. After identifying these candidate value sets, review the data element metadata to assess whether the naming and any associated documentation or definitions indicate use of reference data.
  • Empirical code evidence: This involves reviewing application code for branching statements (such as “case statements” or “switch statements”) that enumerate code lists representing reference data domains.
  • Engaging users and soliciting their feedback: This is probably the most useful, as it will expose the intended and expected uses of the reference domains and give the data management professional the opportunity to tease out specific definitions of the reference concepts.
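
To make the two empirical prongs a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the thresholds (`max_distinct`, `max_ratio`), the sample rows, and the sample source snippet are all hypothetical, and the heuristics are one plausible reading of "frequently used value sets with limited ranges," not a prescribed method.

```python
import re
from collections import Counter


def candidate_reference_columns(rows, max_distinct=10, max_ratio=0.5):
    """Profile tabular rows and return {column: distinct values} for columns
    that look like code sets.

    Assumed heuristic: a column is a candidate when it has few distinct
    values AND those values repeat (distinct count is small relative to
    the number of rows). Both thresholds are illustrative defaults.
    """
    if not rows:
        return {}
    candidates = {}
    for column in rows[0]:
        counts = Counter(row[column] for row in rows)
        distinct = len(counts)
        if distinct <= max_distinct and distinct / len(rows) <= max_ratio:
            candidates[column] = sorted(counts, key=str)
    return candidates


# Matches string literals used as labels in C-style switch statements,
# e.g.  case "A":  or  case 'A':
CASE_LABEL = re.compile(r"""\bcase\s+['"]([^'"]+)['"]\s*:""")


def enumerated_case_labels(source_text):
    """Collect the distinct string literals that appear as case labels."""
    return sorted(set(CASE_LABEL.findall(source_text)))


# Hypothetical sample data, for illustration only.
sample_rows = [
    {"customer_id": "1001", "status": "A", "country": "US"},
    {"customer_id": "1002", "status": "I", "country": "US"},
    {"customer_id": "1003", "status": "A", "country": "CA"},
    {"customer_id": "1004", "status": "P", "country": "US"},
    {"customer_id": "1005", "status": "A", "country": "CA"},
    {"customer_id": "1006", "status": "I", "country": "US"},
]

sample_source = """
switch (status) {
  case "A": activate(); break;
  case "I": deactivate(); break;
  default: hold();
}
"""

print(candidate_reference_columns(sample_rows))
print(enumerated_case_labels(sample_source))
```

In this sample, the profiling pass flags `status` and `country` but not `customer_id`, and the code scan surfaces the labels `A` and `I`. Note that the data contains a status value (`P`) that the code never branches on, which is exactly the kind of discrepancy worth raising with the business users.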

Here is a basic outline of the initial process for soliciting and documenting metadata about reference data as the prelude to implementing governance:

  1. Survey business applications to identify points at which reference data domains are used.
  2. Provide a draft characterization of the reference domains that are used, including a proposed name, definition, conceptual domain, and list of values.
  3. Create an inventory of reference data concepts used as a repository for the proposed reference data metadata.
  4. Instantiate a framework for capturing that reference metadata.
  5. For each reference data set:
    1. Provide a conceptual domain definition.
    2. List the value meanings for members of the conceptual domain.
    3. Document the values used in the value domain.
    4. Provide a mapping of permissible values (value to value meaning).
    5. Validate with the business users.
    6. Commit an agreed-to version of the reference data metadata.
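
The per-data-set steps above could be captured in a small catalog record along the following lines. This is a sketch under assumptions, not a prescribed framework: the `ReferenceDomain` class, its field names, and the "Customer Status" example are hypothetical, and the checks shown are just one way to keep the value domain and the conceptual domain consistent before committing a version.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReferenceDomain:
    """One catalog entry of reference data metadata (hypothetical shape)."""
    name: str
    definition: str                     # conceptual domain definition
    value_meanings: List[str]           # members of the conceptual domain
    permissible_values: Dict[str, str]  # value -> value meaning
    validated_by: List[str] = field(default_factory=list)
    version: int = 0                    # 0 means still a draft

    def value_domain(self) -> List[str]:
        """The values actually used, derived from the mapping."""
        return sorted(self.permissible_values)

    def problems(self) -> List[str]:
        """Report mismatches between values and value meanings."""
        issues = []
        for value, meaning in self.permissible_values.items():
            if meaning not in self.value_meanings:
                issues.append(f"value {value!r} maps to undefined meaning {meaning!r}")
        mapped = set(self.permissible_values.values())
        for meaning in self.value_meanings:
            if meaning not in mapped:
                issues.append(f"meaning {meaning!r} has no value")
        return issues

    def commit(self) -> int:
        """Commit an agreed-to version; refuse if inconsistent or unvalidated."""
        if self.problems():
            raise ValueError("; ".join(self.problems()))
        if not self.validated_by:
            raise ValueError("not yet validated with business users")
        self.version += 1
        return self.version


# Hypothetical example entry.
status = ReferenceDomain(
    name="Customer Status",
    definition="The lifecycle state of a customer account.",
    value_meanings=["Active", "Inactive", "Pending"],
    permissible_values={"A": "Active", "I": "Inactive", "P": "Pending"},
)
status.validated_by.append("Customer Operations")  # assumed reviewer name
print(status.value_domain(), status.commit())
```

Keeping the permissible-value mapping as the single source for the value domain means the list of values cannot drift from the value meanings, and making `commit` depend on validation mirrors the ordering of the last two sub-steps.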

The objective of this process is to build the catalog as a foundation for further analysis. In upcoming posts we will look at the subsequent stages of this process.

About Author

David Loshin

President, Knowledge Integrity, Inc.

David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of data quality, master data management and business intelligence. David is a prolific author regarding data management best practices, via the expert channel at b-eye-network.com and numerous books, white papers, and web seminars on a variety of data management best practices. His book, Business Intelligence: The Savvy Manager’s Guide (June 2003) has been hailed as a resource allowing readers to “gain an understanding of business intelligence, business management disciplines, data warehousing and how all of the pieces work together.” His book, Master Data Management, has been endorsed by data management industry leaders, and his valuable MDM insights can be reviewed at mdmbook.com. David is also the author of The Practitioner’s Guide to Data Quality Improvement. He can be reached at loshin@knowledge-integrity.com.
