Last time we looked at some of the benefits to be derived from managing the metadata associated with reference data. Practically speaking, the first step in managing that metadata is to identify which reference data domains exist and document how those domains are used. That suggests taking inventory of the reference domains and creating a catalog of the shared reference data resources that documents how each is used.
Note that this process can be a prelude to introducing governance for the reference domains. Consider the types of data policies that are relevant to shared reference data, since the inventory helps establish the procedures that put those policies into practice. With that in mind, the objectives of this inventory process include:
- Identification – Determining where a reference data domain is used and providing high-level details such as conceptual domain, name, and definition.
- Enumeration – Providing a list of value meanings for an identified reference domain, as well as mapping standard values to conceptual values for each conceptual domain.
- Semantics – Documenting the semantics associated with each referred-to conceptual domain.
- Mapping – Establishing a correspondence between the reference data sets in use and the catalogued conceptual domains.
- Lineage – Documenting the lineage characteristics (which business processes, systems, applications, databases, etc. use the reference data set with specific semantics).
- Harmonization – Determining when two reference sets refer to the same conceptual domain and harmonizing the contents into a conformed standard domain.
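To make the objectives above more concrete, here is a minimal sketch of what a catalog entry might look like and how mapping and harmonization candidates could be derived from it. Everything in it is an assumption made for illustration: the class names `ConceptualDomain` and `ReferenceDataSet`, the helper functions, and the sample systems and code sets are hypothetical and do not come from any particular tool or standard.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class ConceptualDomain:
    """Identification, enumeration, and semantics for one conceptual domain."""
    name: str
    definition: str
    value_meanings: set[str]


@dataclass
class ReferenceDataSet:
    """One physical reference data set found during the inventory."""
    name: str
    used_by: list[str]               # lineage: systems or processes using it
    value_map: dict[str, str]        # stored value -> value meaning
    conceptual_domain: Optional[str] = None  # filled in by the mapping step


def map_to_domain(data_set: ReferenceDataSet,
                  domains: list[ConceptualDomain]) -> Optional[str]:
    """Return the first domain whose value meanings cover the data set's."""
    meanings = set(data_set.value_map.values())
    for domain in domains:
        if meanings <= domain.value_meanings:
            return domain.name
    return None


def harmonization_candidates(
        data_sets: list[ReferenceDataSet]) -> dict[str, list[str]]:
    """Group mapped data sets by domain; keep domains with several sets."""
    groups: dict[str, list[str]] = {}
    for ds in data_sets:
        if ds.conceptual_domain is not None:
            groups.setdefault(ds.conceptual_domain, []).append(ds.name)
    return {d: names for d, names in groups.items() if len(names) > 1}


# Hypothetical catalog with a single conceptual domain.
catalog = [
    ConceptualDomain(
        name="Country",
        definition="A sovereign state in which a party is located.",
        value_meanings={"United States", "Canada"},
    )
]

# Hypothetical data sets discovered in three systems.
data_sets = [
    ReferenceDataSet("crm_country_codes", ["CRM"],
                     {"US": "United States", "CA": "Canada"}),
    ReferenceDataSet("billing_country_codes", ["Billing"],
                     {"840": "United States", "124": "Canada"}),
    ReferenceDataSet("order_status_codes", ["Order entry"],
                     {"O": "Open", "C": "Closed"}),
]

# Mapping step: attach each data set to a catalogued domain where possible.
for ds in data_sets:
    ds.conceptual_domain = map_to_domain(ds, catalog)

candidates = harmonization_candidates(data_sets)
unmapped = [ds.name for ds in data_sets if ds.conceptual_domain is None]
```

In this sketch the two country code sets use different stored values but share the same value meanings, so they surface as candidates for harmonization into a conformed standard domain, while the order status set remains unmapped and signals a conceptual domain that still needs to be catalogued. A real inventory would likely need fuzzier matching than the strict subset test used here.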
Fortunately, because this process can precede the initiation of governance, it allows for the identification of candidate custodians and stewards who would be accountable and responsible for oversight of enterprise reference data. Engaging those candidates and getting them involved in the inventory and cataloguing process creates alignment between the artifacts (the reference data sets) and their governance.