I sometimes refer to reference data as a “celebrity orphan” within an organization because reference data sets are touched by many business processes and applications, yet remain largely unowned and unmanaged. Few organizations have truly formal methods for managing reference data or for assigning authority over it. This poses a conundrum: a widely used conceptual data asset is generally reimplemented whenever someone decides there is a need for a copy.
This introduces a number of challenges for ensuring the quality and consistent use of reference data, including:
- Reinterpretation: the same reference data set concepts may be redefined multiple times by individuals with different thought processes and modeling concepts, depending on specific immediate demands.
- Control: each consuming business function may feel free to change the value set, since it has no perception that the set is shared.
- Asynchrony: each reference data consumer may choose to create his or her own replica and version of the values in the reference data set, and each of these replicas will remain unsynchronized from the perspective of data currency.
- Inconsistency: depending on the sources and time of access, different copies of reference data sets may have subtle or even obvious differences in their set of values.
- Hierarchy confusion: beyond the issues with determining the valid sets of values, the organization of those values may become corrupted if the classification hierarchies are not managed.
- Hard-coding: certain reference data sets may be hard-coded and embedded within case or switch statements in dependent applications and business contexts.
- Authority: while the number of replicas may multiply, there may not be anyone designated with the authority to manage the reference data set.
These are all symptoms of the absence of coordinated oversight for what should be a shared resource. Introducing governance (from the policy perspective) and management (from the operational perspective) will enable better coordination among the enterprise consumers of reference data.
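To make a few of these symptoms concrete, here is a minimal Python sketch of what a single managed reference data set might look like, along with a check for how far a local replica has drifted from it. Everything in it is an assumption made for illustration: the `ReferenceDataSet` class, the `diff_replica` helper, the `order_status` codes, and the `data-governance` owner are all hypothetical and do not come from any particular product or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceDataSet:
    """A hypothetical governed reference data set with a named owner and version."""
    name: str
    owner: str              # addresses "Authority": someone is designated
    version: int            # addresses "Asynchrony": copies can state what they hold
    values: dict[str, str]  # code -> description

    def describe(self, code: str) -> str:
        # Look the code up in the shared set rather than hard-coding it
        # in a case or switch statement inside each application.
        try:
            return self.values[code]
        except KeyError:
            raise ValueError(f"{code!r} is not a valid {self.name} code") from None


def diff_replica(authoritative: ReferenceDataSet, replica: dict[str, str]) -> dict[str, set[str]]:
    """Report how a local copy differs from the authoritative value set."""
    auth = authoritative.values
    shared = set(auth) & set(replica)
    return {
        "missing": set(auth) - set(replica),    # codes the copy never received
        "extra": set(replica) - set(auth),      # codes added locally ("Control")
        "redefined": {c for c in shared if auth[c] != replica[c]},  # "Reinterpretation"
    }


# Hypothetical authoritative set.
ORDER_STATUS = ReferenceDataSet(
    name="order_status",
    owner="data-governance",
    version=3,
    values={"N": "New", "S": "Shipped", "C": "Cancelled"},
)

# Hypothetical unmanaged copy kept by one consuming application.
billing_copy = {"N": "New", "S": "Sent", "X": "Expired"}

drift = diff_replica(ORDER_STATUS, billing_copy)
```

In this example the billing copy lacks one code, has added one of its own, and has redefined a third. A report like `drift` is the kind of evidence a governance process could use to decide which replicas need to be reconciled or retired.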