In my last post we started looking at the issue of identifier proliferation, in which different business applications assign their own unique identifiers to data representing the same entities. Even master data management (MDM) applications are not immune to this issue, particularly because an identifier tends to inherit the semantics of the class of entity it was created to manage.
I recently came across an interesting example while working with a client’s MDM system. The system was intended to unify all data about customers. One complexity was that there were two kinds of customers: residential and business. Each residential customer was assigned a unique “residential customer” identifier, and each business customer was assigned a unique “business customer” identifier. In some cases the business customer was a sole proprietor running his or her own small business.
Later in the development process, the developers determined that they needed to keep track of relationships between individuals and businesses, such as employment, ownership or support relationships. Each of these individuals was assigned a unique “person” identifier and associated with the related “business customer” identifier. The problem was that some of the individuals assigned a “person” identifier already existed in the “residential customer” domain, and some already existed as sole proprietors in the “business customer” domain. This led to the existence of two, sometimes three (and potentially more) master records representing the same individual! As a result, affiliated records (such as contact information and location information) for these individuals could not be managed through a single identifier, leading (yet again) to duplicated and inconsistent data.
The seemingly obvious issue is the need for additional identity resolution to link the different identifiers together. But the real issue is that we need to settle on the right master domain models before building the system that performs identity resolution and assigns unique identifiers.
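To make the after-the-fact linking concrete, here is a minimal sketch of a cross-reference that groups identifiers from the three domains once something else has decided that they refer to the same individual. Everything in it is an assumption made for illustration: the class name `IdentifierCrosswalk`, the domain labels, and the sample identifiers are hypothetical and do not come from the client's system, and the matching decision itself (how you conclude that two records are the same person) is out of scope here.

```python
class IdentifierCrosswalk:
    """Groups (domain, local_id) pairs that refer to the same individual.

    Hypothetical illustration: a small union-find structure, not the
    client's actual identity resolution logic.
    """

    def __init__(self):
        self._parent = {}

    def _find(self, key):
        # Register unseen identifiers as their own group, then walk to the root.
        self._parent.setdefault(key, key)
        while self._parent[key] != key:
            # Path halving keeps later lookups short.
            self._parent[key] = self._parent[self._parent[key]]
            key = self._parent[key]
        return key

    def link(self, a, b):
        """Record that identifiers a and b refer to the same individual."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            # Deterministic choice of root: the smaller tuple wins.
            self._parent[max(root_a, root_b)] = min(root_a, root_b)

    def same_individual(self, a, b):
        return self._find(a) == self._find(b)

    def aliases(self, key):
        """Return every known identifier in the same group as key."""
        root = self._find(key)
        return sorted(k for k in list(self._parent) if self._find(k) == root)


# Made-up identifiers for one individual known in all three domains.
crosswalk = IdentifierCrosswalk()
crosswalk.link(("person", "P-17"), ("residential", "R-1001"))
crosswalk.link(("person", "P-17"), ("business", "B-204"))

print(crosswalk.aliases(("residential", "R-1001")))
```

A structure like this can tie the duplicates together, but it is a patch applied after the fact: each domain still mints its own identifiers, and every new domain adds another set of links to maintain.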
Creating a “business customer” or “residential customer” identifier binds the meaning to the identifier instead of to the entity. The upshot is that subsequent analyses will not be able to connect individuals who play multiple roles in multiple contexts. For example, you might want to know when certain individuals have ownership roles associated with multiple (different) businesses, or whether a sole proprietor is also acting as a residential customer.
Master data modeling is critical to reducing the complexity of identifier proliferation as well as ensuring the quality of analytics. In any case where entities exist in different contexts and are assigned different roles, reconsider how the model can capture the core information about each unique entity, as well as how that entity plays different roles in those contexts. The right model will enforce a single master record for each entity and make it simpler to manage consistency across the enterprise.
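One way to picture such a model is to give each real-world entity exactly one identifier and to record its roles separately. The sketch below assumes a simple in-memory registry; the names `Entity`, `Role`, `RoleType` and `MasterRegistry`, along with the sample data, are hypothetical and are meant only to show the shape of the idea, not a production MDM design.

```python
import itertools
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class RoleType(Enum):
    RESIDENTIAL_CUSTOMER = "residential_customer"
    BUSINESS_CUSTOMER = "business_customer"
    OWNER = "owner"
    EMPLOYEE = "employee"


@dataclass(frozen=True)
class Role:
    role_type: RoleType
    # For relationship roles (owner, employee): the entity on the other side.
    related_entity_id: Optional[int] = None


@dataclass
class Entity:
    entity_id: int  # the only identifier; it carries no role semantics
    name: str
    roles: set = field(default_factory=set)


class MasterRegistry:
    """Hypothetical registry: one record per entity, any number of roles."""

    def __init__(self):
        self._entities = {}
        self._next_id = itertools.count(1)

    def register(self, name):
        entity = Entity(next(self._next_id), name)
        self._entities[entity.entity_id] = entity
        return entity

    def assign_role(self, entity, role_type, related_to=None):
        related_id = related_to.entity_id if related_to else None
        entity.roles.add(Role(role_type, related_id))

    def has_role(self, entity, role_type):
        return any(r.role_type is role_type for r in entity.roles)

    def businesses_owned_by(self, entity):
        return sorted(
            r.related_entity_id
            for r in entity.roles
            if r.role_type is RoleType.OWNER
        )

    def owners_of_multiple_businesses(self):
        return [
            e for e in self._entities.values()
            if len(self.businesses_owned_by(e)) > 1
        ]


# Made-up data: one individual who is a residential customer and owns two businesses.
registry = MasterRegistry()
pat = registry.register("Pat Example")
bakery = registry.register("Example Bakery")
cafe = registry.register("Example Cafe")

registry.assign_role(pat, RoleType.RESIDENTIAL_CUSTOMER)
registry.assign_role(bakery, RoleType.BUSINESS_CUSTOMER)
registry.assign_role(cafe, RoleType.BUSINESS_CUSTOMER)
registry.assign_role(pat, RoleType.OWNER, related_to=bakery)
registry.assign_role(pat, RoleType.OWNER, related_to=cafe)

multi_owners = [e.name for e in registry.owners_of_multiple_businesses()]
print(multi_owners)
```

Because the identifier belongs to the entity rather than to a role, questions such as "which individuals own more than one business?" or "is this owner also a residential customer?" become lookups against a single record instead of a reconciliation exercise across domains.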