To a great extent, the data manipulation layer of our multi-tiered master data services mimics the capabilities of the application services discussed in the previous post. However, the value of segregating the actual data manipulation from the application-facing API is that the API can be developed within the consuming application’s code base as a wrapper over its own data subsystem. This wrapper gives the application a façade that guarantees consistent results: the API initially targets the application’s own data layer, yet it also provides a “dual-tracked” means of selectively transitioning to the master services.
For example, an application might use the search mechanism to see if an entity record exists. At first the application can do a lookup in its own data subsystem, then switch to use the master entity lookup, which may provide better identity resolution.
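To make the dual-tracked idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class names `LocalStore`, `MasterLookup`, and `EntitySearch`, the `use_master` switch, and the use of the standard library's `difflib.SequenceMatcher` as a stand-in for real identity resolution. An actual master entity lookup would rely on a far more capable matching engine.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Dict, Optional


@dataclass
class LocalStore:
    """Hypothetical stand-in for the application's own data subsystem."""
    records: Dict[str, str] = field(default_factory=dict)  # name -> local id

    def find(self, name: str) -> Optional[str]:
        # Exact-match lookup only: a misspelled name is not found.
        return self.records.get(name)


@dataclass
class MasterLookup:
    """Hypothetical stand-in for the master entity lookup."""
    index: Dict[str, str] = field(default_factory=dict)  # name -> master id
    threshold: float = 0.85  # assumed cutoff for accepting a match

    def find(self, name: str) -> Optional[str]:
        # Score every known name and keep the best one above the threshold.
        best_id, best_score = None, 0.0
        for known, master_id in self.index.items():
            score = SequenceMatcher(None, name.lower(), known.lower()).ratio()
            if score > best_score:
                best_id, best_score = master_id, score
        return best_id if best_score >= self.threshold else None


class EntitySearch:
    """Facade the application calls; the backing lookup can be switched."""

    def __init__(self, local: LocalStore, master: MasterLookup,
                 use_master: bool = False):
        self.local = local
        self.master = master
        self.use_master = use_master

    def exists(self, name: str) -> bool:
        backend = self.master if self.use_master else self.local
        return backend.find(name) is not None


local = LocalStore({"Jonathan Smith": "L1"})
master = MasterLookup({"Jonathan Smith": "M1"})
search = EntitySearch(local, master)
```

The application only ever calls `search.exists(...)`. Setting `search.use_master = True` moves that one function over to the master lookup without touching any calling code, which is the selective transition described above.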
The data manipulation layer of the service stack can include any or all of these types of services:
- Providing probabilistic matching for identity resolution;
- Identity search (potentially blending the probabilistic matching with key-based lookups, enhanced using similarity scoring methods);
- Creation of new entity records;
- Registration of newly created entity records within the master index;
- Accessing the master index to look up unique identifiers;
- Updating the master index;
- Accessing master entity records;
- Updating master entity records;
- Deactivating master entity records;
- Finding all relationships associated with an entity;
- Relating two or more entities and recording the nature of the relationship;
- Breaking or deactivating a relationship;
- Merging two records when they can be determined to represent the same entity; and
- Splitting a record into multiple records when it can be determined that an earlier match was a false positive.
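As a rough illustration of how several of these services fit together, here is a small in-memory sketch in Python. It is not a real MDM product's API: the `MasterIndex` and `MasterRecord` names, the `(system, local_key)` cross-reference scheme, the `M1`, `M2`, ... identifier format, and the "survivor wins, gaps filled from the duplicate" merge rule are all assumptions chosen to keep the example short.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    master_id: str
    attributes: dict
    active: bool = True
    source_keys: set = field(default_factory=set)  # {(system, local_key)}


class MasterIndex:
    """Hypothetical in-memory master index with a few life cycle services."""

    def __init__(self):
        self._records = {}        # master_id -> MasterRecord
        self._by_source = {}      # (system, local_key) -> master_id
        self._relationships = []  # dicts: from, to, nature, active
        self._ids = itertools.count(1)

    def register(self, system, local_key, attributes):
        """Create a master record and register it in the index."""
        master_id = f"M{next(self._ids)}"
        self._records[master_id] = MasterRecord(
            master_id, dict(attributes), source_keys={(system, local_key)})
        self._by_source[(system, local_key)] = master_id
        return master_id

    def lookup(self, system, local_key):
        """Return the unique master identifier for a source key, if any."""
        return self._by_source.get((system, local_key))

    def get(self, master_id):
        return self._records[master_id]

    def deactivate(self, master_id):
        self._records[master_id].active = False

    def relate(self, from_id, to_id, nature):
        self._relationships.append(
            {"from": from_id, "to": to_id, "nature": nature, "active": True})

    def relationships(self, master_id):
        return [r for r in self._relationships
                if r["active"] and master_id in (r["from"], r["to"])]

    def merge(self, survivor_id, duplicate_id):
        """Fold the duplicate into the survivor and deactivate the duplicate."""
        survivor = self._records[survivor_id]
        duplicate = self._records[duplicate_id]
        for key in duplicate.source_keys:
            self._by_source[key] = survivor_id
        survivor.source_keys |= duplicate.source_keys
        # Assumed survivorship rule: keep survivor values, fill only the gaps.
        for name, value in duplicate.attributes.items():
            survivor.attributes.setdefault(name, value)
        duplicate.active = False

    def split(self, master_id, system, local_key):
        """Undo a false-positive match by moving one source key to a new record."""
        record = self._records[master_id]
        record.source_keys.discard((system, local_key))
        return self.register(system, local_key, record.attributes)
```

A typical sequence would register two source records, call `merge` once matching decides they are the same entity, and later call `split` if a data steward determines the match was wrong. The deactivated duplicate is retained rather than deleted, so the history of the merge remains available.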
These types of services encompass five key areas: matching, record life cycle, identifier lookup and assignment, relationship management, and governance. Unlike the services in the application layer, however, these are not façades; they target the master data sets themselves. That does not necessarily mean that they go directly to the source, and in my next post I will look at some specifics of the data access layer.