One way to paraphrase what I suggested in my last post is that, as master data practitioners, we often focus too much on pulling data from source systems to populate a master entity model and not enough on understanding how dependencies across business processes may influence the proper synchronization of master entity data. This is often a byproduct of batch updates performed periodically with data extracted from transaction systems.
First, it might be worth having a short refresher on transaction semantics and synchronization. In most end-to-end business processes, the steps follow a natural sequence that ensures proper execution and achieves the desired outcome. Let’s use an online purchase as an example. Once a visitor has placed items in an online shopping cart and decides to check out, there is a sequence of stages that might include:
- Request that the customer identify himself and provide proper authentication.
- Check the provided information to determine if an account for that customer exists.
- If not, initiate a sequence to request that the customer create a new customer account. That will involve collecting information from the customer and creating a new customer account, then updating the transaction sequence with the newly assigned account information.
- If there is an existing account, retrieve the account information and update the transaction sequence with the retrieved information.
- Request a payment method.
- Request an update to shipping location information.
- Update the customer account if new shipping information is provided.
- Calculate the final amount (including tax, shipping, and any other fees).
- Request verification from the customer.
- Commit the transaction.
- Transfer the information to the fulfillment process.
Note that these steps have to be performed in the right order; the transaction can’t be completed if there is insufficient shipping location information. Likewise, the entire transaction can’t execute if the sequence is abandoned at some point along the way. That doesn’t mean, though, that aspects of the process aren’t completed. For example, a new customer might create the new account but then decide not to finish the purchase. And, presuming that the last step is reached, a new sequence of fulfillment stages is initiated, carrying its own internal dependencies.
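To make the ordering and the partial-completion point concrete, here is a rough sketch of the checkout sequence in Python. Everything in it is an assumption made for illustration: the `Checkout` class, the in-memory `accounts` and `committed` stores, the flat shipping fee, and the `confirm` flag are all hypothetical, and the payment step is left out for brevity. It is not meant to describe how any particular commerce system is built.

```python
from dataclasses import dataclass, field


class CheckoutAbandoned(Exception):
    """Raised when the customer walks away before the commit step."""


@dataclass
class Checkout:
    accounts: dict = field(default_factory=dict)   # hypothetical account store, keyed by email
    committed: list = field(default_factory=list)  # hypothetical log of committed orders

    def run(self, email, cart_total, shipping_address, confirm):
        # Identify the customer; create an account if none exists.
        account = self.accounts.get(email)
        if account is None:
            account = {"email": email, "shipping": None}
            self.accounts[email] = account  # stored right away, before any order commits

        # Update the account if new shipping information is provided.
        if shipping_address:
            account["shipping"] = shipping_address
        if account["shipping"] is None:
            raise ValueError("cannot complete the transaction without a shipping location")

        # Calculate the final amount (illustrative flat shipping fee only).
        total = cart_total + 5.00

        # Verification: abandoning here cancels the order, not the account.
        if not confirm:
            raise CheckoutAbandoned(email)

        # Commit, then hand the order off to fulfillment.
        order = {"email": email, "total": total, "ship_to": account["shipping"]}
        self.committed.append(order)
        return order


shop = Checkout()
try:
    shop.run("pat@example.com", 40.0, "12 Elm St", confirm=False)
except CheckoutAbandoned:
    pass
print(len(shop.accounts), len(shop.committed))  # 1 0
```

In this sketch the abandoned checkout leaves one new customer account behind and no committed order, which is exactly the kind of side effect that exists outside the completed transaction.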
These transactions are presumed to be executed in real time, or close to real time. But from the MDM perspective, the results of the transactions are all collected at the end of the day, forwarded to the identity resolution and master registration activities, and processed in batch. And, as I will discuss in the next post, therein lies the problem.