In my last post, I introduced a number of questions that might be raised during a master data integration project, and I suggested that synchronization is the underlying issue at the core of each of them. To illustrate those points, it is worth considering an example application whose environment is sensitive to how data is synchronized among the various sources and process stages.
Let’s consider the use of a master customer index that links customer data among three sales channels: brick and mortar, telesales, and electronic commerce. In principle, the information about any individual who makes a purchase through any of these channels is expected to be available to all of these channels (as well as to other business functions such as finance or customer service). And for the purposes of our discussion, let’s assume that a customer’s name, telephone number, and street address are sufficient for unique identification. In addition, let’s presume we have a fully functional master index populated and in production.
Using these assumptions, consider this use case: A new customer buys a product online. At the point of sale, the individual is prompted for the information needed for unique identification, which must be provided before the sale is allowed to continue.
Now what happens? That identifying information is submitted to the identity resolution engine to perform a lookup in the master index to see if this individual is already known as a customer. If so, the customer’s information is accessed from the repository and is used to complete the transaction. If not, the individual is a new customer, and that individual’s identifying information has to be added to the environment.
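The lookup-or-add flow just described can be sketched in a few lines of Python. This is only an illustration under the assumptions of this post (name, telephone number, and street address identify a customer); the names `Identity`, `MasterIndex`, `match_key`, and `resolve_at_point_of_sale` are hypothetical, and the exact-match normalization stands in for whatever matching logic a real identity resolution engine would apply.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Identity:
    """Identifying attributes assumed sufficient for unique identification."""
    name: str
    phone: str
    street_address: str


def match_key(identity: Identity) -> Tuple[str, str, str]:
    """Normalize the attributes so trivial formatting differences still match.

    A real engine would likely use fuzzy or probabilistic matching;
    exact matching on normalized values is a simplifying assumption.
    """
    name = " ".join(identity.name.lower().split())
    phone = "".join(ch for ch in identity.phone if ch.isdigit())
    address = " ".join(identity.street_address.lower().split())
    return (name, phone, address)


class MasterIndex:
    """Toy in-memory stand-in for the master customer index."""

    def __init__(self) -> None:
        self._ids_by_key: Dict[Tuple[str, str, str], int] = {}
        self._next_id = 1

    def lookup(self, identity: Identity) -> Optional[int]:
        """Return the customer id if this individual is already known."""
        return self._ids_by_key.get(match_key(identity))

    def add(self, identity: Identity) -> int:
        """Register a new customer and return the assigned id."""
        customer_id = self._next_id
        self._next_id += 1
        self._ids_by_key[match_key(identity)] = customer_id
        return customer_id


def resolve_at_point_of_sale(index: MasterIndex, identity: Identity) -> Tuple[int, bool]:
    """Return (customer_id, is_new_customer) for the submitted identity."""
    customer_id = index.lookup(identity)
    if customer_id is not None:
        return customer_id, False      # known customer: reuse the master record
    return index.add(identity), True   # unknown: add to the environment
```

Note that this sketch adds the new customer immediately, which is only one of the options; when and how that addition should happen is exactly the synchronization question.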
There are two approaches for adding the new customer: as part of a batch of new identities added on a periodic (daily) basis, or in real time. But either approach poses some challenges. For example, if the customer’s information is added to a periodic batch, that customer remains “invisible” to the master environment until the completion of the next batch sequence. On the other hand, adding customers in real time will have a performance impact on the identity resolution engine, especially if the load is high, since the index may need to be globally updated to reflect merged identities or recognized relationships that are exposed as new customer data is added. I will examine this in more detail in my next post.
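To make the visibility difference concrete, here is a minimal Python sketch contrasting the two approaches. The class names `BatchLoadedIndex` and `RealTimeIndex` are hypothetical, and identities are reduced to plain tuples; the sketch does not model the cost of globally updating the index, only the window during which a new customer cannot be found.

```python
from typing import List, Set, Tuple

IdentityKey = Tuple[str, str, str]  # (name, phone, street address), assumed already normalized


class BatchLoadedIndex:
    """New identities are queued and only become visible after the periodic batch."""

    def __init__(self) -> None:
        self._visible: Set[IdentityKey] = set()
        self._pending: List[IdentityKey] = []

    def lookup(self, key: IdentityKey) -> bool:
        # Pending identities are deliberately not consulted.
        return key in self._visible

    def register(self, key: IdentityKey) -> None:
        self._pending.append(key)

    def run_batch(self) -> int:
        """Apply all pending identities; returns how many were loaded."""
        loaded = len(self._pending)
        self._visible.update(self._pending)
        self._pending.clear()
        return loaded


class RealTimeIndex:
    """New identities are visible immediately, at the cost of updating the index on every add."""

    def __init__(self) -> None:
        self._visible: Set[IdentityKey] = set()

    def lookup(self, key: IdentityKey) -> bool:
        return key in self._visible

    def register(self, key: IdentityKey) -> None:
        # In a real engine this is where merges and relationship updates
        # would be triggered, which is the source of the performance impact.
        self._visible.add(key)


if __name__ == "__main__":
    customer = ("jane doe", "5550101234", "12 main st")

    batch = BatchLoadedIndex()
    batch.register(customer)
    print(batch.lookup(customer))   # not yet visible to other channels
    batch.run_batch()
    print(batch.lookup(customer))   # visible after the batch completes

    live = RealTimeIndex()
    live.register(customer)
    print(live.lookup(customer))    # visible right away
```

With the batch approach, a customer who buys online in the morning and then calls telesales in the afternoon would look like a brand-new individual to the second channel.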