Analyzing sequences of events

How do you identify the relevant sequence among the massive number of events that precede some specific outcome? More to the point: if we want to anticipate an outcome from a set of events, how can we figure out which events are most likely to lead to it, and in what order they appear?

The best way to approach this challenge is to compare it to an analysis that we might be more familiar with: market basket analysis. One goal of market basket analysis is to determine which sets of items are purchased together, as a way of improving product bundling, packaging and placement to drive up-selling and cross-selling. The analysis considers the collection of items that each individual purchases at a single time and looks for any collections (or better yet, subsets) of items that appear together with some degree of frequency. In turn, recommendation engines can use the results of this analysis to suggest product cross-sales. One example is online sales, in which the user is advised that “people who purchased this item often purchased these other items” or is presented with special pricing for selecting a bundle of items instead of just a single item.
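To make the idea of items "appearing together with some degree of frequency" concrete, here is a minimal sketch in Python. It is not a production algorithm such as Apriori, just a brute-force count of item pairs. The function name `frequent_itemsets`, the `min_support` threshold and the sample baskets are all assumptions made up for illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(baskets, size=2, min_support=0.5):
    """Return itemsets of the given size whose support meets min_support.

    Support is the fraction of baskets that contain every item in the set.
    """
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each itemset has one canonical form.
        for itemset in combinations(sorted(set(basket)), size):
            counts[itemset] += 1
    total = len(baskets)
    return {
        itemset: count / total
        for itemset, count in counts.items()
        if count / total >= min_support
    }

# Hypothetical checkout baskets, invented for illustration.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "milk", "butter"],
]
pairs = frequent_itemsets(baskets, size=2, min_support=0.5)
```

With these sample baskets, the pairs that clear the threshold are bread with milk and eggs with milk. A recommendation engine could use pairs like these as the basis for a "people who purchased this item" suggestion.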

Event sequences are very similar: we are basically looking for the collection of events that are in each customer’s “event basket” prior to the specific outcome, and then looking for those events that appear together in the “event basket” most frequently. This challenge is a little more difficult, though, for one reason: the market basket can be analyzed as a single unit at the point of checkout, while the “event basket” may be affected by a number of additional variables, such as:

  • The number of events that precede the outcome
  • The time duration over which the events take place
  • The order of the events
  • The existence of irrelevant events that do not contribute to the outcome
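The variables listed above can be sketched in code as well. The following Python example is only one possible reading, under stated assumptions: each customer history is a list of `(timestamp, event_name)` tuples, the "event basket" is whatever falls inside a fixed time window before the first occurrence of the outcome, and order is captured by counting ordered pairs of events. The names `event_basket` and `ordered_pair_support`, the window size and the sample histories are hypothetical.

```python
from collections import Counter
from itertools import combinations

def event_basket(events, outcome, window):
    """Return event names, in time order, within `window` units before the first outcome.

    `events` is a list of (timestamp, name) tuples for one customer.
    Returns None if the outcome never occurs for that customer.
    """
    ordered = sorted(events)
    outcome_time = next((t for t, name in ordered if name == outcome), None)
    if outcome_time is None:
        return None
    return [name for t, name in ordered
            if outcome_time - window <= t < outcome_time]

def ordered_pair_support(baskets):
    """Fraction of baskets in which event a occurs somewhere before event b."""
    counts = Counter()
    for basket in baskets:
        # combinations() preserves basket order; the set counts each
        # ordered pair at most once per basket.
        counts.update({(a, b) for a, b in combinations(basket, 2) if a != b})
    return {pair: n / len(baskets) for pair, n in counts.items()}

# Hypothetical customer histories, invented for illustration.
histories = {
    "c1": [(1, "login"), (5, "complaint"), (7, "late_payment"), (9, "cancel")],
    "c2": [(2, "complaint"), (3, "login"), (6, "late_payment"), (8, "cancel")],
    "c3": [(1, "complaint"), (20, "login"), (24, "late_payment"), (25, "cancel")],
    "c4": [(1, "login"), (2, "complaint")],  # outcome never occurs
}
baskets = [
    basket
    for basket in (event_basket(ev, "cancel", window=10) for ev in histories.values())
    if basket is not None
]
support = ordered_pair_support(baskets)
```

In this sample data, "login" followed later by "late_payment" appears in every basket that ends in the outcome, while pairs involving "complaint" appear less consistently. Low-support pairs are one rough way to flag events that may be irrelevant, and the choice of window directly changes which events count at all.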

These additional complexities notwithstanding, market basket analysis algorithms can be adapted to scan the numerous sets of events and come up with some number of sequence patterns that, even if not perfect, still provide some ability to anticipate the outcome and take action to either encourage it (if it is a positive one) or prevent it (if it is a negative one). It then becomes a matter of monitoring those millions of events for the specific pattern that should trigger a notification. More on this next time…

About Author

David Loshin

President, Knowledge Integrity, Inc.

David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of data quality, master data management and business intelligence. David is a prolific author on data management best practices, via the expert channel at b-eye-network.com and numerous books, white papers, and web seminars. His book, Business Intelligence: The Savvy Manager’s Guide (June 2003) has been hailed as a resource allowing readers to “gain an understanding of business intelligence, business management disciplines, data warehousing and how all of the pieces work together.” His book, Master Data Management, has been endorsed by data management industry leaders, and his valuable MDM insights can be reviewed at mdmbook.com. David is also the author of The Practitioner’s Guide to Data Quality Improvement. He can be reached at loshin@knowledge-integrity.com.

1 Comment

  1. Great points, David. In the B2B world, would you say that the concept of market basket analysis can be applied to predicting what it takes for a person or organization to make a purchase? For instance, I'd like to know if companies are analyzing event attendance, web-site visits, white paper downloads, etc., in these ways to make this determination.
