A few decades ago, when the first business intelligence solutions appeared, based on carefully designed data warehouses, real-time data capture was a distant dream. Reporting cycles were weekly or monthly at best, with data cleaned and loaded to match.
Fast forward to now, and the demand for rapid data-driven decision-making means that data needs to be in the hands of decision makers instantly if it is to be useful. Businesses are constantly evaluating opportunities and trying to stay ahead of their competition, and data is essential to this process.
Data warehousing, even with its traditional extract-transform-load process, has been able to move towards near real-time data handling and loading, via techniques such as micro-batching. But with big data, there is often simply too much arriving too quickly to load it all before it can be analyzed.
Enter event stream processing.
Understanding event stream processing
Event stream processing is about handling and analyzing data that arrives rapidly, in large quantities, and in multiple formats. The most obvious source is the Internet of Things (IoT), or the Industrial Internet, with its thousands of sensors collecting data every second. With more and more items connected, from cars, fridges, and phones through to industrial machinery, being able to process data in real time offers huge potential.
This may be easier to understand with some examples. In industry, event stream processing is being used to compare past and current performance of machinery, and so predict maintenance needs and allow scheduling, avoiding expensive breakdowns. This is particularly helpful in facilities such as oil platforms, where downtime is costly, but breakdowns can be catastrophic.
Event stream processing is also being used in healthcare, to monitor patients. Bringing together multiple sources of information rapidly means faster detection of any problems. In intensive care, where even a tiny change could signal a big problem, it means medical attention will be focused where and when it is needed.
But the technique does not just have big, life-threatening applications. It’s also being used effectively in online marketing and in fraud prevention, by tracking and comparing consumer behavior.
It’s not rocket science
The methods used in event stream processing differ from the traditional extract-transform-load (ETL) process. ETL extracts the data using queries, then transforms it into the required form so it can be loaded and used. Event stream processing instead runs continuous queries against the data as it streams in. So, instead of being loaded from one source to another and analyzed afterwards, the data is analyzed continuously as it arrives, before being stored for further analysis later.
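To make the idea of a continuous query more concrete, here is a minimal sketch in Python. It is not taken from any particular streaming product: the function name `continuous_average`, the window size, and the sample readings are all assumptions made for illustration. The point is only that each result is produced as a reading arrives, rather than after a full batch has been loaded.

```python
from collections import deque
from typing import Iterable, Iterator


def continuous_average(readings: Iterable[float], window: int = 3) -> Iterator[float]:
    """Yield the mean of the most recent `window` readings as each one arrives."""
    recent = deque(maxlen=window)  # the oldest reading drops out automatically
    for value in readings:
        recent.append(value)
        yield sum(recent) / len(recent)


# A short list stands in for an unbounded stream here (illustrative values only).
results = list(continuous_average([10.0, 20.0, 30.0, 40.0], window=3))
print(results)
```

Because the function is a generator, it would work the same way on a source that never ends, such as a sensor feed: the query stays open and keeps emitting results.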
Situational intelligence is created by applying advanced algorithms and pre-defined rules to the huge masses of data coming in as streams. Rules are usually developed using analytical models with proven methodologies and tested on historical data before being applied to streaming data. Algorithms and rules typically create a notification or alert for the relevant stakeholders, but they may also feed back into the analysis to provide more insights (see Figure 1).
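The workflow described above (test a rule on historical data, then apply it to the stream and raise alerts) can be sketched roughly as follows. Everything here is hypothetical: the event shape, the `overheating` rule, its 90-degree limit, and the sample data are invented for illustration, and a real rule would usually come from an analytical model rather than a fixed threshold.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Event = dict  # hypothetical event shape: {"sensor": str, "temp_c": float}
Rule = Callable[[Event], bool]


def overheating(event: Event, limit: float = 90.0) -> bool:
    """A pre-defined rule: flag readings above an assumed temperature limit."""
    return event["temp_c"] > limit


def backtest(rule: Rule, history: List[Tuple[Event, bool]]) -> float:
    """Return the fraction of labelled historical events the rule gets right."""
    correct = sum(1 for event, was_fault in history if rule(event) == was_fault)
    return correct / len(history)


def alerts(rule: Rule, stream: Iterable[Event]) -> Iterator[str]:
    """Apply the rule to each incoming event and yield an alert when it fires."""
    for event in stream:
        if rule(event):
            yield f"ALERT {event['sensor']}: {event['temp_c']} C"


# Step 1: test the rule on (invented) historical events with known outcomes.
history = [
    ({"sensor": "a", "temp_c": 95.0}, True),
    ({"sensor": "a", "temp_c": 70.0}, False),
    ({"sensor": "b", "temp_c": 92.0}, False),
    ({"sensor": "b", "temp_c": 99.0}, True),
]
accuracy = backtest(overheating, history)

# Step 2: apply the same rule to live events (a list stands in for the stream).
live = [{"sensor": "p1", "temp_c": 85.0}, {"sensor": "p2", "temp_c": 93.5}]
raised = list(alerts(overheating, live))
```

Keeping the rule as a plain function means the same code is exercised against history and against the live stream, so what was tested is what runs.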
Complementing not substituting
Event stream analysis will not replace the need for data warehousing. It will always be important to store historical data, to support longer-term correlation or root cause analysis. But event stream processing can be used to support decisions that have to be made now, because the data has a short shelf-life.
For example, if someone is browsing online, your marketing needs to target them now, not in a week’s time. Event stream analytics can also be integrated with data warehousing, to discard data that will not be relevant later, reducing data storage costs.
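As one possible illustration of discarding data before it reaches the warehouse, the sketch below keeps a sensor reading only when it has changed noticeably since the last value stored for that sensor. The function name `keep_significant`, the `(sensor_id, value)` shape, and the `min_change` threshold are assumptions for this example; what counts as "not relevant later" depends entirely on the business case.

```python
from typing import Dict, Iterable, Iterator, Tuple

Reading = Tuple[str, float]  # hypothetical shape: (sensor_id, value)


def keep_significant(stream: Iterable[Reading], min_change: float = 0.5) -> Iterator[Reading]:
    """Pass on a reading only if it differs enough from the last one kept for that sensor."""
    last_stored: Dict[str, float] = {}
    for sensor_id, value in stream:
        previous = last_stored.get(sensor_id)
        if previous is None or abs(value - previous) >= min_change:
            last_stored[sensor_id] = value
            yield sensor_id, value  # this reading would go on to storage


# Illustrative readings; near-duplicates are dropped before storage.
incoming = [("t1", 20.0), ("t1", 20.1), ("t1", 20.2), ("t1", 21.0), ("t2", 5.0), ("t2", 5.1)]
to_store = list(keep_significant(incoming))
```

In this made-up run, six incoming readings are reduced to three stored ones, which is the kind of saving the paragraph above has in mind.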
In other words, event stream processing offers a new way for data management professionals to capture and analyze ever-increasing volumes of data to meet the ever-increasing demand from decision-makers for more accurate and timely data. Is it time you started thinking about it?
A more comprehensive version of this article was first published in Sytyke magazine on October 6, 2016.