To stream or not to stream?


Hadoop may have been the buzzword for the last few years, but streaming seems to be what everyone is talking about these days. Hadoop deals primarily with big data at rest and batch-based analytics. Modern streaming technologies sit at the opposite end of the spectrum, dealing with data in motion and providing analytical insights in flight.

Streaming technologies have been around for a number of years. But recently, the number and types of use cases that could take advantage of these technologies have exploded. Today, the question is not really about whether or not to stream. It’s about how to marry new streaming capabilities and approaches with emerging use cases.

Streaming use cases and applications

Streaming applications enable continuous processing on data that’s being produced constantly. This equates to a huge number of streaming events that need to be filtered and analysed for insights in a short period of time. The degree of real time that’s needed – and the latency demanded by an application – define which use cases can benefit the most from real-time streaming analytics.

Cybersecurity and IoT, for example, can benefit from streaming technology. Network-based cybersecurity solutions deal with streaming data from routers and network equipment, while IoT applications deal with sensor or device data. In both cases, the streaming solution has to deal with huge volumes of data at very high rates (e.g., millions of events per second) with extremely low latency (in milliseconds). Typically, the signal-to-noise ratio is low, so a lot of noise has to be filtered out to deliver valuable insights. This is where the streaming platform becomes most valuable.

From a traditional business perspective, there’s also a growing need to monitor customer behaviours and transactions in real time and then act on them using data and analytics. Streaming solutions now play an important role in powering these highly targeted, push-oriented services and offerings for different industries.

Open source, or commercial off-the-shelf?

The technology behind streaming platforms has matured significantly to tackle traditional and emerging use cases. From a tool selection perspective, customers often ask me how commercial streaming software from SAS compares with offerings from the open source community. At a high level, the familiar choice between open source and commercial software now extends into streaming platform discussions.

Commercial solutions like SAS Event Stream Processing offer a more robust development and deployment framework that saves time and cost. Customers often look to SAS for a proven, scalable platform (frequently the main consideration). SAS Event Stream Processing can also accelerate development and ease deployment efforts through native, optimised integration with legacy systems like Teradata and IBM WebSphere MQ Series, as well as next-generation big data platforms like Hadoop and Apache NiFi.

SAS Event Stream Processing has an interactive, visual development environment.

A key benefit of open source solutions, of course, is their openness or extensibility. Fortunately, SAS Event Stream Processing was developed with that in mind. It can be extended by using third-party code and APIs. This means developers can potentially use C, C++, Python or Java to build new extensions and adaptors and design the embedded streaming logic.

In the end, you shouldn’t choose a streaming engine based solely on whether it’s open source technology or commercial software. Instead, you should base your decision on functionality.

Streaming analytics

Not all streaming engines are created equal. Your choice of streaming platform will certainly affect what you can do and the insights you can get. One of the unique aspects of the SAS Event Stream Processing engine is that it was built with analytics at its core from day one. It excels in traditional requirements like data aggregation and filtering within stateful or stateless windows. But it’s the ability to deploy analytical models within the engine that makes it an incredibly powerful platform for driving real-time analytical insight.
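
To make the difference between stateless and stateful processing concrete, here is a minimal sketch in plain Python. It is not SAS Event Stream Processing code and does not use its API. The event shape (timestamp, device, reading), the function names and the tumbling window design are all assumptions chosen for illustration, and the aggregation assumes events arrive in timestamp order.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (timestamp_seconds, device_id, reading)
Event = Tuple[float, str, float]


def stateless_filter(events: Iterable[Event], threshold: float) -> Iterator[Event]:
    """Pass through only events whose reading exceeds the threshold.

    Nothing is remembered between events, so each one is judged on its own.
    """
    for event in events:
        if event[2] > threshold:
            yield event


def tumbling_window_mean(
    events: Iterable[Event], window_seconds: float
) -> Iterator[Tuple[int, Dict[str, float]]]:
    """Emit the mean reading per device for each fixed-size time window.

    Running sums and counts are the state, held until the window closes.
    Assumes events arrive in timestamp order.
    """
    current_window = None
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for ts, device, reading in events:
        window = int(ts // window_seconds)
        if current_window is not None and window != current_window:
            # The window has closed: emit its aggregates and reset the state.
            yield current_window, {d: sums[d] / counts[d] for d in sums}
            sums.clear()
            counts.clear()
        current_window = window
        sums[device] += reading
        counts[device] += 1
    if current_window is not None:
        # Flush the final, still-open window when the stream ends.
        yield current_window, {d: sums[d] / counts[d] for d in sums}


if __name__ == "__main__":
    sample = [(0.5, "a", 10.0), (1.0, "a", 20.0), (1.5, "b", 5.0), (10.2, "a", 30.0)]
    for window, means in tumbling_window_mean(sample, window_seconds=10):
        print(window, means)
```

The filter can run on every event with no memory at all, while the aggregation has to hold onto partial results until a window closes. That trade-off between memory and insight is what the choice of window type controls.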

SAS Event Stream Processing supports dynamic clustering-based machine learning using k-means clustering to identify homogeneous segments in live event stream data. Built-in text mining capabilities let you analyse and score unstructured text on the fly. This opens the possibility of doing sophisticated sentiment and contextual analysis against streaming social media content and other unstructured content.
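
For readers curious how k-means can learn from events as they arrive, the sketch below shows one common textbook approach, sequential (online) k-means, in plain Python. This is an illustration only: how SAS Event Stream Processing implements its clustering is not described here, and the class name, method names and starting centroids are hypothetical.

```python
from typing import List, Sequence


class OnlineKMeans:
    """Sequential k-means: each incoming event nudges its nearest centroid."""

    def __init__(self, initial_centroids: Sequence[Sequence[float]]) -> None:
        # Starting positions are an assumption; in practice they might come
        # from a batch run or from the first few events seen.
        self.centroids: List[List[float]] = [list(c) for c in initial_centroids]
        self.counts: List[int] = [0] * len(self.centroids)

    def _nearest(self, point: Sequence[float]) -> int:
        """Return the index of the centroid closest to the point."""
        distances = [
            sum((p - c) ** 2 for p, c in zip(point, centroid))
            for centroid in self.centroids
        ]
        return distances.index(min(distances))

    def update(self, point: Sequence[float]) -> int:
        """Assign the point to a segment and move that centroid towards it.

        The step size 1/n makes each centroid the running mean of the
        points assigned to it, so no history of events has to be stored.
        """
        k = self._nearest(point)
        self.counts[k] += 1
        rate = 1.0 / self.counts[k]
        self.centroids[k] = [
            c + rate * (p - c) for p, c in zip(point, self.centroids[k])
        ]
        return k


if __name__ == "__main__":
    model = OnlineKMeans([[0.0, 0.0], [10.0, 10.0]])
    for event in ([1.0, 1.0], [9.0, 9.0], [3.0, 3.0]):
        segment = model.update(event)
        print(event, "-> segment", segment)
    print(model.centroids)
```

Because each update touches only one centroid and one counter, the model stays small and fast enough to sit inside the event flow, which is the property that makes clustering practical on live streams.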

At SAS, we believe that streaming analytics should be a foundational pillar for every analytical deployment architecture and that machine learning algorithms will be core to the success of future streaming solutions.


Download a white paper to learn more: Channeling Streaming Data for Competitive Advantage


About Author

Felix Liao

Domain Leads Manager – Analytics Platform

Felix Liao is a manager within the customer advisory team at SAS and is also responsible for the analytics platform product portfolio for SAS Australia and New Zealand. He has over 15 years of experience working in the Australian and New Zealand analytics market. Felix was responsible for the regional launch of SAS Viya and for the successful launch of SAS Visual Analytics in Australia and New Zealand in 2012. He is a regular speaker and blogger on the topics of analytics, data visualization, and machine learning. A computer engineer by undergraduate training, Felix obtained his MBA in 2009 from Macquarie University, and he is also a SAS certified data scientist.

1 Comment

  1. Very nice blog on ESP Felix. A major focus of SAS ESP now is bringing SAS analytics into the stream, with a focus on learning models, which is the main reason we came to SAS to build ESP. You are right that ESP has been around for a while. We built the first commercial ESP back in Bell Labs Research & Lucent 20+ years ago, and used it for signature-based fraud prevention, real-time rating, and prepaid call authorization. ESP startups (including Aleri, my last company) started showing up ~15 years ago, and most of us were focused on algorithmic trading, position management, and risk management in capital markets. Now with the growth of IOT, we are seeing a need for streaming analytics everywhere, and we are bringing it out to the edge.
