Let us be smarter with the Internet of Things

As we enter the era of “everything connected,” we cannot forget that gathering data is not enough. We need to process that data to gain new knowledge and build our competitive advantage. The Internet of Things is not just a consumer thing – it also makes our businesses more intelligent.

Whenever we approach the idea of competing on analytics and building unconventional business strategies, we arrive at one simple conclusion – we need to be smarter in whatever we do. Business models have grown more complicated and fuzzier over time, but one thing remains constant: every decision is based on past experience and driven by data and analytics.

Being smarter can have many faces. Let’s take a look at just three of them.

Why data visualization matters

We've all met people a bit too enamored with reporting tchotchkes. I'm talking about folks who don't know the meaning of the word overkill. They don't understand that one can answer relatively simple questions (based on static data) without the aid of visuals. Examples include:

  • How have sales changed in the last quarter?
  • How much do our customers owe us?
  • How much do we owe our vendors?
  • How many employees did we hire last week?
  • What are our current inventory levels?
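
For questions like the first one above, a plain aggregate query and two rows of output are often all it takes. Here is a minimal sketch, using a hypothetical sales table with made-up figures:

```python
import sqlite3

# Hypothetical sales table with made-up figures; in practice this would be
# an existing reporting database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quarter TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("2016-Q1", 120000.0), ("2016-Q1", 95000.0),
                  ("2016-Q2", 140000.0), ("2016-Q2", 110000.0)])

# "How have sales changed in the last quarter?" -- a simple aggregate answers it.
for quarter, total in conn.execute(
        "SELECT quarter, SUM(amount) FROM sales GROUP BY quarter ORDER BY quarter"):
    print(f"{quarter}: {total:,.0f}")
```

The two printed rows answer the question directly; a chart would add nothing here.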

Data management for analysis – Feeding the analytical monster more than once

(Otherwise known as Truncate – Load – Analyze – Repeat!)

After you’ve prepared data for analysis and then analyzed it, how do you complete this process again?  And again? And again?

Most analytical applications are created to truncate the prior data, load new data for analysis, analyze it and repeat the process as required by analytics users.

Truncating the data in an application may be as easy as truncating a few tables, or it may entail a more sophisticated way of removing data from the prior analysis, depending on the software. The assumptions are that the analytics software has been installed and that the user knows the ins and outs of the tool. In any case, it’s highly recommended that this process be as automated as possible.
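
As a rough sketch of what that automation could look like, here is a minimal truncate – load – analyze – repeat loop in Python. The table, extract files, and the "analysis" query are hypothetical placeholders, not any particular product's workflow:

```python
import csv
import sqlite3

# Create two tiny placeholder extracts so the sketch runs end to end.
for name, rows in [("week_01.csv", [("c1", "10.00"), ("c2", "25.50")]),
                   ("week_02.csv", [("c1", "7.25"), ("c3", "42.00")])]:
    with open(name, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["customer_id", "amount"])
        writer.writerows(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customer_id TEXT, amount REAL)")

def truncate_load_analyze(csv_path):
    """One pass of the truncate - load - analyze cycle."""
    conn.execute("DELETE FROM transactions")        # truncate the prior data
    with open(csv_path, newline="") as f:           # load the new extract
        rows = [(r["customer_id"], float(r["amount"])) for r in csv.DictReader(f)]
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    return conn.execute(                            # analyze the fresh load
        "SELECT customer_id, SUM(amount) FROM transactions GROUP BY customer_id"
    ).fetchall()

# Repeat: automate the cycle for each new extract that arrives.
for extract in ["week_01.csv", "week_02.csv"]:
    print(extract, truncate_load_analyze(extract))
```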

Event stream processing and/or real-time processing

Event stream processing (ESP) and real-time processing (RTP) so often come up in the same conversation that it raises the question: are they one and the same? The short answer is yes and/or no. But since I don’t need the other kind of ESP to know that you won’t find that answer helpful, permit me to respond to your query by streaming a few more, hopefully eventful, words.

RTP typically refers to a system or application that is time-sensitive, meaning its output is guaranteed within a real-time deadline. For example, a trading system might guarantee that a query for a stock’s current price will return within 10 milliseconds. Although the stock price is likely fluctuating, this system doesn’t execute the query until requested. But it does return the result quickly enough to represent the current stock price.
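
As a toy illustration of that behavior (the symbols, prices, and the 10-millisecond budget are invented for the example, not a real trading system), the query runs only on request and is checked against its deadline:

```python
import time

# Hypothetical in-memory quote cache, updated elsewhere as prices fluctuate.
current_prices = {"ABC": 101.37, "XYZ": 54.02}

DEADLINE_SECONDS = 0.010  # the guaranteed real-time response window (10 ms)

def get_price(symbol):
    """Execute the query only when requested, and verify it met its deadline."""
    start = time.perf_counter()
    price = current_prices[symbol]            # the actual query
    elapsed = time.perf_counter() - start
    if elapsed > DEADLINE_SECONDS:
        raise RuntimeError(f"missed real-time deadline: {elapsed:.6f}s")
    return price

# Returned quickly enough to represent the current price at the moment of the request.
print(get_price("ABC"))
```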

ESP, on the other hand, typically refers to continuously querying data in motion (i.e., event streams) as it flows through a system or application. There are usually no compulsory time limits in ESP. Instead, the goal is to detect meaningful patterns within event streams and determine whether an individual event within the stream should trigger an immediate action. For example, a fraud monitoring system might continuously query credit card transaction flows to detect patterns of unusual activity and stop a fraudulent transaction before it’s executed.
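
By contrast, a stream-oriented monitor applies its query to every event as it flows past, with no request needed. The sketch below uses a deliberately simplified, hypothetical rule (three high-value charges on one card within a minute) rather than an actual fraud model:

```python
from collections import deque

WINDOW_SECONDS = 60          # sliding window over the event stream
HIGH_VALUE = 500.0           # what counts as a high-value charge (illustrative)
MAX_HIGH_CHARGES = 3         # pattern: this many high-value charges in the window

recent_high_charges = {}     # card_id -> timestamps of recent high-value charges

def on_transaction(card_id, amount, timestamp):
    """Continuously applied to each event in the transaction stream."""
    window = recent_high_charges.setdefault(card_id, deque())
    while window and timestamp - window[0] > WINDOW_SECONDS:
        window.popleft()                      # drop events that slid out of the window
    if amount >= HIGH_VALUE:
        window.append(timestamp)
    # Pattern detected: trigger an immediate action before the charge completes.
    return "BLOCK" if len(window) >= MAX_HIGH_CHARGES else "ALLOW"

# A few events "in motion": (card, amount, seconds since the stream started).
stream = [("card-1", 650.0, 1), ("card-1", 720.0, 20),
          ("card-1", 900.0, 45), ("card-2", 30.0, 50)]
for card, amount, ts in stream:
    print(card, amount, on_transaction(card, amount, ts))
```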

Event stream processing: Think "rapid"

What sends a data management product to the top of the “hot” list? In a word – speed. Especially when that speed can gracefully accommodate the huge world of streaming data from the Internet of Things.

One of SAS’ hottest (and recently enhanced) products, SAS Event Stream Processing is an in-memory technology designed for speed. Its combination of high throughput (millions of events per second) and low latency beats out every other SAS product you can name. What’s more, this version of the software adds SAS Event Stream Processing Studio and Streamviewer, making it easier than ever to design, test and improve your projects.

Let’s use the word R-A-P-I-D to spell out how SAS Event Stream Processing delivers faster-than-ever insights into what’s happening right now.

Social media: The case for event stream data

As a consultant from the late 1990s until the late 2000s, I wrote thousands of reports for my clients. While their objectives and means (read: reporting applications) varied tremendously, each was designed to answer very specific historical questions. Of course, in many cases, a query or sub-report ultimately served to answer a much larger question.

Holistic analysis and event stream processing

Over the past year and a half, there has been a subtle shift in media attention from big data analytics to what is referred to as the Internet of Things, or IoT for short. The shift in focus is not intended to diminish the value of big data platforms and analytics, but rather to emphasize their place in the emerging future vision.

The key point has to do with the motivating factors for big data analytics, which we could conveniently differentiate into two categories: operational analytics, and what I might call “strategic” analytics. Operational big data analytics techniques absorb and analyze massive amounts of data to look for immediate benefits within operational applications. These could include overcoming impediments to timely deliveries in the supply chain in real time, or continuously monitoring manufacturing systems to adjust production to market demand. Strategic analytics examines massive amounts of data to identify facets of longer-term opportunities such as refined customer segmentation or retail location placement.

Integration and publication: Data management for analytics

Once you have assessed the types of reporting and analytics projects and activities to be performed by the community of data analysts and consumers, and have gathered their business needs and performance requirements, you can then evaluate – with confidence – how different platforms and tools can be combined to satisfy the end-to-end data management demands. This is particularly useful for ingestion and provision.

Ingestion comprises the tools and processes for data acquisition and persistence for at least three types of data sets: bulk data integration, interactive integration, and streaming data. For each of these categories, one or more of the following processes have to be supported:

  • Loading data onto the platform.
  • Profiling and validating the data to assess compliance with data usability expectations.
  • Applying transformations needed for data organization and alignment.
  • Storing the data.
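
Put together, those steps take roughly the shape of the hypothetical ingestion function below; the validation rules, transformations, and table name are stand-ins for whatever a real platform would apply:

```python
import sqlite3

def ingest(records, conn):
    """Minimal ingestion pass: load, profile/validate, transform, store."""
    # Load: acquire the incoming data set (here, records already in memory).
    loaded = list(records)

    # Profile and validate: assess compliance with basic usability expectations.
    valid = [r for r in loaded if r.get("customer_id") and r.get("amount") is not None]

    # Transform: organize and align the data to the target structure.
    aligned = [(r["customer_id"].strip().upper(), round(float(r["amount"]), 2))
               for r in valid]

    # Store: persist the data on the platform.
    conn.executemany("INSERT INTO fact_amounts VALUES (?, ?)", aligned)
    return {"loaded": len(loaded), "stored": len(aligned),
            "rejected": len(loaded) - len(aligned)}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_amounts (customer_id TEXT, amount REAL)")
print(ingest([{"customer_id": " c001 ", "amount": "12.5"},
              {"customer_id": "", "amount": "3.0"}], conn))
```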

In data governance’s service: Data virtualization, part 2

Data governance and data virtualization can become powerful allies. The word governance should not be understood here as law, but rather as support and vision for business analytics applications. Our governance processes must become agile in the same way our business is transforming. Data virtualization, being a very versatile tool, can offer a fast track to that flexibility.

Having discussed the way data virtualization can support data source management, let’s dig into another implementation scenario.

Analyzing the data lake

In my previous post I used junk drawers as an example of the downside of including more data in our analytics just in case it helps us discover more insights, only to end up with more flotsam than findings. In this post I want to float some thoughts about a two-word concept that is becoming almost as prevalent as big data and sounds scarily close to dumping all enterprise data into a junk drawer. That concept is the data lake, which was the topic of the great Data Lake Debate blog series between Tamara Dull and Anne Buff, moderated by Jill Dyché.

Let’s start with a definition. According to Dull, a data lake is a storage repository that holds a vast amount of raw data in its native format, including structured, semi-structured, and unstructured data. Any and all data, therefore, can be captured and stored in a data lake, and data structure and business requirements do not have to be defined until the data is needed. In theory, a data lake enables the enterprise to store, process, and analyze all of its data, allowing the business to ask more questions and get better answers.
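
One way to picture "structure defined only when the data is needed" is schema-on-read: raw events land untouched in their native format, and structure is imposed only at query time. The directory layout and field names in this sketch are hypothetical:

```python
import json
import os

LAKE_DIR = "lake/clickstream"                 # hypothetical landing zone for raw events
os.makedirs(LAKE_DIR, exist_ok=True)

# Capture: heterogeneous events are stored exactly as they arrive, no schema up front.
raw_events = [
    '{"user": "u1", "page": "/home", "ms_on_page": 4100}',
    '{"user": "u2", "page": "/pricing"}',                     # missing field, kept anyway
    '{"user": "u1", "device": {"os": "ios", "ver": "9.3"}}',  # different shape, kept anyway
]
with open(os.path.join(LAKE_DIR, "events.json"), "w") as f:
    f.write("\n".join(raw_events))

# Schema-on-read: structure is applied only when a question is finally asked.
def pages_viewed(user_id):
    with open(os.path.join(LAKE_DIR, "events.json")) as f:
        return [e["page"] for e in map(json.loads, f)
                if e.get("user") == user_id and "page" in e]

print(pages_viewed("u1"))
```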

As Dull described it, data management has evolved from wanting to store and process any and all data, but not being able to because of cost and technology limitations, to the era of big data, where technologies like Hadoop are cost-effectively enabling it but leaving us to question whether we should. The central question is whether collecting and storing data without a pre-defined business purpose is a good idea.
