Does it seem like almost everything is a “big data” problem right now? And nearly every vendor is offering big data or big analytics solutions? Is big analytics more important than big data? And what is the difference? I've encountered this confusion in the market a lot over the last year as I’ve traveled the globe talking to business and government leaders about big data.
In the process of explaining the market to others, I've come up with a clearer way to understand the landscape. This explanation has helped a lot of businesses understand what type of analytic problems they actually have, and sometimes it helps them see that their problems are more of the big analytics variety instead of the standard issue of big data alone.
Sometimes, for example, you don’t have that much data but it’s still taking you five hours to run a marketing optimization job because of the number of possible offers. There really aren’t a lot of records but you have to do multiple passes on the data, running complex algorithms with each step. That’s a big analytics problem and not just a big data problem.
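To make that concrete, here is a rough back-of-the-envelope sketch in Python. The function name and every number in it are hypothetical, chosen only to illustrate the point: it counts how many times a job has to evaluate a record against an offer, and assumes that count is a fair stand-in for run time.

```python
def scoring_operations(records: int, offers: int, passes: int) -> int:
    """Rough count of record-by-offer evaluations across all passes.

    A simplifying assumption: each pass evaluates every record against
    every offer once, so the work grows as records * offers * passes.
    """
    return records * offers * passes


# Hypothetical reporting job: many records, one pass, nothing to choose between.
report_job = scoring_operations(records=50_000_000, offers=1, passes=1)

# Hypothetical marketing optimization: far fewer records, but many
# candidate offers and many passes over the same data.
optimization_job = scoring_operations(records=500_000, offers=200, passes=25)

# How much more work the "small data" job does than the "large data" one.
work_ratio = optimization_job / report_job
```

With these made-up figures, the optimization job has one hundredth of the records but does fifty times the work. The volume of data is not what makes it slow.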
Let’s dig into those differences a bit further.
Our first step is to revisit the distinction we’ve made over the years between reactive and proactive analytics. Standard business reports, ad hoc reports, OLAP and even alerts and notifications based on analytics are in the reactive category. Now, reactive analytics can still be very useful. They’re required for a lot of finance and regulatory reporting, and they help business users perform ad hoc analysis every day, but they are ultimately informing you about the past.
Proactive analytics, however, are forward-looking. They include optimization of complex problems with many dependencies, predictive modeling, forecasting, regression and other statistical analysis, and similar advanced methods. They allow you to identify trends, spot weaknesses or determine conditions for making decisions about the future.
The next thing we need to define is big data. Put simply, when you have exceeded the capacity of conventional database systems, you’re dealing with big data. Before that, it’s what I like to call “growing data.” It is still a large amount of data, but it hasn’t hit the limitations seen with big data.
Today, we can store lots and lots of data, but processing times have become excessive because traditional storage environments are not conducive to proactive analytics. When you have reached a point where processing times become unacceptable, you may be dealing with big data sizes, but you may also be dealing with big analytics.
To better understand the difference, let’s create a chart with the type of analytics on the Y axis (reactive at the bottom, proactive at the top) and the size of the data on the X axis (growing data on the left, big data on the right).
With that chart, we can see the four major types of software solutions available in the analytics market today. They are:
Business Intelligence (BI). If you are dealing with a large amount of data and providing reporting capabilities for end users so they can gain access to information, summarize data and even drill down into that data themselves, you are dealing with business intelligence applications. These solutions provide a strong look at various aspects of the company's past performance. That is BI, and it sits in the lower left quadrant in Figure 2.
Big data BI. Now, when data gets bigger and you’re dealing with outside data sources or, as more companies are starting to see, pulling in unstructured data, your problems get bigger too. It’s taking users too long to get the information they need, or you’re having a hard time combining data sources fast enough to provide reports like you used to. You need technology that allows quick access to data, but you’re still providing reactive analytics. This is the most common big data scenario in the market right now, and most businesses are trying to solve it with SQL-based solutions. That is big data BI. It is in the lower right quadrant of Figure 2.
Big analytics. As I mentioned before, it takes a different kind of analytics to support forward-looking decisions. If you’re looking at customer preferences, markdown optimizations or fraud predictions, you need a different type of architecture. These problems typically involve growing data sizes and proactive analytics. What slows you down is not the size of the data but the multiple passes you make over it and the advanced analytic calculations you run, which can take hours and hours to produce results. Today, you need those answers in seconds or minutes. This is big analytics. It is located in the upper left quadrant of Figure 2.
Big data analytics. Now, what about organizations that have a whole lot of data and are dealing with proactive decision making? Here, we’re talking about hundreds of millions of SKUs across multiple retail stores. We’re looking at future sources of data too, like telematics data in the auto industry, which can be useful for manufacturers and insurers. These are the types of problems most businesses really haven’t dealt with in the past. And these aren’t small data problems. You don’t want to summarize that information. Manufacturers want to predict safety problems before they impact customers, and insurance companies want to adjust rate plans for the best drivers, for example. This is big data analytics. You’ll find it in the upper right corner of Figure 2.
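If it helps to see the four categories as a simple decision rule, here is a minimal sketch in Python. The class, field and function names are my own illustrative assumptions, and so are the example workloads. The only logic it encodes is the two questions from the chart: is the analysis reactive or proactive, and has the data exceeded the capacity of conventional database systems?

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """A hypothetical description of an analytic problem."""
    name: str
    proactive: bool                # forward-looking (True) or reporting on the past (False)
    exceeds_conventional_db: bool  # big data (True) or still "growing data" (False)


def quadrant(workload: Workload) -> str:
    """Return which of the four categories a workload falls into."""
    if workload.proactive:
        if workload.exceeds_conventional_db:
            return "big data analytics"  # upper right
        return "big analytics"           # upper left
    if workload.exceeds_conventional_db:
        return "big data BI"             # lower right
    return "BI"                          # lower left


# Illustrative examples only, loosely based on the scenarios described above.
examples = [
    Workload("regulatory finance report", proactive=False, exceeds_conventional_db=False),
    Workload("reports over unstructured outside data", proactive=False, exceeds_conventional_db=True),
    Workload("marketing offer optimization", proactive=True, exceeds_conventional_db=False),
    Workload("telematics safety prediction", proactive=True, exceeds_conventional_db=True),
]

for workload in examples:
    print(f"{workload.name}: {quadrant(workload)}")
```

Real problems are rarely this clean-cut, and a single business will usually have workloads in more than one quadrant. The sketch is only meant to show that the two questions are independent of each other.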
My point here is not to say that one is better than the other, but that they each do different things and they each require different architectures. As you look at what’s going on in the market and in your business, understand the difference between each of these four areas and how the different problems can be solved.
Analytics continues to be a broad term in the market, but it’s worthwhile to look at the problems you are trying to solve and determine where you fall in this landscape. It will help determine what your next steps are in your big data journey.
I’ll be presenting these concepts in more detail later this week at The Premier Business Leadership Series. If you’re attending, stop by after the presentation and let me know if this is a useful breakdown for you. I’d love to hear your thoughts.