Big data, by which most people mean Big Volume, doesn’t get you very far by itself, but add Big Variety and analytics, and now you’re talking. In fact, most organizations that are making headway in capitalizing on their data assets now refer to the process as "big data analytics" – a combination of data storage and management, data integration, and analytic tools and techniques.
I have previously made the case that the value in big data stems primarily from its Big Variety, and now I want to put that variety into its proper context of related data volumes and analytics for insights.
The potential for getting this combination of Volume + Variety + Analytics right can perhaps be best illustrated by the findings of a U.S. Department of Defense review committee on the September 11, 2001 attacks. What the committee found was that essentially all 19 of the hijackers could have been linked to each other and to the pending attacks via just seven properly targeted mouse clicks through existing public/government databases:
- Two of the hijackers who made reservations on American flight 77 were on the CIA terrorist watch list.
- Matching the addresses those two provided to the airlines would have revealed three more hijackers sharing those addresses, all with flight reservations for the morning of Sept 11.
- The call-back phone numbers supplied to the airlines by the three above would have uncovered five more hijackers using those same phone numbers, again, with reservations for 9/11 flights.
- One more hijacker used the same frequent flyer number as one of the five above.
- Two more hijackers used the same address as one of the five.
- A search of the Expired Visa / Denied Entry list from the database of the US Immigration and Naturalization Service would have exposed one more hijacker on a September 11 morning flight.
- Lastly, this Expired Visa hijacker shared the same address with the remaining five hijackers, who, when checked against flight data, would have been found to also have reservations for that fateful morning.
This committee report became the catalyst for a complete overhaul of how the US government would acquire, manage, integrate, share and use intelligence data in the future. Effectively combatting terrorism would require creating a data environment that allows for those seven clicks to be made, where the clicks represent the analysis done on the Big Volume of data (e.g. flight reservations) residing in multiple, disparate Big Variety data silos.
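To make the "clicks" idea a little more concrete, here is a minimal sketch in Python of that kind of linkage: start from a seed list, then repeatedly pull in anyone who shares an address, phone number, or frequent flyer number with someone already linked. Everything in it is an assumption made for illustration. The records are invented placeholders (`P1`, `A1`, and so on), the field names and the `link_passengers` function are hypothetical, and nothing here reflects how any actual government system works.

```python
from collections import defaultdict

# Invented placeholder records; each stands in for one reservation
# drawn from a different data silo. None of this is real data.
RESERVATIONS = [
    {"id": "P1", "address": "A1", "phone": "T1"},
    {"id": "P2", "address": "A1", "phone": "T2"},
    {"id": "P3", "address": "A2", "phone": "T2"},
    {"id": "P4", "address": "A3", "phone": "T3", "ff": "F1"},  # unrelated
    {"id": "P5", "address": "A2", "ff": "F2"},
    {"id": "P6", "address": "A4", "ff": "F2"},
]
WATCH_LIST = {"P1"}  # hypothetical seed list


def link_passengers(reservations, seeds, keys=("address", "phone", "ff")):
    """Return (linked ids, number of hops) reachable from the seeds
    through shared attribute values. Hypothetical helper."""
    index = defaultdict(set)  # (field, value) -> ids sharing that value
    by_id = {}
    for rec in reservations:
        by_id[rec["id"]] = rec
        for key in keys:
            value = rec.get(key)
            if value is not None:
                index[(key, value)].add(rec["id"])

    linked = {s for s in seeds if s in by_id}
    frontier = set(linked)
    hops = 0
    while frontier:
        found = set()
        for pid in frontier:
            for key in keys:
                value = by_id[pid].get(key)
                if value is not None:
                    found |= index[(key, value)]
        found -= linked  # keep only newly discovered ids
        if found:
            hops += 1  # one "click" surfaced new links
        linked |= found
        frontier = found
    return linked, hops


linked, hops = link_passengers(RESERVATIONS, WATCH_LIST)
print(sorted(linked), hops)
```

Each pass of the loop plays the role of one click: it only works because the address, phone, and frequent flyer fields from different sources have been brought together where they can be matched. Real data would also need cleaning and fuzzy matching before exact lookups like these could be trusted.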
Another way to look at the big data analytics process is to see how big data can be used to ‘paint a picture’ of your target in the style of the Pointillists of the late 19th century, the best known of these painters probably being Georges Seurat.
The graphic above shows some detail from his Parade de Cirque (1887–88). The figure consists of just seven or eight distinct colors, corresponding to our Big Variety data sources, applied as individual dots of unblended color. Applied one by one to the canvas, the dots would slowly make a picture emerge. But a picture of what? Even with all three of the most common colors in place, the green, orange and blue, the big volume colors, you'd be hard pressed to guess the ultimate subject matter. But with the addition of just a few more key darker colors, the black, red and dark blue, the emerging man’s head would become unmistakable.
Painting a picture with big data analytics works much the same way: plotting points of data on the virtual analytic canvas until a recognizable pattern begins to emerge.
Whether it's analyzing customer data for churn and retention, production data for quality, patient data for outcomes, sensor data for impending equipment failure, credit card transactions for fraud, or criminal justice data for possible terrorist activity, big volume is just the starting point. To get to the actionable insights you want, to paint a clear, intelligible picture of your target objective, you need to combine that volume with the data integration of big variety, and then apply the analytical tools and techniques that help you identify the hidden patterns and trends.
Data without the analytics is, well, just data. Analytics without the volume and variety means leaving insights on the table. It takes all three. It’s a process, it’s a discipline, it’s a culture – it’s the big data analytics approach to creating business value.