Like it or not, "Big Data" is getting a lot of coverage, both in traditional media and in social media chatter. The traditional IT analyst firms are selling reports about it. The Economist is calling it "The Data Deluge". Heck, even McKinsey is onto it, and you know it's important if McKinsey is talking about it!
After reading through the various articles, the one thing I think everyone can agree on is that there is no agreed definition of "Big Data". There are, however, common themes in the characteristics of big data and its implications for organisations, especially when they try to leverage it to drive insights through analytics.
- VOLUME: This is the obvious one, and what everyone thinks of when they hear the word BIG. Whatever area you look at, there is certainly no shortage of quotes and metrics to highlight the issue at hand. A recent study by McKinsey estimates that organisations across all sectors in the US have at least 100 terabytes of data, with many having more than 1 petabyte. Scarier still, many predict that this figure will double every 6 months going forward.
- VARIETY: Both internally and externally, organisations are no longer dealing only with nice, relational, tabular data. Internally, up to 85% of the information within an organisation is considered unstructured. This ranges from content that is being handled reasonably well, such as semi-structured content (XML), to content that organisations continue to struggle with, such as free-form text, audio and video. Externally, the rise of social media presents tremendous opportunities to get to know your customers better, but it also presents unique challenges in making sense of what is often a large amount of highly unstructured text.
- COMPLEXITY: With additional sources of data comes extra complexity in trying to make sense of it all. There is now a greater need to link, match and transform data across business units, systems, subsidiaries and external partner organisations. One example is the drive for a "Single Customer View" across the enterprise. The complexity involved in getting a 360-degree view of your customer is creating completely new categories of software, such as master data management (MDM).
- VELOCITY: Not only is the data arriving in larger volumes, with extra complexity and variety, it is also arriving at greater speed. Initiatives such as RFID tagging and smart metering are driving an ever greater need to deal with the torrent of data in near real time. This, coupled with the drive to be more agile and deliver insight more quickly, is putting tremendous pressure on organisations to build the infrastructure and skill base they need to react quickly enough.
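To put the doubling prediction from the VOLUME point into perspective, here is a minimal Python sketch of the arithmetic. It takes the "doubles every 6 months" prediction at face value and assumes it holds at a constant rate; the function names are made up for illustration, and 1 petabyte is treated as 1,000 terabytes.

```python
import math


def projected_volume_tb(start_tb, months, doubling_months=6.0):
    """Data volume in TB after `months`, assuming it doubles
    every `doubling_months` (the prediction quoted above)."""
    return start_tb * 2 ** (months / doubling_months)


def months_to_reach(start_tb, target_tb, doubling_months=6.0):
    """Months needed to grow from `start_tb` to `target_tb`
    under the same constant-doubling assumption."""
    return doubling_months * math.log2(target_tb / start_tb)


# Starting from the 100 TB figure cited in the VOLUME point.
for months in (6, 12, 24, 36):
    print(months, "months:", projected_volume_tb(100, months), "TB")

# Assumption: 1 PB = 1,000 TB (decimal units).
print("Months from 100 TB to 1 PB:", round(months_to_reach(100, 1000), 1))
```

If the prediction were to hold, an organisation sitting on 100 terabytes today would pass the 1 petabyte mark in well under two years, which is why volume alone is enough to strain existing infrastructure.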
At SAS, we continue to work with customers on a number of these challenges, and we believe it is a combination of best practice, process and technology that will help organisations bridge the gap between big data and big insight.
What about you? Do you believe your organisation has a big data problem today or will face it soon?