Big data is of no use unless you can turn it into information and insight. For that you need big analytics. Every piece of the analytics cycle has been impacted by big data, from reporting, which must quickly render reports from billions of rows of data, to advanced analytics like forecasting and optimization, which require complex math executed over multiple passes through the data set.
Without changes to the technology infrastructure, analytic processes on big data sets will take longer and longer to execute. It’s not enough now to push the button and wait hours or days for an answer. Today’s advanced analytics need to be fast and they need to be accessible. This means more changes to the technology infrastructure to support these new processes.
Analytics companies like SAS have been developing new methods for executing analytics more quickly. Below is a high-level description of some of these new methodologies, including why they provide an advantage. Once again, the intention is to provide enough detail to start conversations with IT counterparts (or understand what they are talking about), certainly not to make you an expert. There is a ton of information out there if you want more detail!
- Grid computing and parallel processing – Calculations are split across multiple CPUs to solve a bunch of smaller problems in parallel, as opposed to one big problem in sequence. Think about the difference between adding a series of 8 numbers in a row versus splitting the problem into four sets of two, handing them out to four of your friends, and then adding up their four answers. To accomplish this, multiple CPUs are tied together, so the algorithms can access the resources of the entire bank of CPUs.
- In-database processing – Most analytic programs lift data sets out of the database, execute the “math” and then dump the data sets back in the database. The larger the data sets, the more time-consuming it is to move them around. In-database analytics bring the math to the data. The analytics run in the database with the data, reducing the amount of time-consuming data movement.
- In-memory processing – This capability is a bit harder to understand for non-technical people, but it provides a crucial advantage for both reporting and analytics. Large sets of data are typically stored on the hard drive of a computer, which is the physical disk inside the computer (or server). It takes time to read the data off the physical disk, and every pass through the data adds more time. It is much faster to conduct analysis and build reports from the computer’s memory. Memory is becoming cheaper today, so it is now possible to add enough memory to hold “big data” sets for significantly faster reporting and analytics.
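For readers who like to see an idea in code, here is a minimal Python sketch of the first two methods above. It is an illustration under stated assumptions, not how SAS or any other vendor implements them. The function names, the `sales` table, and the sample numbers are all made up for this example. A thread pool stands in for the bank of CPUs (it shows the split-and-combine pattern, not a real speedup), and a small SQLite database stands in for the enterprise database.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(numbers, workers=4):
    """Split the numbers into chunks, add each chunk concurrently, then combine.

    The thread pool is a stand-in for a grid of CPUs (an assumption made
    for illustration); each "friend" gets one chunk to add up.
    """
    size = max(1, -(-len(numbers) // workers))  # ceiling division
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    return partial_sums, sum(partial_sums)


def build_demo_db():
    """Create a tiny hypothetical sales table to play the role of the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("A", 10.0), ("A", 5.0), ("B", 7.5)],
    )
    return conn


def totals_after_extract(conn):
    """Traditional approach: lift every row out of the database, then do the math."""
    totals = {}
    for store, amount in conn.execute("SELECT store, amount FROM sales"):
        totals[store] = totals.get(store, 0.0) + amount
    return sorted(totals.items())


def totals_in_database(conn):
    """In-database approach: send the math to the data; only the results move."""
    query = "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    partials, total = parallel_sum([3, 1, 4, 1, 5, 9, 2, 6], workers=4)
    print(partials, total)

    conn = build_demo_db()
    print(totals_after_extract(conn))
    print(totals_in_database(conn))
```

Both database functions return the same totals. The difference is what travels: the first moves every row to the program, while the second moves only one summary row per store, and that gap is what grows with the size of the data.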
To give you an idea of the scale of the impact, by applying these methodologies we have been able to render a summary report (with drill-down capability) from a billion rows of data in seconds. Large-scale optimizations like risk calculations for major banks, or price optimization for thousands of retail products across hundreds of stores, have gone from taking hours or days to taking minutes or seconds to calculate. As you can tell, the advantages are tremendous. Organizations can now run analytics on their entire data set, rather than a sample. It is possible to run more analyses more frequently, testing scenarios and refining results.
Here are some examples of how innovative companies are applying big analytics to get value from their big data:
- Airline companies are incorporating the voice of the customer into their analyses, by mining all of the internal and external unstructured text data collected across channels like social media, forums, guest surveys, call center logs, and maintenance records for passenger sentiment and common topics. With big text analytics, these organizations are able to analyze all of their text data, as opposed to small samples, to better understand the passenger experience and improve their service and product offerings.
- A major retailer is keeping labor costs down while maintaining service levels by using customer traffic patterns detected by security video to predict in advance when lines will form at the register. This way, staff can be deployed to various stocking tasks around the store when there are no lines, and still be given enough notice to open a register as demand increases, before lines start to form.
- A major hotel company has deployed a “what if” analysis in its revenue management system that allows users to immediately see the impact of price changes or forecast overrides on their demand, by re-optimizing around the user’s changes. Revenue managers no longer have to make a change and wait until the overnight optimization runs.
Unlocking the insights in big data with big analytics will require making some investments in modernizing technology environments. The rewards for the investment are great. Organizations that are able to use all that big data to improve the guest experience while maximizing revenue and profits will be the ones that get ahead and stay ahead in this highly competitive environment!