David Linthicum wrote a blog post entitled "3 winners, 3 losers in the move to big data" on InfoWorld, noting that traditional vendors "did not see this coming" (big data, that is). David made some interesting points, some of which I agree with and some of which I don't, so I felt it worthwhile to provide my perspective. Here is my response to his blog post...
Hello David –
It’s been a while since we connected – our discussions about SOA, data federation, and database connectivity seem like such a distant memory! I’m glad that you are addressing big data and analytics topics in the context of cloud – I find your comments interesting, if somewhat provocative.
Although I agree that the big data phenomenon will be disruptive to some traditional vendors, at SAS we are seeing a huge benefit from the interest in big data.
- The hype has helped communicate the fact that analytics is not just about BI and reporting. In addition to historical reports and dashboards, analytics can be used to view things predictively and can be used to optimize operations based on those predictions.
- We are also seeing a better understanding of the benefits of integrating operational and analytic systems. Since big data is being driven by additional transaction data as well as contextual data (such as social media), people are starting to think about them together. We talk about the entire data-to-decision lifecycle, where analytics are embedded directly in operational or transactional systems. Examples include analyzing every single credit card swipe with rich analytics, or leveraging analytics to determine the best call script to drive a CRM-based interaction with salesforce.com.
- Companies that are already leveraging advanced forms of analytics, such as predictive analytics and optimization, are now able to leverage new forms of data and build more robust analytical models. They are able to develop and run these models in a fraction of the time that it used to take, which means they can run additional “what if” models, factoring in additional variables while analyzing entire datasets. We now have retailers that are able to optimize pricing at the individual SKU and store level with high frequency, rather than applying costly mark-downs at the category and region level. We have banks that can better optimize their risk portfolios because they can analyze all of their customer data and all of their transaction data, along with social media data, without the need to sample the data. We are supporting intelligence and law enforcement efforts with an innovative “stream it, score it, store it” approach that uses rich analytics up front to pick out the 1% of relevant data that streams through their organization. Analytics are applied at the front end of the information continuum, rather than storing first and analyzing later.
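For readers who want to picture the “stream it, score it, store it” idea, here is a minimal Python sketch. It is only an illustration of the pattern, not how any SAS product implements it: the function `stream_score_store`, the scoring rule `toy_score`, the threshold, and the sample events are all hypothetical names and values made up for this example.

```python
from typing import Callable, Iterable


def stream_score_store(
    events: Iterable[dict],
    score: Callable[[dict], float],
    threshold: float,
    store: Callable[[dict], None],
) -> int:
    """Score each event as it arrives and store only the relevant ones.

    Returns the number of events that were stored.
    """
    kept = 0
    for event in events:  # "stream it": handle events one at a time
        if score(event) >= threshold:  # "score it": apply the model up front
            store(event)  # "store it": persist only what scored as relevant
            kept += 1
    return kept


def toy_score(event: dict) -> float:
    """Hypothetical scoring rule: flag unusually large amounts."""
    return 1.0 if event["amount"] > 1000 else 0.0


# Hypothetical sample data; an in-memory list stands in for real storage.
stored: list = []
events = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 5000},
    {"id": 3, "amount": 20},
]
kept = stream_score_store(iter(events), toy_score, 0.5, stored.append)
```

The point of the sketch is the ordering: scoring happens before anything is written, so the store only ever sees the small slice of data that the model considers relevant.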
From a technical perspective, we have long taken the approach of leveraging any database technology that is available, regardless of the license type. We have also taken a very aggressive stance in terms of developing new technologies, and big data is simply not a new concept for us. We have leveraged distributed technologies such as grid, we have long since moved the processing to the data with our in-database approach, and our latest advancement is leveraging in-memory. Our in-memory approach is different from that of database vendors, since we use an in-memory analytical engine that is built specifically for analytics rather than data storage. It can run on the same infrastructure that supports the databases, including EMC Greenplum, Teradata, and Hadoop. In addition, our high performance analytics capabilities support multiple architecture patterns – from visualization capabilities that support a large number of users to high-end analytic modeling work that not only accommodates big data, but also allows for the efficient management of many complex analytical models.
Since you touched on Hadoop, and since it is all the rage, our high performance story also leverages and supports Hadoop. As with other databases, we support the ability to 1) leverage Hadoop data in any of our analytical products and 2) manage data that is in Hadoop using our data management solutions, which include data integration, data quality, MDM and data governance capabilities. We support the ability to author Hadoop code in HDFS, Hive, Pig, and MapReduce in our graphical development environment, and it is possible to create job flows that mix processing capabilities from SAS as well as Hadoop. In addition, we use Hadoop as the persistent storage mechanism in our Visual Analytics product – this enables fast loading of data into memory, where a visual tool can instantaneously present millions of rows of data. This capability also supports mobile display through devices such as the iPad.
Our cloud business is leading the way in terms of our revenue growth – again, this is nothing new. SAS OnDemand has provided a cloud-based option for customers for many years. This includes the ability to leverage SAS solutions as well as our business analytics capabilities that span information management, business intelligence and analytics. SAS customers have the flexibility to turn the entire operation over to SAS, or to leverage the SAS infrastructure while still being involved in analytical modeling, data preparation, etc.
You may consider SAS a traditional vendor, but we actually see the big data trend as industry hype that is catching up to what we have done all along. We look forward to leveraging Hadoop and other emerging technologies as vehicles that will help us improve our clients' ability to make decisions that drive competitive advantage.
IT/CIO Thought Leader & Strategist, SAS