Machine Learning at Scale with SAS and Cloudera


Imagine being able to get into your car and say “Take me to work.” Then it drives itself while you read the morning paper. We’re not there yet, but we’re closer than you think. Google has already developed a prototype for a driverless car in the U.S. Driverless cars are just one example of machine learning. It’s used in countless applications, including those that predict fraud, identify terrorists, recommend the right products to customers at the right moment and correctly identify a patient’s symptoms in order to recommend the appropriate medications.

The concept of machine learning has been around for decades. What’s new is that it can now be applied to huge quantities of complex data. Less expensive data storage, distributed processing, more powerful computers, and the analytical opportunities available have dramatically increased interest in machine learning systems.

Machine learning focuses on the construction and study of systems that can learn from data. The goal is to develop deep insights from data assets faster, extract knowledge from data with greater precision, improve the bottom line and reduce risk.

Considerable overlap exists between statistics and machine learning. Both disciplines focus on studying generalizations (or predictions) from data. A big difference between the two is that statistics focuses more on inferential analysis, using a sample to draw conclusions about the larger population it represents. Statistics also looks at things like parameter estimates, error rates, distribution assumptions and so forth to understand empirical data with a random component.
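To make the contrast concrete, here is a minimal sketch in plain Python. It is an illustration only: it does not use any SAS or Cloudera product, the data is synthetic, and the helper names (`fit_line`, `slope_standard_error`, `mean_squared_error`) are made up for this example. The same fitted line is read two ways: the statistics view asks how uncertain the slope estimate is, while the machine learning view asks how well the model predicts data it has not seen.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def slope_standard_error(xs, ys, a, b):
    """Standard error of the slope under the usual linear model assumptions."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(rss / (n - 2) / sxx)


def mean_squared_error(xs, ys, a, b):
    """Average squared prediction error of the fitted line on (xs, ys)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (an assumption for illustration): y = 1 + 2x + noise.
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [1.0 + 2.0 * x + random.gauss(0, 1) for x in xs]

# Hold out the last 50 points; fit on the first 150 only.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

a, b = fit_line(train_x, train_y)

# Statistics view: the parameter estimate and its uncertainty.
se_b = slope_standard_error(train_x, train_y, a, b)

# Machine learning view: prediction error on data not used for fitting.
test_mse = mean_squared_error(test_x, test_y, a, b)

print(f"slope estimate: {b:.3f} (standard error {se_b:.3f})")
print(f"held-out mean squared error: {test_mse:.3f}")
```

Neither number replaces the other. The standard error says how far to trust the estimated relationship, and the held-out error says how far to trust the predictions.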

Naturally, you want a scalable machine learning platform that provides enterprise-ready storage, data processing and management along with the analytics. The deep partnership between Cloudera and SAS provides modern distributed analytical products, such as SAS In-Memory Statistics for Hadoop and SAS Visual Statistics, co-located with your CDH5.0 cluster.

To learn more about Big Data and machine learning with SAS and Cloudera, check out the panel presentation and round table discussion on Hadoop at Analytics 2014 in Las Vegas!

  • Panel discussion with SAS and Cloudera on Big Data and Hadoop: Moving beyond the hype to realize your analytics strategy with SAS® - Monday, October 20, 3:00-3:50 pm
  • Round Table discussion on Practical Considerations for SAS Analytics in a Hadoop Environment - Tuesday, October 21, 12:30-1:45 pm

You can also check out our starter services on Visual Analytics and Visual Statistics and the Expert Exchange for Hadoop.


About Author

Wayne Thompson

Manager Data Science Technologies

Wayne Thompson, Chief Data Scientist at SAS, is a globally renowned presenter, teacher, practitioner and innovator in the fields of data mining and machine learning. He has worked alongside the world's biggest and most challenging organizations to help them harness analytics to build high-performing organizations. Over the course of his 24-year tenure at SAS, Wayne has been credited with bringing to market landmark SAS analytics technologies, including SAS Text Miner, SAS Credit Scoring for Enterprise Miner, SAS Model Manager, SAS Rapid Predictive Modeler, SAS Visual Statistics and more. His current focus initiatives include easy-to-use self-service data mining tools along with deep learning and cognitive computing tool kits.
