In April, the free trial of SAS Data Loader for Hadoop became available globally. Now, you can take a test drive of our new technology designed to increase the speed and ease of managing data within Hadoop. The downloads might take a while (after all, this is big data), but I think you’ll be pleasantly surprised at how quickly you can manipulate data on Hadoop distributions (offered by our partners Hortonworks and Cloudera).
SAS Data Loader for Hadoop is the latest SAS technology purpose-built and optimized for Hadoop, as more of our customers adopt this low-cost, high-capacity way to manage massive amounts of data. Industry estimates suggest that Hadoop usage has grown by 60 percent over the last two years. For our customers, it's important to apply analytics and data management "from," "with," and "in" Hadoop. SAS technologies can read and write data on Hadoop (from), lift data in parallel into memory to run high-performance analytics (with), and run analytical processing inside the Hadoop cluster (in) for better performance, less data movement and stronger governance.
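To make the "from" and "in" patterns a bit more concrete, here is a minimal sketch using SAS/ACCESS Interface to Hadoop, assuming that interface is licensed and a Hive server is reachable. The server name, credentials and table names are placeholders for illustration only, and this is separate from the point-and-click directives in SAS Data Loader for Hadoop itself.

/* "From" Hadoop: the HADOOP LIBNAME engine registers Hive tables so     */
/* SAS programs can read and write them like ordinary SAS data sets.     */
/* (Server, user and password values are placeholders.)                  */
libname hdp hadoop server="hive-node.example.com" user=sasuser password=XXXXXX;

/* "In" Hadoop: explicit SQL pass-through sends the query to Hive, so    */
/* the aggregation runs inside the cluster and only the summary rows     */
/* travel back to SAS, reducing data movement.                           */
proc sql;
   connect to hadoop (server="hive-node.example.com" user=sasuser password=XXXXXX);
   create table work.sales_summary as
      select * from connection to hadoop
         ( select customer_id, sum(amount) as total_spend
             from transactions
            group by customer_id );
   disconnect from hadoop;
quit;

The "with" pattern, lifting data in parallel into memory for high-performance analytics, relies on SAS in-memory servers and is left out here to keep the sketch small.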
In the latest TDWI Data Innovations Showcase, titled “Self-Service Big Data Preparation in the Age of Hadoop,” Tamara Dull and I respond to questions about data management and data preparation on Hadoop. We also discuss Hadoop's shortcomings, the big data skills gap, and the importance of data quality and data governance in the world of big data. Finally, we go into detail about SAS Data Loader for Hadoop and how it can help businesses generate value from big data, reduce training costs, and improve the productivity of IT, business users and data scientists alike.
What's exciting for us is that we're seeing SAS being used across the spectrum as companies move to Hadoop. We work with organizations doing data discovery and data preparation to make sure that their big data isn't a big pile of ... garbage. Then, our core analytics capabilities help with analytic model development, deployment and monitoring.
SAS Data Loader for Hadoop is a key ingredient in that continuum. It provides a simple way to prepare, integrate, and cleanse big data so that organizations can make decisions that really matter.
How is your organization using Hadoop? And is data preparation in Hadoop a big concern for your organization?