Now that another summer of 12-hour family road trips to Maine and Ohio, pricey engineering and basketball camps for the kids, and beating the heat at the beach is over, I've taken a fresh look at what people are focused on with their data – and what SAS is providing in the data management space.
First of all, "metadata bridges" isn't Jeff Bridges' younger brother. HAWQ isn't an acronym derived from the military-industrial complex. And gender analysis is determining a "male" or "female" designation from a customer name in a data table. So, what do these terms have in common? They are all ways that SAS is responding to the challenges our customers are facing in handling their data.
This summer, while many of us were enjoying vacations, SAS released new versions of several data management products:
- SAS® Data Loader for Hadoop.
- SAS® Data Management.
- SAS® Data Quality.
- SAS® Data Governance.
- SAS/ACCESS® Interface to Oracle, SQL Server, DB2, Hadoop, Impala, Teradata, Greenplum and PC Files.
With this latest release, SAS continues to broaden the tools available to those who want to do more with data but were traditionally kept outside the process. This technology puts more self-service functionality into the hands of business analysts and data scientists, helping them access and profile more data sources faster. They can then apply additional ways to cleanse their data, freeing up IT to focus on other areas of the business. And IT can get a more holistic view of its enterprise data assets and how they relate to each other.
Here are the top three things to know about these releases:
1) Improved enterprise metadata management
Thanks to the SAS Meta Bridge Relationship Loader, now part of the SAS Data Management, SAS Data Quality and SAS Data Governance bundles, users get a holistic view of both SAS and third-party metadata.
Why is this a big deal? Because now, users can see how the changes they make to data – whether in an Oracle table or a SAS data set – can affect other data assets in their organization, such as data models, tools or even analytical models. Being able to identify these ripple effects (and make adjustments along the way) will result in more accurate analytics and reporting. This capability also provides agility, so organizations can make changes more rapidly to their data architecture.
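To make the "ripple effect" idea concrete, here is a minimal Python sketch of impact analysis over lineage metadata. It is an illustration only, not how the SAS tooling is implemented: the `LINEAGE` mapping, the asset names and the `downstream_assets` function are all hypothetical, and the sketch simply walks a dependency graph to list everything that sits downstream of a changed asset.

```python
from collections import deque

# Hypothetical lineage metadata: each asset maps to the assets that read from it.
# These names are invented for illustration and do not come from any SAS product.
LINEAGE = {
    "oracle.customers": ["sas.customer_clean"],
    "sas.customer_clean": ["model.churn_score", "report.monthly_kpis"],
    "model.churn_score": ["report.retention_dashboard"],
}


def downstream_assets(lineage, changed):
    """Return every asset reachable from `changed`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in lineage.get(current, []):
            if dependent not in seen:  # guard against revisiting shared or cyclic nodes
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


affected = downstream_assets(LINEAGE, "oracle.customers")
for asset in affected:
    print("Affected by change:", asset)
```

In this toy example, a change to the Oracle table touches a SAS data set, an analytical model and two reports, which is the kind of cross-tool view the paragraph above describes.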
2) Improved big data integration
With the latest release of SAS Data Loader for Hadoop, users can import delimited files, such as CSV files exported from Excel, into Hadoop without writing code – a huge time-saver and a key self-service capability for data scientists and business analysts. SAS also now supports additional Hadoop distributions, such as MapR.
In the SAS Data Integration Studio component of SAS Data Management, ETL developers can more easily create data integration jobs and use metadata from Pivotal HAWQ and OSIsoft PI data sources. Updates also include improved forking, looping and conditional branching logic, which takes advantage of parallel processors for better performance, and an improved batch deployment utility that adds support for z/OS.
3) Improved data quality for both structured and unstructured data
SAS Data Loader for Hadoop now allows users to better understand both structured and unstructured data thanks to features and enhancements like unstructured data field extraction, identification analysis, improved profiling performance and additional data quality directives including gender analysis and pattern analysis. All of these improvements generate cleaner, more accurate data, which will yield more precise analytical models – and help organizations make better business decisions.
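For readers new to the term, pattern analysis generally means summarizing the character "shape" of each value in a column so that inconsistent formats stand out. The Python sketch below shows the general idea under simple assumptions: the function names, the `A`/`a`/`9` symbols and the sample phone numbers are all invented for illustration, and none of it is taken from the SAS directive itself.

```python
from collections import Counter


def char_pattern(value):
    """Map digits to '9', uppercase letters to 'A', lowercase letters to 'a'.

    Any other character (punctuation, spaces) is kept as-is.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)
    return "".join(out)


def pattern_frequencies(values):
    """Count how often each character pattern occurs in a column of values."""
    return Counter(char_pattern(v) for v in values)


# Made-up sample column with three different phone number formats.
phones = ["919-555-0100", "919-555-0101", "(919) 555-0102", "9195550103"]
freq = pattern_frequencies(phones)
for pattern, count in freq.most_common():
    print(pattern, count)
```

A column where most values share one pattern and a few do not is a natural candidate for standardization before it feeds an analytical model.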
Download the “Data Integration Déjà vu: Big Data Reinvigorates DI” white paper to learn more about big data integration. And don't forget: SAS offers a free trial of SAS Data Loader for Hadoop, which you can download from the SAS Data Loader for Hadoop page on sas.com.