Data management has never been the shiny object that caught the imagination of the mainstream. And let’s be honest, it's not nearly as interesting as analytics, machine learning or artificial intelligence. In fact, entire movies get created about analytics, and people actually pay to see them! Data management? Not so much.
But what data management lacks in glamor, it more than makes up for as a top concern for organizations worldwide. According to the World Economic Forum, the amount of data we produce is predicted to reach a staggering 44 zettabytes in 2020. On a daily basis we'll send 294 billion emails and tweet over 500 million times. It's all data, and it all needs to be managed.
While the term ‘big data’ is now regarded as passé, the goal of getting better and tighter insights from data is more relevant than ever. The promise of big data never came from simply having more data – and from more sources – but from developing analytical models that learn and improve as they consume more data. A lot of work is being done to advance models and put them into production – but it's all for naught if organizations don't have a strong data management program in place.
Data management solutions at SAS
At SAS, our data management solutions have been developed not only with operational use cases in mind, but also to support our customers’ analytics investments.
SAS has always provided strong data management solutions that include:
- Access. Get read and write access to data stored on different platforms and in many different forms – like relational database systems, flat files, etc.
- Integration. With enterprise-level tools that support extraction, movement, transformation and loading of data between systems, you can migrate and blend data between nearly any type of platform, database or file format.
- Cleansing. Standardize, cleanse, enrich and correct data to support various business needs – from legal compliance to address standardization to uniform content formatting.
- Governance. Create and manage data definitions, rules and policies associated with accessing, sharing and using data – and let users monitor, measure and interact directly with the data to resolve issues.
As we move further into 2020, data management will continue to advance and develop efficiencies that make getting data ready for business purposes faster and more reliable than ever. While data management is a diverse field in its practices, three primary areas will be of key importance in the months to come:
- Data orchestration. The uniting of data integration, API integration and data movement to support DataOps techniques. This involves combining multiple technologies to deliver a single data flow application to coordinate data-related activities across varied locations on-premises or in the cloud.
- Data discovery. A common catalog for finding, provisioning, securing and understanding data and other objects is acknowledged as the glue of enterprise software, and it is important to all organizations. Furthermore, applying advanced analytics makes it possible to automate mundane data management tasks and find value in data that was previously too difficult to discern.
- Automating data preparation. To expand data manipulation activities to a wider audience, the development of advanced data transformation using AI to automate cleansing and blending will empower non-technical users.
To meet the orchestration, discovery, and preparation needs of our customers, SAS is investing in new features and products to augment its comprehensive data management suite. For example, SAS is developing an information catalog that will index and profile your data assets. To facilitate ease of use, a natural language, ranked search will quickly locate content that is optimized for a task. Content is automatically collected and profiled when it enters the asset catalog so that you can discover new areas that you may not have been aware of, and social networking allows users to collaborate with each other on data. Metadata exchange is also supported between and across diverse systems through SAS’ participation in the Linux Foundation’s Open Metadata and Governance (ODPi) initiative, which is an open platform for big data governance. The information catalog will be a huge time saver when you need fast access to high-quality data.
Another significant area of development is in data orchestration. SAS’ patented multi-lingual code generation capabilities are expanding to include data orchestration across different runtime environments. With runtime optimization, data preparation tasks can be pushed down to where the data lives so that it's prepared in advance in the data platform, avoiding costly data movement. These capabilities facilitate big data ETL so that you can deliver data faster and more efficiently to analytics or reporting projects.
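To make the push-down idea concrete, here is a minimal sketch in Python. It is not SAS code: an in-memory SQLite database stands in for the data platform, and the `orders` table and both function names are made up for illustration. The first function pulls every row out and aggregates on the client side; the second sends the aggregation to the database so only the summary rows move.

```python
import sqlite3

# Assumption: SQLite stands in for "where the data lives".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 75.0)],
)


def totals_client_side(conn):
    """Move every row to the client, then aggregate there."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM orders"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_pushed_down(conn):
    """Let the database aggregate; only one row per region comes back."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region"
    )
    return dict(rows)


print(totals_client_side(conn))
print(totals_pushed_down(conn))
```

Both functions return the same totals. The difference is how much data crosses the wire, which is what matters once the table holds millions of rows rather than three.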
Finally, SAS is leveraging AI to enable automation of data management tasks. SAS delivered its first cognitive data management capabilities in 2019. SAS’ recommendation engine automatically profiles your data, detects problems such as duplicates, outliers and missing values, and makes recommendations to improve it. The recommendations improve with use. SAS is also investing in AI for data quality that can automatically discover where data such as names and addresses, or sensitive data such as bank and government IDs, exists. These features help you identify important content, as well as comply with auditing and privacy rules. Ultimately these features help you to demonstrate to your customers that you are a trustworthy steward of their data.
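For readers who want a feel for what profiling a column involves, the following is a rough sketch in plain Python. It is not how the SAS recommendation engine works. The function names, the 1.5 × interquartile-range outlier rule and the suggestion wording are all assumptions chosen for illustration, and nothing here learns from use.

```python
from collections import Counter
from statistics import quantiles


def profile_column(values):
    """Profile one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    # Quartiles via the standard library's default ("exclusive") method.
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return {
        "missing": len(values) - len(present),
        "duplicates": sorted(v for v, n in counts.items() if n > 1),
        "outliers": sorted(v for v in present if v < low or v > high),
    }


def recommend(profile):
    """Turn a profile into plain-language suggestions (hypothetical wording)."""
    suggestions = []
    if profile["missing"]:
        suggestions.append(f"Fill or drop {profile['missing']} missing value(s)")
    if profile["duplicates"]:
        suggestions.append(f"Review repeated values: {profile['duplicates']}")
    if profile["outliers"]:
        suggestions.append(f"Inspect possible outliers: {profile['outliers']}")
    return suggestions


ages = [10, 12, 11, 13, 12, None, 95]
report = profile_column(ages)
for line in recommend(report):
    print(line)
```

On this sample the profile flags one missing value, the repeated value 12 and the suspicious 95. A production tool would apply many more checks and rank its suggestions, but the shape is the same: profile first, then recommend.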
To learn more about SAS and how to take your past and current data management practices into the future, check out this white paper: 5 Data Management for Analytics Best Practices.