Would you rather analyze or prepare?


Are you one of those people who love doing analysis? If so, there is nothing quite like analyzing customer patterns, revenue trends, inventory levels, cost optimization and so on, especially when you can use that analysis to make changes that will optimize your business. But before you can start the analysis, you have one small hurdle to overcome: data preparation. And for many of us, getting data ready to analyze is the part that we dread.

There are many obstacles that complicate the data preparation process:

  • Data distributed across multiple systems or geographies.
  • Massive amounts of data.
  • Transactional data that is not formatted for analysis.
  • Data of all different types and formats.
  • Overlapping, redundant or semi-trusted data.
  • Ownership and access issues.

Then, once you have completed the data preparation step, how do you keep the data up to date? Or do you repeat the whole process every time you start a new analysis? Unfortunately, there is no silver bullet, but we do have a number of resources that will help guide your efforts, including the following:

Data Integration - An Insider's Perspective Webinar, March 16th at 1 PM EST. Join Joseph Randazzo, Manager of Actuarial Services at Excellus BlueCross BlueShield, and Scott Chastain, from the SAS Americas Technology Practice, as they discuss how Excellus has reduced the time spent on scrubbing, extracting, transforming and loading data so that they can spend more time on analysis. Excellus has accomplished this even though its environment handles more than 100 million member and claims records relating to 1.7 million members and includes multiple legacy systems spanning four separate regional organizations.

Analytical Data Preparation 101 – Check out our whitepaper featuring Philip Russom, Senior Manager of TDWI Research, which addresses the key ideas in analytical data preparation:

  • Distinguish between data warehouses, data marts and analytics databases. Know the differences among them to determine which ones best suit your organizational needs and how best to optimize the repository for analytical purposes.
  • Design a data warehouse architecture that accommodates analytics. Should it be embedded in your enterprise data warehouse or stand alone? Consider the pros and cons of several architectural approaches.
  • Prepare data to meet the needs of the analytic method you have chosen. There are key factors to consider in preparing data for online analytical processing (OLAP) vs. query-based analytics vs. predictive analytics.
  • Preserve analytical data’s rich details, because they enable discovery. Raw and unstructured data can be a gold mine for details that reveal facts, relationships, clusters and anomalies.
  • Improve analytical data after working with it, not before. Standardizing and cleansing the data too much or too soon can inhibit the insights drawn from it.
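
To make the last two ideas concrete, here is a minimal Python sketch of one way to keep the raw detail while standardizing a field. It is an illustration only, not something taken from the whitepaper: the `Record` type, the `STATE_ALIASES` rules and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    raw_state: str        # value exactly as it arrived from the source system
    state: Optional[str]  # standardized value, or None if no rule matched


# Hypothetical standardization rules, keyed on a trimmed, lower-cased value.
STATE_ALIASES = {"ny": "NY", "new york": "NY", "n.y.": "NY"}


def prepare(raw_values: List[str]) -> List[Record]:
    """Standardize each value but keep the original next to it."""
    records = []
    for raw in raw_values:
        key = raw.strip().lower()
        records.append(Record(raw_state=raw, state=STATE_ALIASES.get(key)))
    return records


# Sample input invented for this example.
records = prepare(["NY", " new york ", "N.Y.", "Nw York"])

# Values that no rule matched stay visible instead of being dropped or
# overwritten, so an analyst can decide later what they mean.
unmatched = [r.raw_state for r in records if r.state is None]
```

Because the original value is never overwritten, the cleansing rules can be revised after the first round of analysis without going back to the source systems.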

If you are interested in starting a dialogue about your experience, or if you want to share best practices or lessons learned, please add a comment on this blog post or feel free to start a discussion thread.


About Author

Mark Troester

IT / CIO Thought Leader & Strategist

Mark Troester is the IT / CIO Thought Leader & Strategist for SAS. He oversees the company’s market strategy efforts for information management and for the overall CIO and IT vision. He began his career in IT and has worked in product management and product marketing for a number of Silicon Valley start-ups and established software companies. Twitter @mtroester
