The SAS model factory – a big data solution

Do you have too many models to build, too many to manage, too few analytic resources or too much data?  A Model Factory may be your answer.

The mindset of analytics is changing: from a “craftsman”-dominated culture, in which analysts spent weeks cycling through data to develop a single model, to a production-oriented environment in which analytically derived information follows the strategic conceptualization of ideas almost instantaneously.

Integrating the SAS Model Factory significantly accelerates this transformation.

The idea of a “Model Factory” may call to mind a mechanical age of smokestacks and assembly lines.  When Henry Ford revolutionized car making by introducing the assembly line, a process still used in auto manufacturing worldwide, he laid the foundation for the democratization of the car. The assembly line reduced the cost of making a car enough to put it within reach of a much larger audience.

What do we really mean by Model Factory?

A factory is a place where something is made or assembled quickly and in great quantities.

A model factory, by analogy, is a process in which predictive models are built automatically, quickly, and in great quantities, enabling an automated scoring process.

Why would you use a Model Factory?

  • You have limited technical or analytic resources.
  • You have too many models to build and manage, because you have several target variables or you segment your customers before modeling.
  • You have thousands of customer attributes and need to select only the subset appropriate for each model.
  • You need to perform repetitive data preparation, such as variable transformations and handling of missing values.
  • You have Big Data, which slows down model building and scoring.
  • In brief, you are unable to build models fast enough.

Can the model factory process be automated?

The model factory process consists of the following stages:

  • Model Initiation
  • Model Development
  • Model Deployment
  • Model Monitoring
  • Model Recalibration/Rebuild
  • Model Retirement

From a factory perspective, the process looks like this:

[Figure: SAS Model Factory]

If you choose to write a code-based Model Factory

You can use Base SAS and SAS/STAT with the High-Performance procedures to build hundreds or thousands of models automatically on as much data as you have.  With the needed code, your data will be structured properly, and transformations and missing values will be handled automatically.  Good-enough models will be built, and no analytical skills will be needed to run the process.
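To make the idea concrete, here is a minimal sketch of the pattern: one loop that handles missing values and fits a simple model for every customer segment. It is written in plain Python purely as an illustration and is not SAS code or the Model Factory itself. The function names, the column names (`segment`, `tenure`, `spend`), and the choice of mean imputation with a one-variable least-squares line are all assumptions made for the example.

```python
from statistics import mean

def impute_mean(values):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def build_models(rows, segment_key, input_name, target_name):
    """Fit one model per segment; returns {segment: (intercept, slope)}."""
    by_segment = {}
    for row in rows:
        by_segment.setdefault(row[segment_key], []).append(row)
    models = {}
    for segment, seg_rows in by_segment.items():
        x = impute_mean([r[input_name] for r in seg_rows])  # automatic missing-value handling
        y = [r[target_name] for r in seg_rows]
        models[segment] = fit_line(x, y)
    return models

# Hypothetical toy data: two segments, one missing input value.
rows = [
    {"segment": "A", "tenure": 1.0, "spend": 3.0},
    {"segment": "A", "tenure": 2.0, "spend": 5.0},
    {"segment": "A", "tenure": 3.0, "spend": 7.0},
    {"segment": "B", "tenure": 1.0, "spend": 10.0},
    {"segment": "B", "tenure": None, "spend": 8.0},
    {"segment": "B", "tenure": 3.0, "spend": 6.0},
]
models = build_models(rows, "segment", "tenure", "spend")
```

The same loop runs unchanged whether there are two segments or two thousand, which is the point of the factory approach: the analyst's effort goes into the loop once, not into each model.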

Model Factory Deployment

  • Run macro-driven code
  • Supply a parameter file, created by either:
      – Manual entry
      – Point-and-click entry
  • Code processes the parameter file and data
  • Code runs the analytic models
  • Model Factory code produces scoring code
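The steps above can be sketched as follows. This is an illustrative Python outline, not the SAS macro code the post refers to: the JSON layout of the parameter file, the field names (`targets`, `segments`, `inputs`), and both function names are hypothetical, and the coefficients are stand-ins for what a real model fit would produce.

```python
import json
from itertools import product

# Hypothetical parameter file content; the layout and field names are assumptions.
PARAMETER_TEXT = """
{
  "targets": ["churn", "upsell"],
  "segments": ["retail", "business"],
  "inputs": ["tenure", "monthly_spend"]
}
"""

def expand_jobs(parameter_text):
    """Turn one parameter file into one modeling job per target/segment pair."""
    params = json.loads(parameter_text)
    return [
        {"target": target, "segment": segment, "inputs": list(params["inputs"])}
        for target, segment in product(params["targets"], params["segments"])
    ]

def render_scoring_code(job, intercept, coefficients):
    """Emit a scoring expression, as text, for a fitted linear model."""
    terms = [repr(float(intercept))]
    for name in job["inputs"]:
        terms.append(f"{float(coefficients[name])!r} * {name}")
    return " + ".join(terms)

jobs = expand_jobs(PARAMETER_TEXT)

# Stand-in coefficients; a real run would estimate them from data.
scoring_code = render_scoring_code(
    jobs[0], 0.5, {"tenure": -0.1, "monthly_spend": 0.02}
)
```

Here two targets and two segments expand into four jobs, and each fitted model is turned into a piece of scoring code that can be deployed separately from the code that built it.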

SAS has other solutions for model building

If you have fewer models to build, or you have the analytic resources needed for model development, these point-and-click solutions may be sufficient:

  • Enterprise Miner
  • Rapid Predictive Modeler – run from Enterprise Guide

What can you do to build a Model Factory?

  • Take classes in Data Mining techniques
  • Read documents about data mining
  • Have internal working meetings to review goals and desired results
  • Engage consultants

In summary, we understand that you have experienced the chaos associated with building and maintaining a multitude of models.  The solution to your modeling problems may be the Model Factory Solution, which replaces the chaos with automation, efficiency, and repeatability.  For more information, you may contact the author.  For more on this topic, attend the SAS Model Factory pre-conference workshop at Analytics 2014 in Las Vegas on Sunday, October 19, 2014, 1-5 pm.

About Author

Darius Baer

Advisory Analytical Consultant

Darius Baer, Ph.D., is an Advisory Analytical Consultant with SAS Institute. He has over 34 years of SAS experience using statistical methods to solve executive-driven business problems for retail, pharmaceuticals, manufacturing, telecommunications, finance, government, and others. He currently facilitates developing and implementing strategies for solving business problems with SAS analytics, focusing on segmentation, predictive and descriptive modeling, forecasting, and optimization. Darius has spent the last fourteen years architecting and implementing consulting solutions. Darius holds a Ph.D. and M.A. in Behavioral Genetics as well as a B.A. in Mathematics from the University of Colorado in Boulder, Colorado.
