We all have some sort of intuitive idea of what time series data is – it’s a bunch of measurements or observations that are marked by a time stamp – we know when the measurement was taken, as well as what was measured. This natural temporal ordering of the data is a vital component for statistical forecasting: we are searching for patterns that change systematically over time. In order to do a good job in forecasting we need to know about the structure of our data.
Ideally, the time series would look like this: well behaved, few missing values, ample history, a nice repeating pattern. This is the sort of data that can be fed to an automated forecasting environment such as SAS Forecast Server with good results. But life, like forecasting, isn’t always that neat. In almost every forecasting project I have been involved with, we faced all kinds of not-so-well-behaved series: slow-moving items (intermittent demand), new items (little or no history), end-of-life-cycle items (demand drops to 0), fashion items (short history), and items that are sold only during a particular time of the year (think Halloween costumes). Is it reasonable to assume that a one-size-fits-all approach will come up with good forecasts for all these different series? I don’t think so.
You have probably heard the old saying that 80% of the time for any analytical task is spent on data preparation. I would argue that decisions about how to structure and segment your time series data are as important as the forecasting methods themselves in creating the best possible forecast. Note that this goes beyond traditional tasks like missing value replacement, transformations and outlier detection. Each type of demand pattern described above requires a different modeling strategy, so it makes a lot of sense to separate the data into distinct types or segments. For example:
- Complete series – apply robust time series forecasting methods such as ARIMA models.
- Sparse data – apply intermittent demand forecasting methods such as Croston’s method, or treat the question as an inventory policy problem (instead of asking how much demand we are going to have, calculate the amount of stock needed to buffer against the risk of running out).
- Short data – forecast using similarity analysis techniques.
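To make the idea of segmenting concrete, here is a minimal sketch in plain Python. It is not how SAS Forecast Server or SAS Time Series Studio does it; the function names and both thresholds (`MIN_HISTORY`, `MAX_ZERO_SHARE`) are hypothetical and chosen only for illustration. It sorts each series into one of the three segments above by looking at the length of its history and the share of zero-demand periods.

```python
from typing import Dict, List, Optional

# Hypothetical thresholds for illustration; tune them for your own data.
MIN_HISTORY = 24        # fewer periods of history than this counts as "short"
MAX_ZERO_SHARE = 0.5    # a larger share of zero periods than this counts as "sparse"


def segment_series(values: List[Optional[float]]) -> str:
    """Assign one demand series to a segment: 'short', 'sparse' or 'complete'.

    `values` holds one observation per period, oldest first. None marks a
    period before the item existed, or a missing value.
    """
    # Drop leading None periods so the history starts at the first observation.
    start = 0
    while start < len(values) and values[start] is None:
        start += 1
    history = values[start:]

    if len(history) < MIN_HISTORY:
        return "short"

    # Assumption: a missing value inside the history is treated as zero demand.
    zeros = sum(1 for v in history if v is None or v == 0)
    if zeros / len(history) > MAX_ZERO_SHARE:
        return "sparse"
    return "complete"


def segment_all(series: Dict[str, List[Optional[float]]]) -> Dict[str, List[str]]:
    """Group series names by the segment they fall into."""
    segments: Dict[str, List[str]] = {"complete": [], "sparse": [], "short": []}
    for name, values in series.items():
        segments[segment_series(values)].append(name)
    return segments


if __name__ == "__main__":
    demo = {
        "staple": [10.0 + (t % 12) for t in range(36)],
        "spare_part": [0.0] * 30 + [3.0, 0.0, 0.0, 2.0, 0.0, 0.0],
        "new_item": [None] * 30 + [5.0, 7.0, 6.0, 8.0, 9.0, 7.0],
    }
    print(segment_all(demo))
```

Each resulting group can then be handed to the modeling strategy that suits it. In practice the cut-offs depend heavily on the business and the time interval, so treat them as a starting point rather than a rule.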
Other, more complex strategies can be applied: segment by seasonality pattern, segment by contribution to revenue, or segment by “forecastability”. For each segment, a separate SAS Forecast Server project can be created and a different modeling strategy applied. To support the forecasting analyst in this task, we have released SAS Time Series Studio, a new graphical user interface in SAS Forecast Studio 12.1.
SAS Time Series Studio is designed to:
- Explore time series data interactively
- Explore multiple time series simultaneously
- Explore the impact of setting up different hierarchies
- Explore the impact of different time intervals
- Segment data based on findings
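Why does the choice of time interval matter? The sketch below, again plain Python with hypothetical helper names (`aggregate`, `zero_share`) and made-up demo data, sums the same daily demand into weekly and monthly buckets and compares the share of zero-demand periods at each level. It only illustrates the kind of question you would explore interactively; it says nothing about how SAS Time Series Studio works internally.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, Iterable, List, Tuple


def aggregate(daily: List[Tuple[date, float]], interval: str) -> Dict[Tuple[int, int], float]:
    """Sum daily demand into 'week' (ISO year, ISO week) or 'month' (year, month) buckets."""
    totals: Dict[Tuple[int, int], float] = defaultdict(float)
    for day, qty in daily:
        if interval == "week":
            iso = day.isocalendar()
            key = (iso[0], iso[1])
        elif interval == "month":
            key = (day.year, day.month)
        else:
            raise ValueError(f"unsupported interval: {interval}")
        totals[key] += qty
    return dict(totals)


def zero_share(values: Iterable[float]) -> float:
    """Fraction of periods with zero demand (0.0 for an empty series)."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if v == 0) / len(values)


if __name__ == "__main__":
    # Made-up demo item: one unit sold every tenth day during 2012.
    start = date(2012, 1, 1)
    daily = [(start + timedelta(days=i), 1.0 if i % 10 == 0 else 0.0) for i in range(366)]

    print("daily  ", round(zero_share(q for _, q in daily), 2))
    print("weekly ", round(zero_share(aggregate(daily, "week").values()), 2))
    print("monthly", round(zero_share(aggregate(daily, "month").values()), 2))
```

A series that looks intermittent at the daily level can become a complete series once it is aggregated to months, which in turn changes the segment it belongs to and the forecasting method you would choose.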
My colleague Meredith John and I had the pleasure of introducing SAS Time Series Studio at A2012 in Las Vegas. In our presentation we discussed the importance of understanding the structure of time series data for generating forecasts. I encourage users of SAS Forecast Server (and those who will be) to check out SAS Time Series Studio themselves and share their experiences with us. In fact, you might tweet your comments using #sastss.