Continuation of Q&A from the September 19, 2018 ASA web lecture "Why Are Forecasts So Wrong? What Management Must Know About Forecasting."
Why Are Forecasts So Wrong? Q&A (Part 2)
Q: Should we make a distinction between business-as-usual forecasts and major-change forecasts, and do FVA for these separately? Management reaction can differ between them.
A: Segmenting your time series based on their characteristics is often helpful in both modeling and forecasting, and in evaluating forecast performance. Many companies have long-running, stable products that can be forecast quite accurately with simple methods (perfect for automatic forecasting software). Other products may have much more volatile demand due to promotions or other reasons.
When you do FVA analysis, the “stairstep” report is a good way to see the results. You can start with an overall report, aggregating results across the full product line. Then you can segment the data, perhaps reporting products by brand or category, by geography or distribution center, or even by the planner responsible for generating the forecasts. This can help identify areas that may need further investigation by management.
You can also segment all the way down to the individual time series. Particularly problematic or non-value-adding series can then be identified and the issues addressed.
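To make this concrete, here is a minimal sketch of a stairstep FVA report in Python. It assumes a pandas DataFrame with hypothetical columns actual, naive_fcst, stat_fcst, and final_fcst, plus a segment column (brand, category, planner, etc.); the column names and the choice of MAPE as the accuracy metric are illustrative, not a prescribed implementation.

```python
import numpy as np
import pandas as pd

def mape(actual, forecast):
    """Mean absolute percentage error, skipping zero-demand periods."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    mask = actual != 0
    return np.mean(np.abs((actual[mask] - forecast[mask]) / actual[mask])) * 100

def stairstep_report(df, group_col="segment"):
    """One row per segment: MAPE at each process step, and the value added."""
    rows = []
    for seg, g in df.groupby(group_col):
        naive = mape(g["actual"], g["naive_fcst"])   # naive benchmark
        stat = mape(g["actual"], g["stat_fcst"])     # statistical/system forecast
        final = mape(g["actual"], g["final_fcst"])   # planner-adjusted forecast
        rows.append({
            group_col: seg,
            "MAPE naive": naive,
            "MAPE statistical": stat,
            "MAPE final": final,
            "FVA stat vs naive": naive - stat,       # positive = value added
            "FVA final vs stat": stat - final,
        })
    return pd.DataFrame(rows)
```

Running stairstep_report on an overall grouping first, then on finer groupings (brand, DC, planner), reproduces the drill-down described above.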
Q: Is there any research that estimates the degree to which automated forecasting processes end up with over-fit models?
A: I’m not aware of such research, but this would be valuable information. If the automatic model selection is done properly, there should be protections against overfitting. For example, if there is sufficient history, a hold-out sample can be used to evaluate performance of the model over the hold-out period. There are also criteria like the Akaike Information Criterion (AIC), which balance goodness of fit against model complexity.
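As an illustration of both guards (not part of the original answer), the sketch below compares a few exponential smoothing specifications from statsmodels on AIC and on accuracy over a 12-month hold-out; the simulated data and the candidate specifications are made up for the example.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Simulated monthly demand, purely for illustration
rng = np.random.default_rng(0)
y = pd.Series(100 + rng.normal(0, 10, 60),
              index=pd.date_range("2014-01-01", periods=60, freq="MS"))

train, holdout = y[:-12], y[-12:]          # hold out the last 12 months

candidates = {
    "simple":         dict(trend=None,  seasonal=None),
    "trend":          dict(trend="add", seasonal=None),
    "trend+seasonal": dict(trend="add", seasonal="add", seasonal_periods=12),
}

for name, spec in candidates.items():
    fit = ExponentialSmoothing(train, **spec).fit()
    mae = (holdout - fit.forecast(12)).abs().mean()   # out-of-sample error
    print(f"{name:15s}  AIC={fit.aic:8.1f}  hold-out MAE={mae:6.2f}")
```

A more complex model that improves the in-sample fit but not the hold-out error (or that is penalized by AIC) is a candidate for overfitting.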
Q: I just want to confirm that moving averages would fall under the naive approach, correct?
A: Forecasts obtained with a minimal amount of effort and data manipulation and based solely on the most recent information available are referred to as “naïve” forecasts. (Definition taken from Makridakis et al., Forecasting: Methods and Applications, 3rd edition, 1998.) Computationally simple methods like a moving average or single exponential smoothing are suitable benchmarks for comparing the performance of more elaborate models and can be considered naïve (or at least “simple”) models.
However, I would suggest always comparing performance to the random walk (or seasonal random walk), as this lets you know what the simplest model would achieve.
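For reference, here is a minimal sketch of these benchmarks in Python, assuming a monthly pandas Series; the moving-average window and season length are illustrative choices.

```python
import numpy as np
import pandas as pd

def one_step_benchmarks(y, window=3, season=12):
    """One-step-ahead benchmark forecasts evaluated over the history of y."""
    fcst = pd.DataFrame(index=y.index)
    fcst["random_walk"] = y.shift(1)                      # naive: last observed value
    fcst["seasonal_rw"] = y.shift(season)                 # same period last year
    fcst["moving_avg"] = y.shift(1).rolling(window).mean()  # trailing moving average
    return fcst

# Simulated monthly series, purely for illustration
rng = np.random.default_rng(1)
y = pd.Series(100 + rng.normal(0, 5, 48),
              index=pd.date_range("2015-01-01", periods=48, freq="MS"))

fcst = one_step_benchmarks(y)
mae = fcst.sub(y, axis=0).abs().mean()                    # MAE of each benchmark
print(mae.round(2))
```

The random walk error is the floor against which any more elaborate model (or judgmental override) should be compared.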
Q: What are the top two things we as statisticians can do to improve forecasts?
A: Perhaps the first thing is to always remember that your job is to create the best possible forecasts, not just fit models to history. Along with this, remember that it is always better to start simple and only add complexity when it can be justified by improved forecasting performance. We cannot become enamored with our favorite pet models and apply them in all circumstances. The simplest approaches often work best, even if they aren’t new, exciting, or glamorous.
Second, recognize that composite models generally perform better than any single model. So rather than going to heroic efforts tuning and tweaking a single model, you may get better results (and with much less effort) by simply averaging the forecasts from several standard models. For example, in the recently completed M4 Forecasting Competition, only 17 of 50 entries performed at least as well as the composite benchmark, and 12 of those 17 were themselves combinations of mostly statistical approaches.
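As a hedged illustration of a simple combination (not the M4 benchmark itself), the sketch below averages the forecasts of three standard exponential smoothing models from statsmodels on made-up monthly data.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import SimpleExpSmoothing, Holt, ExponentialSmoothing

# Simulated monthly series with a mild trend, purely for illustration
rng = np.random.default_rng(2)
y = pd.Series(200 + 0.5 * np.arange(72) + rng.normal(0, 8, 72),
              index=pd.date_range("2013-01-01", periods=72, freq="MS"))
train, test = y[:-12], y[-12:]

# Three standard component models, each fit with default settings
forecasts = pd.DataFrame({
    "ses":  SimpleExpSmoothing(train).fit().forecast(12),
    "holt": Holt(train).fit().forecast(12),
    "hw":   ExponentialSmoothing(train, trend="add", seasonal="add",
                                 seasonal_periods=12).fit().forecast(12),
})
forecasts["combination"] = forecasts.mean(axis=1)   # simple unweighted average

mae = forecasts.sub(test, axis=0).abs().mean()
print(mae.round(2))
```

The unweighted average requires no tuning at all, which is part of why combinations are such an attractive default.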