As we saw last time with Steve Morlidge's analysis of the M3 data, forecasts produced by experts under controlled conditions with no difficult-to-forecast series still failed to beat a naive forecast 30% of the time.
So how bad could it be for real-life practitioners forecasting real-life industrial data?
In two words: Pretty bad.
The New Study
Morlidge's nine sample datasets covered 17,500 products, over an average of 29 (weekly or monthly) periods. For these real-life practitioners forecasting real-life data, 52% of forecasts had a relative absolute error (RAE) above 1.0, meaning they were less accurate than the naive forecast.
FIFTY-TWO PERCENT
As he puts it, "This result distressingly suggests that, on average, a company's product forecasts do not improve upon naive projections."
Morlidge also found that only 5% of the 17,500 products had RAEs below 0.5, which he has posited as a reasonable estimate of the practical lower limit for forecast error.
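If you want to see where your own forecasts fall against these thresholds, here is a minimal Python sketch. It assumes RAE is the total absolute error of the forecast divided by the total absolute error of a naive forecast (last period's actual) over the same periods. The function name `rae` and the sample numbers are made up for illustration and are not taken from Morlidge's datasets.

```python
def rae(actuals, forecasts):
    """Relative absolute error versus a one-step naive forecast.

    Assumes forecasts[t] is the forecast for actuals[t], and that the
    naive forecast for period t is simply actuals[t - 1]. The first
    period is skipped because it has no naive forecast.
    """
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    if len(actuals) < 2:
        raise ValueError("need at least two periods")
    forecast_err = sum(abs(a - f) for a, f in zip(actuals[1:], forecasts[1:]))
    naive_err = sum(abs(a - prev) for a, prev in zip(actuals[1:], actuals[:-1]))
    if naive_err == 0:
        raise ValueError("naive forecast has zero error; RAE is undefined")
    return forecast_err / naive_err


# Illustrative numbers only (not from the study)
actuals = [100, 120, 90, 110, 105]
forecasts = [100, 105, 115, 95, 108]

value = rae(actuals, forecasts)
if value > 1.0:
    verdict = "worse than the naive forecast"
elif value < 0.5:
    verdict = "below the proposed practical lower limit"
else:
    verdict = "better than naive, with room to improve"
print(f"RAE = {value:.2f}: {verdict}")
```

With these made-up numbers the forecast lands between the two thresholds: better than naive, but well short of the 0.5 mark.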
What are we to make of these findings, other than gnash our teeth and curse the day we ever got ourselves suckered into the forecasting profession? While Morlidge's approach continues to receive further vetting on a broader variety of datasets, he itemizes several immediate implications for the practical task of forecasting in the supply chain:
1. RAE of 0.5 is a reasonable approximation to the best forecast that can be achieved in practice.
2. Traditional metrics (e.g. MAPE) are not particularly helpful. They do not tell you whether the forecast has the potential to be improved. And a change in the metric may indicate a change in the volatility of the data, not so much a change in the level of performance.
3. Many forecasting methods add little value.
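The second point is easier to see with numbers. The Python sketch below uses a deliberately contrived forecaster that always lands halfway between the last actual and the outcome, so its skill relative to the naive forecast is identical on a calm series and a volatile one. Everything here (the function names, the two series, and the RAE definition as forecast absolute error over naive absolute error) is an assumption for illustration, not something from the study.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, skipping the first period."""
    pairs = list(zip(actuals[1:], forecasts[1:]))
    return 100 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def rae(actuals, forecasts):
    """Forecast absolute error relative to a naive (last actual) forecast."""
    f_err = sum(abs(a - f) for a, f in zip(actuals[1:], forecasts[1:]))
    n_err = sum(abs(a - p) for a, p in zip(actuals[1:], actuals[:-1]))
    return f_err / n_err


def halfway_forecasts(actuals):
    """Contrived forecaster: always halfway between last actual and outcome."""
    mids = [(p + a) / 2 for p, a in zip(actuals[:-1], actuals[1:])]
    return [actuals[0]] + mids


# Two made-up series: same pattern, different volatility
calm = [100, 110, 100, 110, 100]
volatile = [100, 150, 100, 150, 100]

for name, series in [("calm", calm), ("volatile", volatile)]:
    fc = halfway_forecasts(series)
    print(f"{name}: MAPE = {mape(series, fc):.1f}%, RAE = {rae(series, fc):.2f}")
```

RAE is 0.5 in both cases, while MAPE moves from roughly 4.8% to roughly 20.8%. Nothing about the forecaster changed; only the volatility of the data did.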
On the positive side, his findings show that there is significant opportunity for improvement in forecast quality. He found the weighted average RAE to be well above the lower bound for forecast error (RAE = 0.5). And roughly half of all forecasts were worse than the naive forecast -- error which should be avoidable.
Of course, we don't know in advance which forecasts will perform worse than the naive forecast. But by rigorous tracking of performance over time, we should be able to identify those that are problematic. And we should always track separately the "statistical forecast" (generated by the forecasting software) and the "final forecast" (after judgmental adjustments are made) -- a distinction that was not possible in the current study.
"...it is likely that the easiest way to make significant improvement is by eliminating poor forecasting rather than trying to optimise good forecasting." (p.31)
[You'll find a similar sentiment in this classic BFD post, "First, do no harm."]
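To make the tracking idea concrete, here is a minimal Python sketch of keeping the statistical and final forecasts side by side and flagging the problem items. The product names, the history, and the `rae` helper (forecast absolute error divided by naive absolute error) are all hypothetical; treat this as one possible way to set it up, not a prescribed method.

```python
RAE_NAIVE = 1.0  # an RAE above this means the forecast lost to the naive forecast


def rae(actuals, forecasts):
    """Forecast absolute error relative to a naive (last actual) forecast."""
    f_err = sum(abs(a - f) for a, f in zip(actuals[1:], forecasts[1:]))
    n_err = sum(abs(a - p) for a, p in zip(actuals[1:], actuals[:-1]))
    return f_err / n_err


# Hypothetical history: actuals, software ("stat") forecast, adjusted ("final") forecast
history = {
    "A": {
        "actuals": [100, 120, 90, 110, 105],
        "stat": [100, 105, 115, 95, 108],
        "final": [100, 130, 80, 120, 100],
    },
    "B": {
        "actuals": [50, 52, 49, 51, 50],
        "stat": [50, 51, 51, 50, 50],
        "final": [50, 56, 45, 55, 47],
    },
}

# RAE of each forecast stream, per product
report = {
    product: {
        "stat": rae(h["actuals"], h["stat"]),
        "final": rae(h["actuals"], h["final"]),
    }
    for product, h in history.items()
}

# Final forecasts that lost to the naive forecast
worse_than_naive = [p for p, r in report.items() if r["final"] > RAE_NAIVE]
# Products where judgmental adjustments made the statistical forecast worse
hurt_by_adjustment = [p for p, r in report.items() if r["final"] > r["stat"]]

for product, r in report.items():
    print(f"{product}: stat RAE = {r['stat']:.2f}, final RAE = {r['final']:.2f}")
print("Worse than naive:", worse_than_naive)
print("Hurt by adjustment:", hurt_by_adjustment)
```

In this made-up history, the adjustments help product A and hurt product B badly enough to push it past the naive forecast. That is exactly the kind of item to go after first.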
Hear More at Analytics 2014 in Frankfurt
Join over 500 of your peers at Analytics 2014, June 4-5 in Frankfurt, Germany. Morlidge will be presenting on "Forecasting Value Added and the Limits of Forecastability." Among the 40 presentations and four keynotes, there will also be forecasting sessions on:
- Forecasting at Telefonica Germany
- Promotion Forecasting for a Belgian Food Retailer, Delhaize
- The New Analytical Mindset in Forecasting: Nestle's Approach in Europe and a Case Study of Nestle Spain
- Big Data Analytics in the Energy Market: Customer Profiles from Smart Meter Data
- Shopping and Entertainment: All the Figures of Media Retail
In addition, my colleague Udo Sglavo will present "A New Face for SAS Analytical Clients" -- the forthcoming web interface for SAS Forecast Server.