The Perils Revisited
A few posts ago I warned of the perils of forecasting benchmarks and explained why they should not be used to set your forecasting performance objectives:
- Can you trust the data?
- Is measurement consistent across the respondents?
- Is the comparison relevant?
In addition to a general suspicion about unaudited survey responses, my biggest concern is the relevance of such comparisons. If company A has smooth, stable, and easy-to-forecast demand, and company B has wild, erratic, and difficult-to-forecast demand, then the forecasters at these two companies should be held to different standards of performance. It makes no sense to hold them to some "industry benchmark" which may be trivial for company A to achieve, and impossible for B.
Perhaps the only reasonable standard is to compare an organization's forecasting performance against what a naive or other simple model could achieve with the same data. Thus, if a random walk model can forecast with a MAPE of 50%, then I should expect the organization's forecasting process to do no worse than this.
If the process consistently forecasted worse than a random walk, we know there must be something terribly wrong with it!
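To make the comparison concrete, here is a minimal sketch in Python. The demand history and the process forecasts are made-up numbers for illustration only, and the helper names (mape, random_walk_forecasts) are my own, not from any particular library. The random walk simply uses the prior period's actual as the forecast for the current period.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent. Assumes no actual is zero."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(errors) / len(errors)


def random_walk_forecasts(history):
    """Naive forecast: each period's forecast is the prior period's actual."""
    return history[:-1]


# Hypothetical demand history (6 periods) and hypothetical process
# forecasts for periods 2 through 6. These are illustrative values only.
history = [100, 120, 90, 110, 130, 100]
process_forecasts = [105, 110, 100, 115, 115]

# The first period has no prior actual, so evaluation starts at period 2.
actuals_eval = history[1:]

naive_mape = mape(actuals_eval, random_walk_forecasts(history))
process_mape = mape(actuals_eval, process_forecasts)

# The process is adding value only if it beats the naive model.
beats_naive = process_mape < naive_mape

print(f"Random walk MAPE: {naive_mape:.1f}%")
print(f"Process MAPE:     {process_mape:.1f}%")
print(f"Process beats naive model: {beats_naive}")
```

With these invented numbers the process beats the random walk, but the point is the comparison itself: the naive MAPE on your own data, not an industry figure, is the bar to clear.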
Benchmark Study on Forecasting Blog
One of the forecasting blogs I enjoy is the aptly named Forecasting Blog, published by Mark Chockalingam's Demand Planning, LLC. Last week it reported on results from a forecasting benchmark survey covering (among other things) the forecast error metric used, and forecast error results.
Unsurprisingly, they found that 67% of respondents used MAPE or weighted MAPE (WMAPE) as their error metric. Less commonly used error metrics were % of Forecasts Within +/- x% of Actuals, Forecast Bias, and Forecast Attainment (Actual/Forecast).
The blog also reported the Average of Forecasting Error by Industry (e.g., 39% in CPG and 36% in Chemicals). However, it was unclear how this average error was computed, and I suspect Peril #2 (Is measurement consistent across the respondents?) may apply here.
It is well known that the same data can give very different results even for metrics as similar-sounding as MAPE and WMAPE. If different companies are using different metrics to compute their forecast error, it is not clear how those numbers could be meaningfully combined into an industry average.
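A small sketch shows how far apart the two metrics can land on identical data. The two items and their numbers are invented for illustration. WMAPE has more than one definition in practice; this sketch assumes the common volume-weighted form, total absolute error divided by total actuals.

```python
def mape(actuals, forecasts):
    """Unweighted mean of per-item absolute percentage errors, in percent."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(errors) / len(errors)


def wmape(actuals, forecasts):
    """Volume-weighted MAPE (assumed definition): sum|A - F| / sum|A|, in percent."""
    total_abs_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    total_actual = sum(abs(a) for a in actuals)
    return 100.0 * total_abs_error / total_actual


# Hypothetical data: one low-volume item forecast badly,
# one high-volume item forecast well.
actuals = [10, 1000]
forecasts = [20, 950]

mape_result = mape(actuals, forecasts)
wmape_result = wmape(actuals, forecasts)

print(f"MAPE:  {mape_result:.1f}%")   # 52.5%
print(f"WMAPE: {wmape_result:.1f}%")  # 5.9%
```

MAPE treats both items equally, so the 100% miss on the small item dominates. WMAPE weights by volume, so the large, well-forecast item dominates. A company reporting one of these numbers and a company reporting the other are not reporting comparable things.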