The Spring 2014 issue of Foresight includes Steve Morlidge's latest article on the topic of forecastability and forecasting performance. He reports on sample data obtained from eight businesses operating in consumer (B2C) and industrial (B2B) markets. Before we look at these new results, let's review his previous arguments:
1. All extrapolative (time-series) methods are based on the assumption that the signal embedded in the data pattern will continue into the future. These methods thus seek to identify the signal and extrapolate it into the future.
2. Invariably, however, a signal is obscured by noise. A “perfect” forecast will match the signal 100% but, by definition, cannot forecast noise. So if we understand the nature of the relationship between the signal and noise in the past, we should be able to determine the limits of forecastability.
3. The most common naïve forecast uses the current period actual as the forecast of the next period. As such, the average forecast error from the naïve model captures the level of noise plus changes in the signal.
4. Thus the limit of forecastability can be expressed in terms of the ratio of the actual forecast error to the naïve forecast error. This ratio is generally termed a relative absolute error (RAE). I have also christened it the avoidability ratio, because it represents the portion of the noise in the data that is reduced by the forecasting method employed.
5. In the case of a perfectly flat signal -- that is, no trend or seasonality in the data -- the best forecast quality achievable is an RAE = 0.7. So unless the data have signals that can be captured, the best forecast accuracy achievable is a 30% reduction in noise from the naïve forecast.
6. An RAE = 1.0 should represent the worst forecast quality standard, since it says that the method chosen performed less accurately than a naïve forecast. In this circumstance, it might make sense to replace the method chosen with a naïve forecasting procedure. (pp. 26-27)
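To make the ratio in points 4 and 5 concrete, here is a minimal sketch in Python. It is not Morlidge's own calculation: the function names `mean_abs_error` and `rae` are hypothetical, mean absolute error is assumed as the error measure, and the noise is assumed to be independent and normally distributed around a flat level of 100. Under those assumptions, a forecast that matches the flat signal exactly should produce an RAE close to 1/sqrt(2), or roughly 0.7, which is the limit cited in point 5.

```python
import random


def mean_abs_error(actuals, forecasts):
    """Average absolute difference between actuals and forecasts."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def rae(actuals, forecasts):
    """Relative absolute error: method error divided by naive error.

    forecasts[t] is the forecast made for actuals[t]. The first period
    is skipped because the naive forecast (the prior period's actual)
    does not exist for it.
    """
    naive = actuals[:-1]  # prior-period actual used as the forecast
    method_err = mean_abs_error(actuals[1:], forecasts[1:])
    naive_err = mean_abs_error(actuals[1:], naive)
    return method_err / naive_err


# Assumed setup: flat signal (no trend, no seasonality) plus Gaussian noise.
random.seed(0)
level = 100.0
actuals = [level + random.gauss(0, 10) for _ in range(20000)]

# A "perfect" forecast matches the signal but cannot forecast the noise.
perfect = [level] * len(actuals)
flat_rae = rae(actuals, perfect)
print(round(flat_rae, 2))  # close to 0.71 for this kind of data

# Sanity check: scoring the naive forecast against itself gives exactly 1.0.
naive_as_method = [actuals[0]] + actuals[:-1]
naive_rae = rae(actuals, naive_as_method)
print(naive_rae)
```

The naïve forecast does worse here because its error contains two periods of noise (this period's and last period's), while the perfect forecast's error contains only one.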
The M3 Study
In his previous article (Foresight 32 (Winter 2014), 34-39) Morlidge applied this approach to a segment of data from the M3 forecasting competition that was most relevant to supply chain practitioners.
The M3 competition involved 24 forecasting methods from academics and software vendors, and the 334 time series that Morlidge analyzed included no difficult-to-forecast intermittent demand patterns or new products. Yet every one of the 24 forecasting methods generated RAEs above 1.0 more than 30% of the time. So nearly one-third of the time their performance was worse than a naïve model!
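The kind of tally behind that statistic can be sketched as follows. The RAE values here are made up purely for illustration and are not M3 results, and `share_worse_than_naive` is a hypothetical helper name. The sketch assumes one RAE has already been computed per time series for a single forecasting method.

```python
def share_worse_than_naive(rae_values):
    """Fraction of series where the method lost to the naive forecast (RAE > 1.0)."""
    return sum(1 for r in rae_values if r > 1.0) / len(rae_values)


# Illustrative, invented per-series RAEs for one method (not real data).
example_raes = [0.72, 0.85, 1.10, 0.95, 1.30, 0.80, 0.90, 0.78]

share = share_worse_than_naive(example_raes)
mean_rae = sum(example_raes) / len(example_raes)

print(f"share of series with RAE > 1.0: {share:.0%}")
print(f"mean RAE across series: {mean_rae:.3f}")
```

In this invented example the mean RAE is below 1.0 even though a quarter of the series were forecast worse than the naïve model, which is why looking only at the average can hide poor performance on individual series.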
Morlidge concluded that the average performance of any forecasting method may be less important than the distribution of its actual performance. He also emphasized that,
...we do not yet have the capability to identify the potential for poor forecasting before the event. It is therefore critical that actual forecast performance be routinely and rigorously measured after the event, and remedial action taken when it becomes clear that the level of performance is below expectations. (p. 28)
In the next installment we'll look at results from the new study.