Brilliant forecasting article from 1957!!! (Part 3)

This article isn't brilliant because we learn something new from it -- we really don't. But it is amazing to find, from someone in 1957, such a clear discussion of forecasting issues that still plague us today. If you can get past some of the Mad Men era words and phrasing, the article is wonderfully written and a fun read -- full of sarcastic digs at forecasting practice.

In this final installment we'll look at Lorie's handling of forecasting performance evaluation.

Problem 2: The Evaluation of Forecasts

Lorie states there are two main problems in evaluating forecasts:

Determining accuracy.
Determining economic usefulness.

To solve these, he suggests three principles:

A. The Superiority of Written Forecasts

When forecasts are not recorded, the usual consequence is that they "seem to become more and more accurate as they recede into the past where memory is inexact and usually comforting." But even when written down, there is danger of ambiguity.

Lorie takes special aim at financial analysts and economic forecasters, who find it "distressingly easy" to use broad designations like "markets" or "business activity" or "sales." Of course, without a rigorous operational definition of such terms, the accuracy of the forecasts cannot be judged. "Their usefulness, however, can; their usefulness is negligible."

Lorie's position is largely in line with Nate Silver's recent critique of economic forecasting as an "almost complete failure."

In addition to recording forecasts in a way specific enough to be measured (typically product, location, time period, units), Lorie argues for recording the method used to generate the forecast:

"The absence of a record of the forecasting method makes it extremely difficult to judge what has been successful and what unsuccessful among the techniques for peering into the future."
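As a rough illustration of what such a written record might contain, here is a minimal Python sketch. The class name ForecastRecord, its field names, and the sample values are all hypothetical, not anything Lorie or a particular system prescribes. The record is frozen so that it cannot be quietly revised after the fact.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ForecastRecord:
    """One written forecast, specific enough to be scored later (hypothetical layout)."""
    product: str        # what is being forecast
    location: str       # where
    period_start: date  # first day of the period being forecast
    quantity: float     # the forecast itself
    unit: str           # unit of measure, so "sales" is not left ambiguous
    method: str         # how the forecast was produced
    recorded_on: date   # when the forecast was written down


# Made-up example values, for illustration only.
record = ForecastRecord(
    product="Widget A",
    location="Chicago DC",
    period_start=date(2024, 7, 1),
    quantity=1200.0,
    unit="cases",
    method="statistical forecast + analyst override",
    recorded_on=date(2024, 6, 1),
)
print(record)
```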

I will interpret "method" to mean, at a high level, the forecasting process that was used. For example:

STATISTICAL FORECAST ==> ANALYST OVERRIDE ==> CONSENSUS OVERRIDE

Over time we can determine whether these individual steps are making the forecast any better (or worse) than using a simple naïve model.

B. The Statistical Evaluation of Forecasting Techniques

Today there is growing recognition that relative metrics of forecasting performance are much more relevant and useful than the traditional accuracy or error metric by itself.

For example, to be told "MAPE=30%" is only mildly interesting. By itself, MAPE gives no indication of how easy or difficult a series is to forecast. It doesn't tell us what error would be reasonable to expect for the given series, and consequently, does not tell us whether our forecasting efforts were good or bad.

It is only by viewing the MAPE in comparison to some baseline of performance (e.g., the MAPE of a naïve forecast) that we can determine the "value added" by our forecasting efforts. This is what relative metrics such as FVA (Forecast Value Added) let you do.
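To make the comparison concrete, here is a minimal Python sketch of an FVA-style report. All numbers are made up, the step names follow the example process above, and the function names mape and fva_report are hypothetical helpers, not part of any library. MAPE is used only because it is the metric under discussion.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent (assumes no actual is zero)."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(errors) / len(errors)


def fva_report(actuals, steps):
    """Compare each process step with the naive baseline and the prior step.

    `steps` is an ordered list of (name, forecasts); the first entry is the
    naive forecast. Positive FVA means the step reduced MAPE.
    """
    report = []
    naive_mape = previous_mape = None
    for name, forecasts in steps:
        step_mape = mape(actuals, forecasts)
        if naive_mape is None:
            naive_mape = step_mape
            report.append((name, step_mape, None, None))
        else:
            report.append((name, step_mape,
                           naive_mape - step_mape,      # FVA vs. naive
                           previous_mape - step_mape))  # FVA vs. prior step
        previous_mape = step_mape
    return report


# Hypothetical data: four periods of actuals and the forecast from each step.
actuals = [100, 120, 80, 110]
steps = [
    ("naive (last actual)", [90, 100, 120, 80]),
    ("statistical forecast", [105, 110, 90, 105]),
    ("analyst override", [110, 115, 95, 100]),
    ("consensus override", [115, 125, 100, 120]),
]

report = fva_report(actuals, steps)
for name, step_mape, vs_naive, vs_prior in report:
    gains = "" if vs_naive is None else f"  vs naive {vs_naive:+.1f}  vs prior {vs_prior:+.1f}"
    print(f"{name:22s} MAPE {step_mape:5.1f}%{gains}")
```

In this made-up data the statistical forecast beats the naïve model, while each override is worse than the step before it. That is the kind of finding a bare MAPE cannot reveal.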

Lorie gives an example: Each day the weather forecaster in St. Petersburg, Florida, can forecast the following day's weather to be clear and sunny, and by doing nothing will be correct 95% of the time. The forecaster in Chicago, even using the latest technology and most sophisticated methods, will only get the following day's forecast right 80% of the time. So does this mean the St. Petersburg forecaster is more skilled at his profession than the Chicago forecaster? Of course not!

"If there is a point to the preceding example, it is that the statistical evaluation of forecasting techniques must take account of the variability of the series being forecast...the forecasting task in Chicago is much more difficult."

"What is desired is measurement of the 'marginal' contribution of the forecasting technique. What is desired is an indication of the extent to which one can forecast better because of the use of the forecasting technique than would be possible by sole reliance on some simple, cheap, and objective forecasting device."

Lorie has provided an almost perfect description of FVA analysis. In essence, it is nothing more than the application of basic scientific method to the evaluation of a forecasting process.
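Lorie's weather example can be put in numbers with a simple skill score, which measures how much of the possible improvement over a do-nothing baseline a forecaster actually achieves. This is a sketch, not Lorie's own calculation: the formula is a generic skill score, and the 60% baseline for Chicago is an assumed figure, since the example gives only the 95% and 80% hit rates.

```python
def skill_score(accuracy, baseline_accuracy):
    """Share of the possible improvement over the baseline that was achieved.

    0.0 means no better than the baseline; 1.0 means a perfect record.
    Assumes the baseline itself is not already perfect.
    """
    return (accuracy - baseline_accuracy) / (1.0 - baseline_accuracy)


# St. Petersburg: the forecaster's 95% is exactly what "always sunny" delivers.
st_petersburg = skill_score(accuracy=0.95, baseline_accuracy=0.95)

# Chicago: 80% from the example; the 60% naive baseline is an ASSUMPTION.
chicago = skill_score(accuracy=0.80, baseline_accuracy=0.60)

print(f"St. Petersburg skill: {st_petersburg:.2f}")
print(f"Chicago skill:        {chicago:.2f}")
```

Under that assumption, the St. Petersburg forecaster adds nothing over "always sunny," while the Chicago forecaster closes half the gap between the baseline and a perfect record.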

C. The Economic Evaluation of Forecasts

There can clearly be asymmetry in the costs of our business decisions. For example, it makes sense to carry excess inventory on an item which costs us little to make and hold, yet yields huge revenue when sold. (Carrying too little inventory might save us a little on cost, yet we'd miss a lot of revenue on lost sales.)
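The asymmetry can be shown with a small expected-profit calculation. The sketch below is illustrative only: the price, unit cost, and demand scenarios are invented, unsold units are assumed to be worth nothing, and the function name expected_profit is hypothetical.

```python
# Hypothetical item: cheap to make, lucrative to sell.
UNIT_COST = 1.0
UNIT_PRICE = 10.0

# Hypothetical demand scenarios as (units demanded, probability).
DEMAND_SCENARIOS = [(80, 0.25), (100, 0.50), (120, 0.25)]


def expected_profit(stock):
    """Expected profit from stocking `stock` units; unsold units are a total loss."""
    revenue = sum(p * UNIT_PRICE * min(stock, demand) for demand, p in DEMAND_SCENARIOS)
    return revenue - UNIT_COST * stock


# The demand estimate is a plain probability-weighted average.
expected_demand = sum(p * demand for demand, p in DEMAND_SCENARIOS)

# Try a handful of stock levels and keep the most profitable one.
candidates = range(80, 141, 10)
best_stock = max(candidates, key=expected_profit)

print(f"expected demand: {expected_demand:.0f} units")
for stock in candidates:
    print(f"stock {stock:3d} -> expected profit {expected_profit(stock):7.2f}")
print(f"most profitable stock level: {best_stock} units")
```

With these numbers the most profitable stock level sits above expected demand, even though the demand estimate itself has not been touched.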

Lorie asserts:

"A forecasting technique is to be judged to be superior to alternatives according to an economic evaluation if the consequences of decisions based upon it are more profitable than decisions based upon the alternatives."

This seems to be saying that it is OK to bias your forecasts in a direction that is more economically favorable, but I disagree. While it is appropriate to bias your plans and actions in a way that will provide a more favorable economic outcome (as in the example above), I would contend that the forecast should remain an "unbiased best guess" at what is really going to happen in the future.

I'm not convinced there can be an economic evaluation of the forecast. (Evaluating a forecast solely on accuracy, bias, and FVA may be sufficient.) However, there should be an economic evaluation of the decision that was made.
