Just as we all eagerly awaited the announcement of the $1 million prize winner of the Pillsbury Bake-Off®, every forecasting software vendor has endured the “bake-off” hosted by organizations in the market for new forecasting software.
Software selection teams use a bake-off to help evaluate competing vendors. Vendors are given a sample of the organization’s historical data and asked to generate forecasts. Sometimes the most recent data is withheld, and the forecasts are made over the withheld periods. A better (although more time-consuming) approach is to have the vendors forecast future periods, and then patiently wait and see the results as actuals roll in. The purpose of the bake-off is to gain insight into the forecasting performance to expect from each vendor.
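As a minimal sketch of the withheld-data setup, the most recent periods can be split off before the history is shared with vendors. The function name `split_holdout` and the demand figures are hypothetical, chosen only for illustration.

```python
def split_holdout(history, holdout_periods):
    """Split a demand series into the part shared with vendors
    and the most recent periods that are withheld for scoring."""
    if not 0 < holdout_periods < len(history):
        raise ValueError("holdout_periods must be between 1 and len(history) - 1")
    return history[:-holdout_periods], history[-holdout_periods:]


# Hypothetical monthly demand, for illustration only.
demand = [120, 135, 128, 140, 152, 149, 160, 158, 171, 165, 180, 176]

# Vendors receive `shared`; `withheld` stays with the selection team.
shared, withheld = split_holdout(demand, holdout_periods=3)
```

Vendor forecasts for the last three periods would then be scored against `withheld`. When vendors forecast genuinely future periods instead, no split is needed; the actuals simply arrive later.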
Rob Stevens, Analytical Consultant in SAS Global Professional Services, has suggested guidelines for running a fair and informative bake-off that emulates real-world circumstances. It is easy for a bake-off to be “fixed” in favor of a preferred vendor, so these guidelines should be followed closely:
• Provide all necessary information (including historical demand data and events).
• Provide sufficient history for vendors to be able to utilize holdout samples for evaluating their models.
• Give vendors sufficient time to analyze your data—don’t force them to take shortcuts that you would not take in real-life forecasting.
• Assist vendors with the domain experience they may lack regarding your business—perhaps by having a project team member assigned to support the vendors during the bake-off.
• Do not “fix” the bake-off with arbitrary rules that put favored vendors at an advantage. (Not only is this completely unethical, but it also prevents you from getting a fair assessment of each vendor’s performance.)
• Be aware that a bake-off is a one-shot event, and the results may be somewhat due to chance. Better forecasting systems evolve over time, incorporating learnings from each forecasting cycle.
• Use appropriate evaluation metrics rather than focusing only on error (such as MAPE) or accuracy. Also consider the bias in the competing forecasts, and the “value added” compared to using a naïve model.
• Be aware that simple models are often better at forecasting the future, even though they may not fit the past as well as more complex models. Access to a wide range of model choices is good, but fancier models will not guarantee better forecasts. Focus on the quality of the future forecasts, not on the model fit to history.
• Beware of organization politics that may contaminate the bake-off evaluation.
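The evaluation metrics mentioned in the guidelines can be sketched in a few lines. This is an illustrative example under stated assumptions, not a prescribed scoring method: the naïve model is taken to be "repeat the last observed value", bias is measured as mean forecast minus actual, value added is the naïve MAPE minus the vendor MAPE, and all numbers are made up.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent. Assumes no actual is zero."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)) / len(actuals)


def bias(actuals, forecasts):
    """Mean error (forecast minus actual); positive means over-forecasting."""
    return sum(f - a for a, f in zip(actuals, forecasts)) / len(actuals)


def naive_forecast(history, horizon):
    """Assumed naive model: repeat the last observed value for every period."""
    return [history[-1]] * horizon


def value_added(actuals, vendor_forecasts, history):
    """Naive MAPE minus vendor MAPE; positive means the vendor beat the naive model."""
    naive = naive_forecast(history, len(actuals))
    return mape(actuals, naive) - mape(actuals, vendor_forecasts)


# Hypothetical data, for illustration only.
history = [100, 110, 105, 120]   # shared with the vendor
actuals = [100, 125, 200]        # withheld (or later-observed) periods
vendor = [110, 100, 220]         # vendor's forecasts for those periods

vendor_mape = mape(actuals, vendor)
vendor_bias = bias(actuals, vendor)
vendor_fva = value_added(actuals, vendor, history)
```

Reporting all three numbers per vendor makes it harder for a low MAPE to hide a systematic over- or under-forecast, or for a sophisticated model to look good when it barely improves on the naïve forecast.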
Note that a bake-off is no substitute for a full-scale Proof-of-Concept/Proof-of-Value that demonstrates the software’s performance with the organization’s full data. A proper POC/POV can require a significant commitment of time and resources from the vendor, and the selection team should budget to pay for the service provided.