"The Role of Model Interpretability in Data Science" is a recent post on Medium.com by Carl Anderson, Director of Data Science at the fashion eyeware company Warby Parker. Anderson argues that data scientists should be willing to make small sacrifices in model quality in order to deliver a model that is easier to interpret and explain, and is therefore more acceptable to management.
Can we make this same argument in business forecasting?
What is Meant by Model Interpretability?
A model that is readily understood by humans is said to be interpretable.
An example is a forecast based on the 10-week moving average of sales. Management may not agree with the forecast that is produced, but at least everyone understands how it was calculated. And they may make forecast overrides based on their belief that next week's sales will be higher (or lower) than the recent 10-week average.
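The moving average forecast described above can be sketched in a few lines. This is a hypothetical illustration, not code from Anderson's post; the `sales` series is made-up data.

```python
# Made-up weekly sales history for illustration.
sales = [120, 135, 128, 140, 150, 145, 138, 142, 155, 160, 148, 152]

def moving_average_forecast(history, window=10):
    """Forecast next week's sales as the mean of the last `window` weeks."""
    recent = history[-window:]
    return sum(recent) / len(recent)

print(f"Next week's forecast: {moving_average_forecast(sales):.1f}")
```

The appeal is exactly what the text says: anyone can verify the number by averaging the last ten weeks themselves, so an override ("next week will run hotter than the recent average") is an informed disagreement rather than distrust of a black box.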
A trend line model is also easily understood by management. While the mathematical calculations are more complicated, everyone understands the basic concept that "the current growth (or decline) trend will continue."
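A trend line model is just a least-squares line fit to the time index, extrapolated forward. A minimal sketch (again hypothetical, using only the standard library):

```python
def trend_line_forecast(history, horizon=1):
    """Fit y = a + b*t by ordinary least squares, then extrapolate
    `horizon` periods beyond the end of the history."""
    n = len(history)
    t = range(n)
    t_mean = sum(t) / n
    y_mean = sum(history) / n
    # Slope and intercept of the least-squares trend line.
    b = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, history)) \
        / sum((ti - t_mean) ** 2 for ti in t)
    a = y_mean - b * t_mean
    return a + b * (n - 1 + horizon)

# A steadily growing series: the trend line simply continues the growth.
print(trend_line_forecast([100, 110, 120, 130, 140]))  # 150.0
```

The mathematics (least squares) is more involved than an average, but the message to management is one sentence: the line through the history is extended one period forward.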
Anderson observes that an interpretable model is not necessarily less complex than an uninterpretable model.
He uses the example of a principal components model he created to predict new product demand. But principal components are not readily explainable to his business audience. So he ended up creating a more complex model (i.e., one with more variables) that had no additional predictive power, but could be understood by his users.
...the idea of my model is to serve as an additional voice to help them [demand planners] make their decisions. However, they need to trust and understand it, and therein lies the rub. They don’t know what principal component means. It is a very abstract concept. I can’t point to a pair of glasses and show them what it represents because it doesn’t exist like that. However, by restricting the model to actual physical features, features they know very well, they could indeed understand and trust the model. This final model had a very similar prediction error profile — i.e., the model was basically just as good — and it yielded some surprising insights for them.
Increasing model complexity (for no improvement in model performance) is antithetical to everything we are taught about doing science. Whether referred to as Occam's Razor, the Principle of Parsimony, or something else, there is always a strong preference for simpler models.
But Anderson makes a very good point -- a point particularly well taken for business forecasters. We tend to work in a highly politicized environment. Management already has an inclination to override our model-generated forecasts with whatever they please. If our numbers are coming out of an inscrutable black box, they may be even less inclined to trust our work.
Anderson concludes with reasons why a poorer/more complex, but interpretable, model may be favored:
- Interpretable models can be understood by business decision makers, making them more likely to be trusted and used.
- Interpretable models may yield insights.
- As interpretable models build trust in the model builder, this may allow more sophisticated approaches in the future.
- As long as the interpretable model performs similarly enough to the better (but uninterpretable) model, you aren't losing much.
While this may not hold in all areas of predictive analytics, a curious fact about forecasting is that simpler models tend to perform better. Even a model as simple as the naïve "no change" model outperformed half of the forecasts in a study by Steve Morlidge reported in Foresight.
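The naïve "no change" benchmark is trivial to state: the forecast for each period is simply the previous period's actual. A hedged sketch of the kind of comparison Morlidge's study implies, using invented numbers for both the actuals and a hypothetical judgmental forecast:

```python
def naive_forecasts(history):
    """The naive 'no change' forecast: each period's forecast is the
    previous period's actual."""
    return history[:-1]

def mean_absolute_error(actuals, forecasts):
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)

# Made-up weekly actuals and a made-up judgmental forecast for weeks 2-6.
actuals  = [100, 104, 98, 107, 111, 105]
judgment = [102, 95, 115, 99, 120]

naive = naive_forecasts(actuals)  # forecasts for weeks 2-6
mae_naive    = mean_absolute_error(actuals[1:], naive)
mae_judgment = mean_absolute_error(actuals[1:], judgment)
print(mae_naive, mae_judgment)  # in this invented example, naive wins
```

If a process's forecasts cannot beat this benchmark on average, the overrides and adjustments being applied are subtracting value, which is precisely the concern raised below.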
So thankfully, interpretability and performance are not always at odds. But it remains a challenge to keep participants from overriding the forecast with their biases and personal agendas, and just making it worse.