Forecasting new products (Part 4): Query, filter, and cluster

The Query step begins by selecting like-items based on the appropriate product attributes, then reviewing historical sales of past new product introductions.

Continuing with the DVD example, suppose the new release is an R-rated horror movie. For like-items, we would query our database and pull the history of all prior DVD releases with the two attributes: Genre = "Horror" and MPAA Rating = "R".

Note that we are using judgment to determine which attributes are most relevant; the system doesn't tell us this. But the system does automate the work of extracting R-rated horror movies from the data set of all previous DVD releases and aligning their history on a common timeline (weeks after release). These extracted items form our pool of candidate products.
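To make the Query step concrete, here is a minimal sketch of the extraction and alignment, assuming the release catalog and weekly sales live in simple Python structures. The titles, numbers, field names, and the `query_candidates` helper are all hypothetical illustrations, not the actual system's interface.

```python
from collections import defaultdict

# Hypothetical catalog and sales data; titles and numbers are invented.
catalog = {
    "Title A": {"genre": "Horror", "rating": "R", "release_week": 10},
    "Title B": {"genre": "Horror", "rating": "R", "release_week": 31},
    "Title C": {"genre": "Comedy", "rating": "PG", "release_week": 12},
}
# Each record is (title, calendar_week, units_sold).
sales = [
    ("Title A", 10, 500), ("Title A", 11, 300), ("Title A", 12, 150),
    ("Title B", 31, 800), ("Title B", 32, 400), ("Title B", 33, 250),
    ("Title C", 12, 900), ("Title C", 13, 700),
]

def query_candidates(catalog, sales, **attributes):
    """Return {title: {weeks_after_release: units}} for matching items."""
    # Keep only the items whose attributes all match the query.
    matches = {
        title for title, attrs in catalog.items()
        if all(attrs.get(key) == value for key, value in attributes.items())
    }
    aligned = defaultdict(dict)
    for title, week, units in sales:
        if title in matches:
            # Week 1 is the release week, so all titles share one timeline.
            offset = week - catalog[title]["release_week"] + 1
            aligned[title][offset] = units
    return dict(aligned)

candidates = query_candidates(catalog, sales, genre="Horror", rating="R")
print(candidates)
```

The choice of attributes passed to the query (here genre and rating) is still the analyst's call; the code only does the mechanical extraction and re-indexing.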

The output from the Query step is a graphical overlay of the weekly sales of all candidate products (shown below for the first 20 weeks after release).

Judgment again comes into play in the Filter step. The graph of candidate products reveals a peculiar bump in week 5 sales for the movie Dawn of the Dead. If we decide that Dawn of the Dead is an outlier, we can filter it out of the candidate pool.

The Filter step lets us remove candidates we judge to be inappropriate, for whatever reason. For example, if the week 5 bump can be explained by a special promotion we ran for Dawn of the Dead, one that we don't plan to run for the new release, then it may make sense to remove it.
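The filtering itself is a judgment call, but a small helper can point the analyst at series worth a second look. The sketch below is one possible approach, not the system's method: the `find_bumps` function, its 25% threshold, and the sample data are all assumptions made for illustration.

```python
def find_bumps(series, min_increase=0.25):
    """Return weeks where sales rose more than min_increase over the prior week."""
    weeks = sorted(series)
    return [
        week for prev, week in zip(weeks, weeks[1:])
        if series[prev] > 0 and series[week] / series[prev] - 1 > min_increase
    ]

# Hypothetical candidate pool, keyed by weeks after release.
candidates = {
    "Title A": {1: 500, 2: 300, 3: 200, 4: 150, 5: 120},
    "Title B": {1: 800, 2: 400, 3: 250, 4: 180, 5: 420},  # late bump
}

# Flag unusual weeks for review; this does not remove anything by itself.
flags = {title: find_bumps(series) for title, series in candidates.items()}
print(flags)

# The analyst decides what to exclude, e.g. because of a one-off promotion.
excluded = {"Title B"}
filtered = {t: s for t, s in candidates.items() if t not in excluded}
print(sorted(filtered))
```

Keeping the flagging separate from the exclusion mirrors the process described above: the data suggests a candidate for removal, and a person confirms whether there is a business reason to drop it.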

After filtering is applied, the Cluster step groups the remaining candidates according to the similarity of their sales patterns. Below is one of the three main clusters that were formed (with the vertical axis showing % of sales rather than units sold).
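As an illustration of clustering on sales pattern rather than volume, the sketch below converts each series to its share of total sales and then groups similar profiles. The clustering algorithm the system uses is not stated here, so this substitutes a simple greedy threshold grouping as a stand-in; the `max_distance` value and the sample data are assumptions.

```python
import math

def to_shares(series):
    """Convert weekly units to each week's share of the title's total sales."""
    total = sum(series.values())
    return [series[week] / total for week in sorted(series)]

def cluster_profiles(profiles, max_distance=0.1):
    """Greedy grouping: join the first cluster whose leader is close enough."""
    clusters = []  # each cluster is a list of titles; the first is the leader
    for title, profile in profiles.items():
        for cluster in clusters:
            leader = profiles[cluster[0]]
            if math.dist(profile, leader) <= max_distance:
                cluster.append(title)
                break
        else:
            clusters.append([title])  # no close cluster found; start a new one
    return clusters

# Hypothetical filtered candidates; all series must cover the same weeks.
filtered = {
    "Title A": {1: 500, 2: 300, 3: 200},    # front-loaded
    "Title B": {1: 1000, 2: 600, 3: 400},   # same shape, double the volume
    "Title C": {1: 300, 2: 350, 3: 350},    # flat
}

profiles = {title: to_shares(series) for title, series in filtered.items()}
clusters = cluster_profiles(profiles)
print(clusters)
```

Working in shares is what lets Title A and Title B land in the same group despite very different unit volumes, which matches the "% of sales" axis used in the cluster chart.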

Note that the system does not tell us which cluster is most appropriate for forecasting the new product (or whether we should cluster at all or just use every candidate). This step simply provides the option of reviewing clusters, and again using judgment. Whatever we decide to do (use a particular cluster, or instead use all the candidates), we refer to these as our surrogate products.

Next time we'll examine how you might model and generate the new product forecast from your surrogates.
