Ranking basketball teams, using Generalized Regression

The days are getting shorter, and the weather is getting cooler. That means that my favorite time of year is almost here: basketball season. Having spent most of my life on Tobacco Road, I'm not sure that I had any choice but to love playing and watching the game.

As a proud North Carolina State University alum, I’m a college basketball fan first but have always enjoyed the professional game as well. Having studied statistics almost as long as I’ve studied basketball, it’s very cool when I get the chance to combine the two. Now is a good time to look back at the 2014-2015 NBA season with the new season about to start.

The Data

I downloaded results from every game played in the 2014-2015 NBA season, including the playoffs. With these data (available on the File Exchange), I can use the Bradley-Terry model to come up with a strength measure for each team in the league. My colleague Ryan Lekivetz recently wrote a cool post about the Bradley-Terry model (and potato chips!). But just to reiterate, the Bradley-Terry model allows us to model the outcomes of competitions. For basketball games, the model looks like:

Pr( team \(j\) beats team \(k\)) = \(\frac{e^{\beta_j}}{e^{\beta_j} + e^{\beta_k}}\)

Rewriting the model slightly gives us:

Pr( team \(j\) beats team \(k\)) = \(1/(1 + e^{ - (\beta_j - \beta_k)}) \)

Now our model looks like a no-intercept logistic regression problem, and all we have to do is prepare indicator columns as Ryan described in his post. Ryan's post also demonstrates how to launch and fit the model in the Generalized Regression platform in JMP Pro. The estimated \(\beta_j\) give us a measure of strength for every team in the NBA. And since we are using every game played, it means that our model will naturally adjust for strength of schedule.
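For readers who want to play with this outside of JMP, here is a minimal sketch of the same fit in Python. The games are a made-up toy input, and the statsmodels GLM routine stands in for the Generalized Regression platform; since only differences in strength are identifiable, one team is dropped to act as the baseline:

    import numpy as np
    import statsmodels.api as sm

    teams = ["GSW", "LAC", "NYK"]  # toy example, not the real 30-team data
    # each game: (index of team j, index of team k, 1 if team j won)
    games = [(0, 1, 1), (1, 0, 1), (0, 2, 1), (2, 1, 1), (1, 2, 1), (2, 0, 1), (0, 1, 1)]

    X = np.zeros((len(games), len(teams)))
    y = np.zeros(len(games))
    for row, (j, k, win) in enumerate(games):
        X[row, j], X[row, k] = 1.0, -1.0  # indicator columns: +1 for team j, -1 for team k
        y[row] = win

    # no-intercept logistic regression; the dropped team is the reference with beta = 0
    fit = sm.GLM(y, X[:, :-1], family=sm.families.Binomial()).fit()
    strengths = dict(zip(teams[:-1], fit.params))  # estimated beta_j for each team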

The nice thing about the Bradley-Terry model is that we can also add covariates. So we don't have to look at just a team's overall strength; we can break it up into how strong a team is at home versus how strong it is on the road. Now we will have something that looks like:

Pr( team \(j\) beats team \(k\) on team \(j\)'s court) = \(1/(1+e^{-(\beta_j+\gamma_j-\beta_k)})\)

Pr( team \(j\) beats team \(k\) on team \(k\)'s court) = \(1/(1+e^{-(\beta_j-\beta_k-\gamma_k)}).\)

When we fit this new model, \(\beta_j\) tells us how strong team \(j\) is on the road, and \(\beta_j+\gamma_j\) tells us how strong team \(j\) is at home. The estimated \(\gamma\) essentially tells us how much of a boost each team gets when playing on their own court.

The Results

I used the Generalized Regression platform in JMP Pro to help us fit our model. I used Forward Selection for estimation since that will help us separate the good (\(\beta_j>0\)) and bad (\(\beta_j<0\)) teams from the average (\(\beta_j=0\)) teams.

So let’s get to it: Here are our top 10 teams when playing at home and our top 10 teams when playing on the road.


One thing stands out from our model: Golden State was clearly the best team regardless of where the game took place (so it’s no surprise that they won the NBA title). But the Warriors were especially dominant at home; their home strength is head and shoulders above the pack. Let’s look at an example of how we can use these strengths to predict win probabilities:

Pr( Golden State wins at home versus LA Clippers) = \(1/(1 + e^{-(2.795-1.096)})\)

\(= .85\).

So the Clippers were the best road team other than the Warriors, yet our model suggests that they only had about a 15% chance of winning at Golden State. The Warriors were an incredible 48-4 at home (39-2 in the regular season and 9-2 in the playoffs) and our model clearly reflects that.

And, of course, we have to look at the worst teams as well. This time, we don’t even have to differentiate between home and away. It turns out that our bottom teams struggled equally regardless of where the game took place.


This list shouldn’t surprise anyone who followed the NBA last year. But being on the list of worst teams does have a silver lining: It gives your team a better chance of improving in the draft. The Minnesota Timberwolves got to choose first in the 2015 draft and took Karl-Anthony Towns, who has all the makings of a superstar. Just out of curiosity, let's see what kind of chance the worst road team (the New York Knicks) has of winning at Golden State:

Pr( Knicks win at Warriors) = \(1/(1+e^{-(-1.354 - 2.795)})\)

\(= .015.\)

So the Knicks had less than a 2% chance of winning at Golden State, which makes for an even tougher trip across the country.

Finally, there is one more set of teams that caught my eye: the teams that were strong at home but average on the road.

Portland really sticks out to me: It's the fourth-strongest team at home, yet average on the road. I find it interesting that three of these four teams are from the Western Conference, which is generally accepted as being much stronger than the Eastern Conference. Road games are especially tough when you are playing Western Conference teams like Golden State, San Antonio and Houston.

As a statistician and a basketball fan, I enjoyed looking at these data. It was especially nice to see that my two favorite teams, the Atlanta Hawks and Memphis Grizzlies, were near the top of the rankings. Hopefully, they can stay near the top throughout the upcoming season.


Recoding data to explore the popularity of Halloween costumes


The NRF survey asked what people will dress their dogs as for Halloween. Batman and Superman made the list of responses. (Photo courtesy of Kerrie Merullo Wilson)

With Halloween right around the corner, it's time to decide what costume to wear.

The National Retail Federation did a survey to find out the popular costumes this year, and I thought it would be fun to explore and visualize the results of that survey.

The survey asked three questions:

  • What will your costume be for Halloween this year?
  • What will your child/children dress up as for Halloween?
  • What will you dress up your pet as for Halloween?

I imported the data into JMP, and I found that the data required some tidying.

Fortunately, JMP 12 has a newly designed Recode feature, which allows you to clean up your data more efficiently.

At first, the imported data looked like this:


Notice the unwanted numerals in the Costume column, as well as the extra whitespace and missing values in the data table.

I recoded the Costume column to remove the extra characters by highlighting it and selecting Cols > Recode. I used the Trim Whitespace option from the red triangle menu to get rid of the whitespace before and after each value.

I also used the Filter search bar to search for any numbers that I didn’t want to include in my recoded data table. By entering “1” in the search bar, every value containing a “1” is grouped at the top. After deleting the unwanted characters in the New Values column, the old values and the new values are grouped together and appear shaded. When working with a lot of data, I use the Show only Grouped/Ungrouped check boxes to help control my view.


The Group Similar Values red-triangle menu option is also a nice way to organize and tidy data — especially when checking for consistency. There are values that appear multiple times in this data table, but they have different spacing or an extra letter (for example, “Batman” and “Bat man”). I wanted to find those values and recode them so they are consistent throughout the table. The Difference Ratio and Max Character Difference options automatically group values together that differ by just a couple characters (depending on your settings). This makes it easy to find mistakes or inconsistencies. I kept the default Difference Ratio value of .25, which grouped values that are at most 25% different — in other words, values that have at least a 75% character match are grouped together.
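JMP does this grouping for you, but the same idea is easy to mimic in a few lines of Python if you are curious about the mechanics. This sketch (with made-up values) uses the standard library's difflib and groups strings whose similarity ratio is at least 0.75, mirroring a Difference Ratio of 0.25:

    from difflib import SequenceMatcher

    values = ["Batman", "Bat man", " batman", "Witch", "witch ", "Star Wars Character"]

    def similarity(a, b):
        # compare after trimming whitespace and ignoring case
        return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

    groups = []
    for v in values:
        for g in groups:
            if similarity(v, g[0]) >= 0.75:  # at least a 75% character match
                g.append(v)
                break
        else:
            groups.append([v])

    # groups -> [['Batman', 'Bat man', ' batman'], ['Witch', 'witch '], ['Star Wars Character']]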


Here’s what the data looks like after grouping similar values:


Now I can easily see the grouped similar values. I edited each group so that every instance of a given value looks the same. For instance, I changed the new value of the “Star Wars Character” group so that each of the three instances have the same spacing. After making the appropriate changes, I selected Done > In place. This way, the values in the New Values column will replace the old values in the data table. To preserve the original data, save the changes made in Recode to a new column by selecting Done > New Column or Formula Column.

Here's what the data looks like in Graph Builder when analyzing costume by percent:


Note the rows containing “Other” have been excluded and hidden.

Also, notice that the plot above is cluttered and hard to analyze. I recoded the Costume column again to organize the costumes into categories in order to make it easier to find patterns in the data. Once the Costume column was in Recode, I selected the values to group and used the right-click option, Group To. Because there are many animal costumes, I grouped them and made an “Animals” category.


After grouping all the animals under “Animal (Cat, Dog, Lion, Tiger, etc.)”, I shortened the name of the category to “Animals”. I grouped the remaining costumes under categories such as “Superhero”, “Fantasy” and “Scary”. After recoding, I selected Done > Formula Column to preserve my original costume column. I named the new column “Categories”. You can view the formula by double-clicking on the column header.
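Outside of JMP, this Group To step amounts to applying a lookup table to the costume column. Here is a small pandas sketch with a hypothetical slice of the mapping (the real category assignments were made interactively in Recode):

    import pandas as pd

    df = pd.DataFrame({"Costume": ["Batman", "Cat", "Witch", "Princess", "Zombie"]})

    category_map = {  # hypothetical subset of the full mapping
        "Batman": "Superhero",
        "Cat": "Animals",
        "Witch": "Scary",
        "Zombie": "Scary",
        "Princess": "Fantasy",
    }
    # keep the original column and add a derived one, like Done > Formula Column
    df["Categories"] = df["Costume"].map(category_map).fillna("Other")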

Here’s what the table looks like with the Categories column:


The Analysis
Now that the Costumes are binned into categories, Graph Builder provides a more interpretable plot. In Graph Builder, I used the Percent column for the X variable and Categories for the Y variable. I grouped the data by Child, Adult and Dog, and ordered it by popularity. Here’s the result:


The above graph shows the percent of costume choices by category for each group (Adult, Child, Dog). I can see which costume category is the biggest hit among adults, kids, and dogs. It appears that the Fantasy category was the most popular for adults, the Animals category for kids, and the Object category for dogs.

To further analyze the data, I used the Local Data Filter to view just the adult costume choices. Here’s what the data looks like filtered by adults and sorted by popularity:


The Witch costume is the most popular among adults this year.

Now I’ll examine which specific costume is the most popular among kids and adults. Again using Graph Builder, I analyzed the proportion of people who chose a particular costume. Because the data did not include a count for dog costumes, I created a new column called Group containing only “Adult” and “Child” before I ran the analysis; the Dog rows were left as missing values. Here’s a mosaic plot to illustrate the results:


The vertical axis indicates what proportion of each costume's choosers falls into the Child or Adult group. The overall size of each bar indicates which costumes are popular for both kids and adults. The graph shows that about 75% of those who chose the witch costume were adults. About half of those who chose an animal costume were adults and half were kids. Some costume groups, such as Princess, are completely dominated by one group. In this case, kids make up the total proportion of people who chose princess as their costume.

For additional fun, I grouped the data by category using the Fit Y by X platform to create another mosaic plot. Here’s what the plot looks like:


I can see that only kids chose costumes from the Object category, and only adults chose costumes from the Occupation category. When comparing overall bar size, Fantasy, Scary and Superhero are the most popular categories for both adults and kids. I can observe the same findings in one of the bar graphs mentioned previously; however, the mosaic plot above makes it easier to compare costume popularity across the two groups.

I wish everyone a fun and safe Halloween! Look out for witches — apparently we can expect to see lots of them this year!

What will you be for Halloween?


How we can use technology to preserve biodiversity

You can try to hide your body, but you can’t hide your footprints. That's where data collection and technology can help.

Most of us live in a cosseted unreality, cocooned inside thermo-regulated cars and buildings. Our food and drink are on tap, and we experience little inconvenience beyond death and taxes.

It’s easy to forget that life as we know it would come to an abrupt halt without the millions of other species on earth toiling away behind the scenes, to provide our fresh air, water, soil, medicine and the contents of our fridges.

Given this, one might think we’d have been super-careful to balance the exploitation of our home planet to ensure a place for this biodiversity.

Yet evidence shows that we’re driving other species to extinction at rates 100 to 1,000 times the natural "background" rates (that is, the standard rate of extinction in earth’s history before humans became a primary contributor to extinctions).

So what’s to be done? 

We need a more complete understanding of the factors that threaten endangered species and the resources that they need to survive. This is still woefully incomplete, even for some of the best-studied species.

Our paper, Emerging Technologies to Conserve Biodiversity, just published in the journal Trends in Ecology and Evolution, examines the role that new technologies might play in redressing the balance – from ever-smaller individual tracking devices to a future of completely non-invasive approaches, such as sophisticated camera traps, spying on nature with high-resolution satellites and the deployment of drones.

But technology alone will not resolve the challenges we face, and our resources are limited.

The bigger question is this: How do we deploy technology to maximum effect? In our paper, we consider issues that require priority attention if technologies are going to achieve full potential, including the application of technology for critical issues, such as the illegal trade in wildlife, and building a much wider engagement of citizen scientists and tourists to help gather data. These groups, previously disenfranchised from conservation, could exponentially increase our knowledge through wider data collection.

WildTrack transferring technology from the lab to the field: Field biologists from Southern Africa, Brazil, Australia, the UK and the USA gather in Namibia to learn how to image footprints.

The challenges of more data

Unprecedented volumes of data bring new challenges. This might be where we can learn from the corporate world. Companies such as SAS have considerable expertise in big data management and visualization across sectors.

We can employ algorithms to select data that matter. Data collection becomes quicker and easier as computational power increases and hardware size decreases. Our ability to predict outcomes improves with new probabilistic models.

It turns out that many potential solutions are synergies of components from disciplines outside conservation biology, such as engineering, statistics, mathematics, other sciences and business.

WildTrack and NCSU Mechanical and Aerospace Engineering are collaborating on a project to develop mobile and stationary UAV networks for conservation monitoring.

A last twist to this tale

New technologies have allowed us to quantify conservation value where we least expected it – in ancient techniques that developed as we humans evolved. For the last few decades, we’ve embraced invasive, bulky and costly tags and collars as state-of-the-art tools for monitoring endangered species.

Now with advances in digital imaging and statistical modeling, we’re able to distill the millennia-old art of animal tracking into a new technology for identifying species, individuals, sex and age-class from footprints: the footprint identification technique (FIT) using JMP software from SAS.

FIT in JMP for the Amur tiger: Using FIT, we were able to do the first small-scale footprint census of free-ranging Amur tigers in North East China.

FIT is an apt symbol of the revolution we need to effect. We must put together the big picture from the individual pieces of technology. We can foster global participation across cultures and disciplines, to create brave new combinations of tried and trusted tradition, with inspired innovation.

Until then, the bushmen, with their time-honored skills, might just have the last laugh.

Editor's note: A version of this blog post first appeared in SAS Voices.

Learn more about WildTrack's FIT application and how new technologies are helping with conservation.


The QbD Column: Mixture designs

Scientists in the pharmaceutical industry must often determine product formulations. The properties of a formulation, or mixture, are usually a function of the relative proportions of the ingredients rather than their absolute amounts. So, in experiments with mixtures, a factor's value is its proportion in the mixture, which must fall between zero and one. In some cases, tighter constraints are needed. For example, if at least 10% of component X1 is necessary to achieve a reaction, we must specify X1 ≥ 0.10. In all cases, the sum of the proportions in any mixture recipe is one.

Designs for mixture components are fundamentally different from other experiments. In most experiments, the setting of one factor varies independently of any other factor. Thus, for example, the design can guarantee that factors are orthogonal to one another. With mixtures, it is impossible to vary one factor independently of all the others. When you change the proportion of one ingredient, the proportion of one or more other ingredients must also change to compensate. This simple fact has a profound effect on every aspect of experimentation with mixtures: the factor space, the design properties, and the interpretation of the results.

JMP provides several tools for designing mixture experiments. The DOE platform includes a mixture designer that supports experiments in which all the factors are components in a mixture. It includes several classical mixture design approaches, such as simplex, extreme vertices, and lattice. For the extreme vertices approach it also incorporates linear inequality constraints that permit limiting the geometry of the mixture factor space to feasible experimental regions. Many mixture experiments also include process variables, such as the temperature or the mixing time. The JMP Custom Design tool is ideal for these more complex settings.
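To make the lattice idea concrete, here is a short Python sketch that enumerates a {q, m} simplex-lattice design, whose points are all mixtures with proportions that are multiples of 1/m. This is just an illustration of the classical construction, not JMP's implementation:

    from itertools import combinations_with_replacement

    def simplex_lattice(q, m):
        # all q-component mixtures whose proportions are multiples of 1/m
        points = []
        for combo in combinations_with_replacement(range(q), m):
            counts = [combo.count(i) for i in range(q)]  # integer composition summing to m
            points.append(tuple(c / m for c in counts))
        return points

    print(simplex_lattice(3, 2))
    # [(1.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5),
    #  (0.0, 1.0, 0.0), (0.0, 0.5, 0.5), (0.0, 0.0, 1.0)]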

What is a ternary plot?

Mixture designs have an interesting geometry. With three factors, the factor level combinations are presented in a ternary plot, which is a two-dimensional representation of three components. Figure 1 presents a ternary plot for three mixture combinations, each made from three components. The mixtures represented are:

  1. [0.1, 0.1, 0.8]
  2. [0.1, 0.8, 0.1]
  3. [0.8, 0.1, 0.1]

To read the component values off the plot, follow the lines parallel to each edge of the triangle to the scales giving the values of the components; these are the perpendicular projections of the point onto each axis. For example, the concentrations of the components for point 1 are 0.1 for X1, 0.1 for X2 and 0.8 for X3 (these values are marked as circles on Figure 1).

Figure 1: Ternary plot of mixtures with three components.

With more than three components, the feasible region for the factor settings in a mixture design takes the form of a simplex. There are a number of ways to set up ternary plots to show the mixtures used in the design. The default in JMP is to show plots for each two-component projection; the third axis in each plot shows the fraction of the mixture devoted to all the other components. Figure 2 shows such a set of plots for an experiment with four components. There is one plot for each of the six two-factor projections. The lower-right vertex is always labeled “Others” and corresponds to the combined amount of the other two components. The four mixtures shown are:

  1. [0.1, 0.1, 0.1, 0.7]
  2. [0.1, 0.7, 0.1, 0.1]
  3. [0.1, 0.1, 0.7, 0.1]
  4. [0.7, 0.1, 0.1, 0.1]


Figure 2: Ternary plots of mixtures with four components.

How do we characterize a mixture design space?

In order to build a design space for mixture data, we use an example from Wu et al. (2009)[1]. In this experiment, the focus is on the mixing of powders. Key questions for the blending operation include the following:

  1. How can we quantify the components of powder blends simultaneously?
  2. How can we validate or confirm the process analytic technology (PAT) blending process monitoring results via other fast and convenient spectroscopic methods?
  3. How can we link the scale of scrutiny to the homogeneity of both the API and the excipients?

Wu et al. used an extreme vertices design with four components to compute the formulation compositions, with the following constraints applied to the weight fractions of the corresponding formulation components: for Ibuprofen, 0.25 ≤ wt. fraction ≤ 0.75; for HPMC, 0.01 ≤ wt. fraction ≤ 0.03; for MCC, 0.19 ≤ wt. fraction ≤ 0.57; and for Eudragit L 100-55, 0.05 ≤ wt. fraction ≤ 0.15. The JMP Mixture Design platform includes a “linear constraint” option where it is easy to enter the above information. This option can also handle constraints that involve several factors simultaneously, a situation that arises in some mixture designs. The experimental array is presented in Table 1 below with the cells colored by experimental levels (a feature found in JMP column properties).
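To get a feel for the constrained region, here is a hedged Python sketch that samples candidate formulations satisfying these bounds. It draws the three tightly bounded components uniformly within their limits and lets MCC absorb the remainder, keeping only recipes where MCC also lands inside its bounds. This simply explores the feasible polytope; it is not the extreme vertices algorithm itself:

    import numpy as np

    rng = np.random.default_rng(2009)
    feasible = []
    while len(feasible) < 200:
        ibu = rng.uniform(0.25, 0.75)
        hpmc = rng.uniform(0.01, 0.03)
        eudragit = rng.uniform(0.05, 0.15)
        mcc = 1.0 - ibu - hpmc - eudragit  # the proportions must sum to one
        if 0.19 <= mcc <= 0.57:
            feasible.append((ibu, hpmc, mcc, eudragit))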

Table 1: Extreme vertices design of the four-component powder mixture experiment

The four responses and their ranges were:

  • Bulk density: 0.35-0.43 g/ml
  • Tap density: 0.4-0.6 g/ml
  • Mmin: 2.5-3.0 mg
  • Mmax: 13-15 mg

Figure 3: Ternary plots of powder mixture experiments.


Figure 4: Model used in analysis of powder mixture experiments.

Figure 5 is a table representing the analysis of Tap density. From the table, we determine that none of the quadratic effects has a significant effect on Tap density. We ran a second analysis, using only the main effects of the four mixture components. Figure 6 shows the results, with statistically significant effects for all four components.

Figure 5: Sorted parameter estimates on Tap density of model presented in Figure 4.


Figure 6: Sorted parameter estimates on Tap density from the first order model.

Prediction Profiler

The Profiler in JMP provides a quick visual summary of how the components affect all four responses (see Figure 7, from the first order model). The Profiler shows much stronger effects for Ibuprofen and MCC than for Eudragit and HPMC. This is because Ibuprofen and MCC have a much wider range than the other factors.

Figure 7: Profiler based on first order model used in analysis of powder mixture experiments.

We present a contour plot of the responses, based on the ternary representation of Ibu, MCC and Eudragit, in Figure 8.

Figure 8: Contour plot and design space based on model used in analysis of powder mixture experiments. The current point listed above is marked with a circle.

Using the stochastic emulator approach presented in the previous blog post, we derive the Profiler setup presented in Figure 9. In this implementation, the mixture constraints are not enforced, so we need to ensure that the simulations stay within the design space. In Figure 7, the constraints naturally satisfy this requirement. In other cases, JMP provides an option to perform such simulations.

Figure 9: Effect of variability of mixture components on model used in analysis of powder mixture experiments.

This analysis presents the impact of the variability of the components on the responses. With the setup of Ibu=0.5, MCC=0.38, Eudragit=0.1 and HPMC=0.02, and the variability structure described in Figure 8 (means at the setup points, and normally distributed variability with standard deviations determined by the experimental ranges), one gets an overall defect rate of 31%. The Mmax and Mmin responses generated by these simulations have respective means and standard deviations (in brackets) of 14.77 (0.24) and 2.54 (0.04). These two responses induced failure rates of 16% and 14%, respectively. In the simulation experiments used in Figure 9, the four factors (components) were sampled independently from their specific variability distributions. JMP also makes it possible to include a correlation structure between the sampled values.
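The simulation behind these numbers can be approximated outside of JMP along the following lines. The prediction function below is a placeholder for the fitted Mmax model, and the standard deviations are illustrative assumptions tied to the experimental ranges, not the study's actual values:

    import numpy as np

    def predict_mmax(x):
        # placeholder; the real coefficients come from the fitted mixture model
        return 14.0 + 3.0 * (x[:, 0] - 0.5) - 2.0 * (x[:, 2] - 0.38)

    rng = np.random.default_rng(0)
    setup = np.array([0.50, 0.02, 0.38, 0.10])  # Ibu, HPMC, MCC, Eudragit
    sd = np.array([0.02, 0.001, 0.02, 0.005])   # assumed component variability

    draws = rng.normal(setup, sd, size=(10_000, 4))
    draws /= draws.sum(axis=1, keepdims=True)   # renormalize so each draw sums to one

    mmax = predict_mmax(draws)
    defect_rate = np.mean((mmax < 13.0) | (mmax > 15.0))  # assuming 13-15 mg is the Mmax spec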

Optimizing on minimal defect rate, as shown in our previous blog post, will lead to a robust formulation around required target values.

Next in The QbD Column

The next blog post will be dedicated to sequential experimentation strategies, specifically considering a screening design followed by a central composite design.


[1] Wu, H., Tawakkul, M., White, M. and Khan, M. (2009). Quality-by-Design (QbD): An integrated multivariate approach for the component quantification in powder blends. International Journal of Pharmaceutics, 372, 39-48.

About the Authors

This blog post is brought to you by members of the KPA Group: Ron Kenett, Anat Reiner-Benaim and David Steinberg.

Ron Kenett

Anat Reiner-Benaim

David Steinberg


It's a dirty job, but somebody has to do it: Data cleaning with JMP

For anyone doing data analysis, it’s common to devote a large proportion of time to data cleaning. Real data is dirty data. Even if you are using automated collection tools, errors, anomalies and inconsistencies will still make their way into your data.

Over the next few blog posts, I’d like to show you some simple techniques for improving data integrity. I’ll talk about what to look for, and how to do it using JMP. In many cases, JMP has all the capabilities you need to perform the task. In a few cases, I’ve put together scripts to help with the cleaning. I’ll make these scripts available in the JMP User Community.

To put data cleaning into context, consider three general places where problems might occur. The first place you may run into issues is while building your data set. Missing delimiters, mismatched column names, and out-of-sync time stamps will make accessing, concatenating and joining data difficult. Once the data is in a JMP data set, you will run into a different set of problems: Incorrect data or modeling types, columns with no values, error codes and misspelled values are common. Finally, in the analysis process, you may find that you need to remove outliers, transform columns or aggregate variable levels.

Let’s start in the middle, the second area mentioned above. You’ve created your data set, but you need to clean it up before analysis. I suspect that this is where you’ll encounter most of your problems. To begin, we’ll focus on removing unneeded columns and getting the data and modeling types correct. Much of this can be accomplished with just the Columns Viewer.

To illustrate, I’ll use data I downloaded from the US Environmental Protection Agency’s website for automobile fuel economy. Seven separate Excel worksheets (one for each year in the data set) were downloaded and concatenated. You can find the data in the JMP User Community.

The first thing we can do is remove variables with little to no information. Columns where all values are the same (or missing) are ideal candidates. Using the Columns Viewer (under the Cols menu):

  1. Filter for only the categorical values by unchecking Continuous. (This is the same column editor that can be found in all platform dialog boxes.)
  2. Select all of the columns and click Show Summary.
  3. Right-click in the table under Summary Statistics and choose Sort by Column …
  4. Select N Categories and check the Ascending box.

Any columns you select in the summary table will be simultaneously selected in the parent data table. This makes many column operations easier, by allowing you to preselect columns with particular properties before performing the operations.

Figure 1. The Columns Viewer filter can be used to quickly identify data table columns with specific properties.


Figure 2. Columns selected in the Columns Viewer summary window are also selected in the data table.

Notice that the top three columns have all missing values and can be removed. Once you’ve selected them, Cols > Delete Columns will remove them from the data table. Note that the deleted columns will not be removed from the summary table, so it may make sense to rerun Show Summary (remember to reselect all the columns). You can tidy up the report by removing the old summary table (look under the hot spot next to the Summary Statistics title).

The next 10 columns have only one value. The decision to keep them depends on the proportion of missing values they have and whether you think their identification provides additional information. Two of the columns, Suppressed? and Police/Emerg?, have no missing values (i.e., all their rows are identical), so they can be safely removed. Seven of the remaining eight columns are mostly missing (>99.997%) and may be good candidates for elimination. The remaining column, Guzzler?, has about 5% non-missing values and could be worth keeping. (I’ll assume none of the columns are part of a variable that has been split across multiple columns. I’ll talk about finding and fixing this situation in a later post.)

In the single-category examples above, the majority of the data was missing. If, however, most of the data was nonmissing, a similar argument can be used to justify when to remove a column. If a column is made up of a single value, but has a large enough proportion of missing values, you may consider keeping it. Where those missing values are might be telling you something.
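If you ever need to do the same screening outside of JMP, the logic is a few lines of pandas. The file name here is hypothetical:

    import pandas as pd

    df = pd.read_csv("epa_fuel_economy.csv")  # hypothetical file name

    n_unique = df.nunique(dropna=True)  # distinct non-missing values per column
    frac_missing = df.isna().mean()

    all_missing = df.columns[n_unique == 0]  # nothing but missing values
    constant = df.columns[(n_unique == 1) & (frac_missing == 0)]  # identical in every row
    mostly_missing = df.columns[(n_unique == 1) & (frac_missing > 0.999)]

    df = df.drop(columns=all_missing.union(constant).union(mostly_missing))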

We can also remove multi-category columns with a high degree of missingness. To do this, right-click and sort the summary table by N Missing. Don’t check the Ascending box this time, since you only want to consider columns whose values are largely missing. Because these columns have more than one category, you may also want to examine them in more detail. To do this, select them in the summary table and click the Distribution button. As before, you’ll have to decide what proportion of nonmissing values justifies retaining the column.

When you are done with the categorical data, you can perform the same operations with the continuous data. In the filter pop-up dialog, reselect Continuous and deselect Ordinal and Nominal (remember that holding the Alt key before clicking the hot spot lets you make multiple selections). After displaying the summary, sort ascending by Std Dev (it’s on the far right). The columns where Std Dev equals zero only have a single value. There should be only one (Comb CO2 Rounded Adjusted). Sort by N Missing to find those columns whose values are mostly missing.

Once you’re done removing unneeded columns, you can move on to checking the data and modeling types. When JMP initially reads in data from other sources, columns containing only numbers are given the Numeric data type and Continuous modeling type. Otherwise, they are assigned the Character data type and Nominal modeling type. Occasionally, JMP will set a column to Numeric even when it has a few character values. If this happens, you will get a warning message like the one in Figure 3.

Figure 3. Warning message when character values are set to missing.

While the defaults in JMP work well in most cases, there are some situations you should know about. First, if a numeric column has extraneous text — “NA”s, error codes, symbols to flag unusual observations, etc. — the column may be read as Character/Nominal. Second, numeric values may correspond to coded values and not measured values. In these situations, you’ll want to treat the modeling type as Nominal. Finally, in some situations measured categorical values should be treated as ordinal. Fortunately, there are relatively few instances where there is a difference between modeling with Nominal data, and modeling with Ordinal data.

So we can keep working with the Columns Viewer; our next task will be finding categorical numeric columns. (Our other tasks will have to wait for the next post.)

To locate these columns:

  1. Create a summary of all the continuous columns in the Columns Viewer.
  2. Under the Summary Statistics hot spot, select Data Table View.
  3. Select all the Numeric columns and change their modeling type to Ordinal. I like to use Ordinal because most data will initially be Continuous or Nominal; making a column Ordinal lets me know that it's Numeric but that I'm considering making it categorical. Make sure all columns in the Columns View Selector are selected, and click Show Summary. If you have an empty Columns View Selector box, turn Ordinal back on in the Columns Viewer filter first.
  4. Right-click within the summary and choose Make into Data Table. Since you already have a Data View, you don’t need to create another.
  5. Update the table from Step 2 with the table from Step 4. Match by the column named Columns.
    • Select the table you created in Step 2 and go to Tables > Update.
    • Select the table you created in Step 4 from the list box and check the Match columns box.
    • Assuming you haven't changed any column names, select the column named Columns from each table and click Match. Your results will be similar to Figure 4.
    • Click OK.
  6. Sort by N Categories.

Figure 4. Using Update with matching columns.

As with the summary table in the report window, the resulting table is linked to the data table from which it was created. Because you’re using a Data View, you may need to deselect any columns that may have been previously selected in the parent data table. To find categorical variables, look for those columns that have a small number of categories and/or have a Min value close to 0.

While this approach to finding categorical numerical data is far from perfect, it is a quick and easy start. You can perform a simple check by selecting rows in the summary data table, returning to the parent table, then running Distribution on the selected columns. If you do this with columns having N Categories equal to 6 or less, you’ll notice that a few of the columns contain values with decimals. While it would have been nice to have excluded these columns from the start, Columns Viewer doesn’t currently provide for this, so we’ll do this with a bit of scripting.
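In the meantime, here is roughly what that check could look like in Python rather than JSL, continuing with the data frame from the sketch above: flag numeric columns that have few distinct values and contain no decimals.

    import pandas as pd  # df is the table from the earlier sketch

    numeric = df.select_dtypes("number")

    summary = pd.DataFrame({
        "n_categories": numeric.nunique(),
        "min": numeric.min(),
        # True when every non-missing value is a whole number
        "integer_valued": numeric.apply(lambda s: s.dropna().mod(1).eq(0).all()),
    })

    # likely categorical: few levels, whole numbers only
    candidates = summary[(summary["n_categories"] <= 6) & summary["integer_valued"]]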

We’ll talk about that and more next time.


Rocket science where you may not expect it

Fluid dynamics of peanut butter, biomechanics of diapers, aerodynamics of packaging potato chips, precision engineering to produce and perfectly perforate paper towels — rocket science is applicable to the manufacturing of these and many other products in ways that may surprise you.

Achieving desired product attributes can be challenging, especially when some of those traits would seem at odds with each other. Materials need to be strong but also soft, flexible but not breakable, breathable but containable, easy-to-open but not prone to leak, etc.

Tom Lange, a 37-year veteran of Procter & Gamble, will be the guest on Analytically Speaking on Oct. 14. As founder of the modeling and simulation group at Procter & Gamble, from which he recently retired, he has many amazing stories to tell about the relevance of rocket science to everyday products. His passion for problem-solving led to the formation of his consultancy, Technology Optimization & Management, LLC, where he's been busy with new applications of rocket science in unexpected places.

Need more reasons to join us? Tom knows a lot about how to create and sustain a culture where data analysis is an integral part of the daily workflow. Tom is both technical in his knowledge of applied modeling and simulation, as well as strategic in leveraging those applications for maximum impact.

To hear about Tom’s scientific approach to solving problems, please join us for the live webcast or watch the archived version later.


Potato chip smackdown: A generalized regression analysis

What if we analyzed the data using Generalized Regression? Would the results be any different from those we got using Choice Analysis?

At this year’s JMP Discovery Summit in San Diego, there were plenty of fantastic talks. Once the tasting portion of the potato chip smackdown was complete, one talk that stuck out for me was my colleague Clay Barker’s presentation on using the Generalized Regression personality in the Fit Model platform in creative ways. In particular, he showed how to fit a Bradley-Terry model to compare the relative strengths of basketball teams by looking at the win-loss performance for each pair of teams. After seeing Melinda’s analysis using the Choice Analysis platform, I thought it would be interesting to try the Generalized Regression platform and see how the results compare.

Data Preparation

To try this, I needed to set up a new data table where each pair of flavors compared in a choice set appears in a single row, with one flavor coded 1 and the other coded -1. Then I needed columns for the number of times each pair was compared and the number of times the flavor coded 1 was chosen over the flavor coded -1. Note that I didn’t need an additional row with the values of +/-1 reversed.

If you read the experimental setup, you may recall that we didn’t test the chips in pairs, but three at a time. To do this analysis, I split each choice set into the three pairs that occur within it; knowing the best and the worst of the three chips determines the winner and loser of every pair.

I found it easiest to set it up the same way that Clay did with the basketball data – set values of 1 for the first flavor, then put in values of -1 for one flavor at a time with the remaining flavors to the right, moving down the rows one at a time (for those of you who like to think of matrices, you get something that looks like a negative identity matrix). Just to give you an idea of what this looks like, the first few rows of my data table look like this:
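Here is a hedged Python sketch of that data preparation step, using a couple of made-up choice sets. Knowing the best and the worst of three flavors determines all three pairwise outcomes:

    from collections import Counter

    # each tasting: (flavors shown, best pick, worst pick) -- toy input
    tastings = [
        (["BBQ", "Ketchup", "Greektown Gyro"], "BBQ", "Greektown Gyro"),
        (["BBQ", "Dill Pickle", "Greektown Gyro"], "Dill Pickle", "Greektown Gyro"),
    ]

    times_chosen, times_compared = Counter(), Counter()
    for flavors, best, worst in tastings:
        middle = next(f for f in flavors if f not in (best, worst))
        # best beats worst, best beats middle, and middle beats worst
        for winner, loser in [(best, worst), (best, middle), (middle, worst)]:
            pair = tuple(sorted((winner, loser)))  # one row per pair, orientation fixed
            times_compared[pair] += 1
            times_chosen[pair] += int(winner == pair[0])  # wins for the flavor coded +1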


You can take a look at the full data table on the File Exchange. I didn’t create a nice script to collect the data in the format I wanted – it was a matter of using table summaries and manually inputting the times chosen and compared for each pair.

The Analysis

Now that I have the data table I wanted, it’s off to Fit Model. The times chosen and times compared are selected for the Y variables, and I add all the chip flavors as effects with No Intercept selected (based on the Bradley-Terry model). With a Generalized Regression and binomial distribution (we’re looking at the probability that one flavor gets picked over the other), I had:


I chose forward selection with AICc validation, although BIC gives the same results. The chosen model looks like this:


Or visually:


Using the pairwise Bradley-Terry model, BBQ and Southern BBQ come out on top, followed by Southern Biscuits and Gravy and Truffle Fries (note that these two switched positions from the Choice model, but the estimates are close). The remaining flavors were all grouped together, except for poor Greektown Gyro, which wasn’t very popular among the JMP group.

Final Thoughts

It wasn’t all that surprising to me that the results came out similar to the choice modeling Melinda used, but I thought it was interesting to use the generalized regression approach to see which flavors separate themselves from the pack. Alas, the Canadian flavors didn’t hold up, but at least they weren’t the worst.

I think there’s still more to explore: The Bradley-Terry model can be extended to comparisons of three at a time, and choice modeling may have more to tell us as well. This is likely to be revisited in a future blog post -- or perhaps a future Discovery Summit. Thanks for reading!


The QbD Column: Achieving robustness with stochastic emulators

In an earlier installment of The QbD Column, titled A QbD factorial experiment, we described a case study where the focus was on modeling the effect of three process factors on one response, viscosity. Here, we expand on that case study to show how to optimize the process parameters of a product by considering eight responses, taking into account both their target values and the variability in achieving those targets. We apply a stochastic emulator approach, first proposed by Bates et al. (2006), to achieve robust, on-target performance. This provides additional insights into the setup of product and process factors within a design space.

A case study with eight responses and three process factors

The case study refers to a formulation of a generic product designed to match the properties of an existing brand using in vitro tests. In vitro release is one of several standard methods used to characterize the performance of a finished topical dosage form (for details, see SUPAC, 1997).

The in vitro release testing apparatus has six cells where the tested generic product is compared to the brand product. A 90% confidence interval for the ratio of the median in vitro release rate in the generic and brand products is computed, and is expressed as a percentage. If the interval falls within the limits of 75% to 133.33%, the generic and brand products are considered equivalent.

The eight responses listed in the SUPAC standard that are considered in setting up the bioequivalence process design space are:

  1. Assay of active ingredient
  2. In vitro release rate lower confidence limit
  3. In vitro release rate upper confidence limit
  4. 90th percentile of particle size
  5. Assay of material A
  6. Assay of material B
  7. Viscosity
  8. pH values

Three process factors are considered:

  1. Temperature of reaction
  2. Blending time
  3. Cooling time

The experimental design consisted of a \(2^3\) factorial experiment with two center points. The experimental array and the eight responses are presented in Figure 1.

Figure 1: Experimental array and eight responses in QbD steroid lotion experiment.

Fitting a model with main effects and interactions to the three process factors

Figure 2 shows the model we fitted to the data. It included main effects for each of the three factors and all the two-factor interactions. We used the same model for all the responses.

Figure 2: Model used to fit the QbD experimental data on all eight responses.

Assessing operating conditions with the Profiler

The design was centered at the current “best guess” for operating conditions, with Temp=67.5, Blending Time=3.5 and Cooling Time=105. The fitted models at these values give an overall desirability index of 0.30 (for a description of desirability functions, see the second blog post in our QbD series, titled A QbD factorial experiment). Maximizing the desirability increases the Temperature to 75, reduces the Blending Time to 2 and sets the Cooling Time at 113.1, with a desirability of 0.54. Overall, our goal is to reduce defect rates.

If production capability allowed us to set the factor levels with high precision, the above solution would be a good one. However, there is always some uncertainty in production settings, with corresponding variation in process outputs. We now study the effect of this uncertainty, using computer simulations to reflect the variation in the design factors. We use the “simulator” option that is available with the Profiler in JMP.

The simulator opens input templates for each experimental factor and for each response, where we describe the nature of variation. For example, we set up the temperature to vary about its nominal value with a normal distribution and a standard deviation of 3 degrees. Describing variation in the outputs allows us to reflect the natural variation in output values about the expected value from the model. The standard deviations for the outputs could be the residual standard deviations from fitting the models. Clicking on the “simulate” key generates data for each output that is characteristic of typical results in regular production.

Figure 3 (below) shows simulated results based on our QbD experiment when all factors are set to their center values. The summary also shows that more than 80% of the simulated results do not meet process requirements and that these “out of spec” problems are all due to having an in vitro upper confidence limit that is too high.

The results in Figure 3 show that the center point is not a good operating condition and will have a high fraction of production out of spec. Moving the settings to maximize the desirability reduces the percent out of spec to 18%.

Using the simulator to find better operating conditions

We now take a further step and show how to use the simulator to find robust operating conditions that maintain high desirability and also reduce the percent out of spec. This can be done directly with the Profiler, moving the operating conditions with the slider bars and observing the results for desirability and percent defect. We present here a much more systematic approach based on a “computer experiment” to model the defect rate over the factor space.


Figure 3: Assessing production at the center point from the QbD steroid lotion experiment with the JMP Profiler and the simulator option.

Running experiments on computers

The simulator option in the JMP Profiler allows us to compare operating conditions via computer generated values that represent the natural production variation about target settings for the design factors. We continue our study, exploring how the factors will affect production by choosing a collection of points in the factor space at which to make these evaluations.

Such computer experiments, as opposed to physical experiments, typically use “space-filling” designs that spread out the design points in a more-or-less uniform fashion. Another characteristic of such experiments is that the data do not involve experimental error, since rerunning the analysis on the fitted model at the same set of design points will reproduce the same results.

In modeling data generated from computer experiments, different approaches are used, with Gaussian process models (also known as Kriging models) being the most popular. A key publication in this area is Sacks et al. (1989), which coined the term design and analysis of computer experiments (DACE).

Assessing the QbD study by a computer experiment

Our computer experiment for the formulation study used a 128-run space-filling design in the three experimental factors (the default design suggested by JMP). At each of these factor settings, the production variability is simulated, giving computer-generated responses as seen in Figure 3. Key indicators are used to summarize the responses at each setting. Then, we model the key indicators to see how they relate to the factor settings and to suggest optimal settings. Here, we focus our analysis on achieving high desirability and a low defect rate.

Optimizing desirability of performance and defect rates

Maximizing the overall desirability, as shown earlier, leads to a new setup of the process factors (Temp=75, Blending Time=2, Cooling Time=113.1) with an overall desirability of 0.54 and an overall defect rate of 0.18. A 128-point space-filling experiment on the model, with no added errors on the responses, produced the data shown in Figure 4.

Figure 4: Results from the space-filling simulated experiments using the model fit to the QbD steroid lotion experimental data (the stochastic emulator data).

The Gaussian process model automatically generated by JMP is a natural tool for analyzing the data, and it opens in the data table of the computer experiment outcomes. The fitted model provides an emulator of the defect rate derived from adding production noise to the factor settings of the original model that was fit to the data. Since smaller is better for a defect rate, optimizing this emulator model leads to a Profiler set at values minimizing the defect rate. To apply these proposed values to the Profiler of the original model shown in Figure 3, one needs to link the Profilers in the factor settings options. In Figure 5, we present the Profiler optimized on defect rate, using the Gaussian process model to fit the stochastic emulator data generated from the original QbD model.

Figure 5: Optimized Prediction Profiler (minimizing defect rate) of stochastic emulator data.

The stochastic emulator

The optimization and robustness analysis above follows the “stochastic emulator” approach proposed by Bates et al. (2006). The approach is useful whenever there is a model that can describe input-output relationships and a characterization of the sources of variability. The analysis can then be used to optimize a response by accounting for both its target values and its variability. The stochastic emulator is used to model the variability in the data and, combined with optimization of a model fit to the physical experiments, allows us to ensure, as best as possible, both on-target performance and minimal variability. The key steps of the stochastic emulator approach are as follows (a rough code sketch of these steps appears after the list):

  1. Begin with a model that relates the input factors to the system outputs. The model could be derived from the results of an initial laboratory experiment, as in our example here, or it could be derived on purely theoretical grounds from the basic science of the system.
  2. Characterize the uncertainty in the system. Describe how the input factors are expected to vary about their nominal process settings. The corresponding distributions are called noise distributions for the input factors. Describe the extent of output variation about the values computed from the model.
  3. Lay out an experimental design in the input factors that describes possible nominal settings. As noted earlier, space-filling designs are the popular choice here.
  4. Generate simulated data from the noise distributions at all the nominal settings in the space-filling design.
  5. Summarize the simulated data at each nominal setting by critical response variables (like desirability and defect rate in our study).
  6. Construct statistical models that relate critical response variables to the design factor settings using the Gaussian process model option in JMP.
  7. Optimize the choice of the factor settings for all critical outcomes. Here we want the process to have both on target performance and robustness (JMP allows us to do this by linking and optimizing Profilers).
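Here is a rough end-to-end sketch of steps 3 through 7 in Python. The response model is a made-up stand-in for the fitted QbD model, and the factor ranges, noise levels and specification limit are illustrative assumptions rather than the values from the study:

    import numpy as np
    from scipy.stats import qmc
    from sklearn.gaussian_process import GaussianProcessRegressor

    rng = np.random.default_rng(0)

    def in_vitro_upper_cl(temp, blend, cool):
        # hypothetical stand-in for the fitted main-effects-plus-interactions model
        return 125 + 1.2 * (temp - 67.5) - 4.0 * (blend - 3.5) - 0.08 * (cool - 105)

    lower, upper = [60.0, 2.0, 60.0], [75.0, 5.0, 150.0]  # assumed factor ranges

    # step 3: space-filling design over the nominal settings
    design = qmc.scale(qmc.LatinHypercube(d=3, seed=1).random(128), lower, upper)

    # steps 4-5: simulate production noise at each setting; summarize by defect rate
    defect = []
    for temp, blend, cool in design:
        t = rng.normal(temp, 3.0, 2000)   # e.g., temperature sd of 3 degrees
        b = rng.normal(blend, 0.25, 2000)
        c = rng.normal(cool, 5.0, 2000)
        y = in_vitro_upper_cl(t, b, c) + rng.normal(0, 2.0, 2000)  # output variation
        defect.append(np.mean(y > 133.33))  # above the upper equivalence limit

    # step 6: Gaussian process emulator of defect rate vs. nominal settings
    gp = GaussianProcessRegressor(normalize_y=True).fit(design, defect)

    # step 7: pick the nominal settings that minimize the predicted defect rate
    grid = qmc.scale(qmc.LatinHypercube(d=3, seed=2).random(4096), lower, upper)
    best = grid[np.argmin(gp.predict(grid))]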

Some concluding remarks

When we run the simulation on the noise in the design factors (Figure 3) at the set points achieved in Figure 5 (Temp=75, Blending Time=2, Cooling Time=150.6), we achieve an overall desirability of 0.53 and a defect rate of 0.09. This setup is slightly worse than the setup optimized only on performance relative to target, but significantly better in terms of robustness.

Figure 5 suggests the possibility of achieving further reduction in the defect rate by moving the temperature and cooling time to higher levels and the blending time to lower levels. We set the factor ranges in our computer experiment to match the ones used in the experiment (but limited to values between the original operating proposal and the extreme experimental level in the direction we found best for robustness). We did not want to make predictions outside the experimental region, where our empirical models might no longer be accurate. In the context of QbD drug applications, extrapolation outside the experimental region is not acceptable.

In cases like this, it is often useful to carry out further experiments to explore promising regions. Extending the factor ranges in the directions noted above would let us model the responses there and assess the effects of further changes on robustness. For this application, the project team achieved a solid improvement and chose not to continue the experiment. In future blog posts, we will illustrate the benefits of including more than one phase in the QbD experimental program.

Stochastic emulators are a primary Quality by Design tool, which naturally incorporates simulation experiments in the design of drug products, analytical methods and scale-up processes. (For more on computer experiments and stochastic emulators, see Santner et al., 2004; Kenett and Steinberg, 2006; Levy and Steinberg, 2010; Kenett and Zacks, 2014.) For more examples of such experiments in the context of generic product design, see Arnon (2012, 2014).

In this blog post, we showed how combining physical full factorial experiments with a stochastic emulator leads to a robust setup within a design space.

Coming attraction

The next blog post will be dedicated to mixture experiments and compositional data, where the main interest is in the relative values of a set of components (which typically add to 100%), so that the factors cannot be changed independently.


References

  • Arnon, M. (2012). Essential aspects in the ACE and ANDA IR QbD case studies. The 4th Jerusalem Conference on Quality by Design (QbD) and Pharma Sciences, Jerusalem, Israel. http://ce.pharmacy.wisc.edu/courseinfo/archive/2012Israel/
  • Arnon, M. (2014). QbD In Extended Topicals, Perrigo Israel Pharmaceuticals. The 4th Jerusalem Conference on Quality by Design (QbD) and Pharma Sciences, Jerusalem, Israel. https://medicine.ekmd.huji.ac.il/schools/pharmacy/En/home/news/Pages/QbD2014.aspx
  • Bates, R., Kenett, R.S., Steinberg, D.M. and Wynn, H. (2006). Achieving Robust Design from Computer Simulations. Quality Technology and Quantitative Management, 3(2), 161-177.
  • Kenett, R.S. and Steinberg, D.M. (2006). New Frontiers in Design of Experiments. Quality Progress, August, 61-65.
  • Kenett, R.S. and Zacks, S. (2014). Modern Industrial Statistics: with Applications in R, MINITAB and JMP. John Wiley and Sons.
  • Levy, S. and Steinberg, D.M. (2010). Computer experiments: a review. AStA - Advances in Statistical Analysis, 94, 311-324.
  • Sacks, J., Welch, W.J., Mitchell, T.J. and Wynn, H.P. (1989). Design and analysis of computer experiments. Statistical Science, 4, 409-435.
  • Santner, T.J., Williams, B.J. and Notz, W.L. (2004). The Design and Analysis of Computer Experiments. Springer, New York, NY.
  • SUPAC (1997). Food and Drug Administration, Center for Drug Evaluation and Research (CDER), Scale-Up and Postapproval Changes: Chemistry, Manufacturing, and Controls; In Vitro Release Testing and In Vivo Bioequivalence Documentation. Rockville, MD, USA.

About the Authors

This blog post is brought to you by members of the KPA Group: Ron Kenett, Anat Reiner-Benaim and David Steinberg.




Potato chip smackdown: Winners and losers

When I was growing up in the Midwest (Columbia, MO, to be precise), flavored potato chips were a favorite of mine. Though I preferred sour cream and onion, barbecue would do in a pinch. Imagine my delight, then, when my colleague Ryan Lekivetz informed me that our neighbors to the North had an entire new range of chip flavors to try!

And imagine my disappointment when I found out that two of the most popular flavors were ketchup and dill pickle.

“Blech,” I told Ryan.

“You’re just saying that because of the name,” he replied.

JMP staff volunteered to taste 10 different potato chip flavors in our choice experiment.

So, as statisticians will, we agreed to find out the answer based on a designed experiment.

We selected 10 potato chip flavors, including Canadian Dill Pickle and Canadian Ketchup, and asked people to compare them. Because we were working with volunteers from the US, we decided we’d mix in a few US favorites like barbecue and my beloved sour cream and onion, but we’d also use the Lay’s Do Us a Flavor chips to see how the Canadian chips would fare against other unfamiliar flavors. In the end, we had 10 different types of chips for our volunteers to taste:

  • New York Reuben
  • Southern Biscuits and Gravy
  • West Coast Truffle Fries
  • Greektown Gyro
  • Ketchup
  • All Dressed
  • Dill Pickle
  • Southern Heat Barbecue
  • Barbecue
  • Sour Cream and Onion

As with the chocolate smackdown, we felt that asking people to taste 10 chips and rank them from best to worst would be too difficult. Instead, we needed a designed experiment to break the 10 chips into manageable groups. The twist here was that we'd not only ask for the "best" or "favorite" in each group, but we'd also ask for the "worst" or "least favorite" in each group. This would allow us to perform a MaxDiff (or Maximum Difference) analysis, another type of choice analysis that lets us extract more information from each choice task.
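To make the model concrete, here is the standard MaxDiff formulation (our notation, not from the original post): write \(u_j\) for the utility of flavor \(j\). Treating the best and worst picks as separate choices, for a choice set \(S\) we have:

Pr( flavor \(j\) is picked as best from set \(S\)) = \(e^{u_j} / \sum_{k \in S} e^{u_k}\)

Pr( flavor \(j\) is picked as worst from set \(S\)) = \(e^{-u_j} / \sum_{k \in S} e^{-u_k}\)

The sign flip in the "worst" probability is exactly what the -1 indicator coding described below implements.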

Ryan explains how he developed the experimental design in his blog post: Potato Chip Smackdown: US vs. Canada. Ryan's design gave each taster four sets of potato chips, each with three chips to taste. As before, there were four different surveys, giving us a larger total number of comparisons. We gathered 30 volunteers to do the tasting (which took more coaxing than our last experiment, where we asked our volunteers to taste chocolate). Then, it was time to analyze.

The analysis

The MaxDiff model can be analyzed as if the respondents are making two choices: first, they pick the best of the offered options; second, they pick the worst. Each pick is treated as a separate choice, so each respondent makes two choices for every group of chips he or she is shown. In the data, we represent this with two sets of entries for each choice set:

Data Table

If we focus on Choice Set ID 1, we see that there are three rows for the best choice and three rows for the worst choice. We use indicator variables to tell JMP which chips were in the set. For the rows that represent the best choice, the indicators are coded with a 1. For the rows that represent the worst choice, the indicators are coded with a -1. Each of the indicator columns is entered as a continuous variable in the choice model, so JMP is forced to make the parameters associated with the “Worst Choice” rows equal to the negative of the parameters associated with the “Best Choice” rows.
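To make the coding concrete, here is a small sketch in Python with pandas (the column names are illustrative, not JMP's; the real table has one indicator column per flavor) that builds the best and worst rows for a single choice set:

```python
# Sketch of the MaxDiff indicator coding for one choice set.
# Column names are illustrative; in the real table one flavor's indicator
# column is also left out of the model as the comparison level.
import pandas as pd

set_flavors = ["Ketchup", "Dill Pickle", "Barbecue"]  # the chips in this set

rows = []
for selection, sign in [("Best", 1), ("Worst", -1)]:
    for flavor in set_flavors:
        row = {"Choice Set ID": 1, "Best or Worst Selection": selection}
        # +1 indicators on the "best" rows, -1 on the "worst" rows, 0 elsewhere.
        for f in set_flavors:
            row[f] = sign if f == flavor else 0
        rows.append(row)

print(pd.DataFrame(rows))  # three "best" rows, then three "worst" rows
```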

In the model dialog, we make sure that the “Best or Worst Selection” column (the column that tells whether the rows are associated with the best choice or the worst choice) is added to the grouping columns:

Choice Dialog

We also have to select one flavor as our comparison flavor and leave its indicator out of the model (just like you would with any set of dummy-coded columns). In this case, I made sour cream and onion our comparison.

In order to get a ranking from best flavor to worst, I sorted the resulting report (right-click and select "Sort by column"). Barbecue and Southern Heat Barbecue are near the top. The likelihood ratio tests check whether each parameter is significantly different from 0, which in this case means whether each flavor is significantly different from sour cream and onion. It looks like the two familiar barbecue flavors are the favorites, while Gyro definitely trails the others.

Visualizing the results

I also created a graph of the results (right-click the estimates table and select "Make into Data Table"). Then I opened Graph Builder and followed the steps described in this blog post.


Judging from the graph alone, Truffle Fries is the next most popular after the barbecue flavors, followed by Biscuits and Gravy. The Canadian flavors are near the bottom, but the statistical tests tell us that if we'd sampled different people (or the same people on a different day), those flavors might sort higher.

So, who's right?

Maybe we were both right. Maybe I was just reacting to the name “ketchup chips."  Maybe if I tasted one of them in a blind test, I’d like them just as much as sour cream and onion. Maybe if I’d grown up in Canada, I’d like them better than barbecue.

I’d be really surprised, though, if Gyro won the “Do Us a Flavor” challenge.


Potato chip smackdown: US vs. Canada

Four rows of three potato chips

Each taster in the choice experiment got a set of potato chips like this. Can you guess which one is the ketchup-flavored chip?

I grew up in Canada, where ketchup potato chips were a staple at most children's birthday parties. As a huge fan of these ketchup chips, I was unsure whether my enjoyment of the wonderful flavor was simply nostalgia and whether other people unfamiliar with the flavor would show any love for ketchup chips. Similarly, I’ve noticed that dill pickle flavored chips and the ever-popular "all dressed" flavor are uncommon here in the US compared with Canada.

After our chocolate smackdown was such a success with the JMP group, Melinda Theilbar and I were looking for another opportunity to conduct a choice experiment. To make it fun, we wanted the experiment to contain a number of flavors new to the tasters. The fortuitous timing of Lay's new "Do Us a Flavor" chip campaign and a visit from my parents in Canada presented the perfect opportunity: potato chip smackdown!

While we figured that tasting chips would not be as popular as tasting chocolate, we also did not have as many worries about limiting the number of tasters due to limited resources -- the chip bags contained plenty of chips. The ideal mix of flavors for the experiment would be mostly unfamiliar flavors, with a couple of familiar ones thrown in. Although the chocolate experiment had six different chocolates across two factors (two origins and three cacao contents), this time we had just one factor with more levels. We ultimately ended up with 10 flavors:

  • New York Reuben
  • Southern Biscuits and Gravy
  • West Coast Truffle Fries
  • Greektown Gyro
  • Ketchup
  • All Dressed
  • Dill Pickle
  • Southern Heat Barbecue
  • Barbecue
  • Sour Cream and Onion

How to get the response?

What we want is a general ranking of the desirability of each flavor. One way to do this is to have tasters provide their own ranking of all the flavors. However, with so many flavors to taste (and since we didn't want to stuff people with potato chips), this kind of forced ranking seemed like a difficult task requiring lots of little bites.

Picking the favorite among a smaller group of choices is much less demanding for the tasters, but it carries less information. There are 45 (10 choose 2) different pairwise comparisons for the 10 potato chips. While we don't require all of these pairs to appear in the survey, giving tasters only five or six pairs of chips doesn't sound like we're getting much information.

Another option: Size 3

Instead of choice sets of size 2, we decided to use choice sets of size 3. Doesn't sound like a big deal? Consider this: in picking a favorite, we actually get comparison information for the favorite versus each of the chips that were not chosen.

That’s not all

In addition to reporting the favorite from each choice set, we also had the tasters report their least favorite flavor from each choice set. If we asked only for the favorite, we wouldn't get any information about how the two flavors not chosen compare with each other. By asking for both the favorite and the least favorite, we gain that information (see the quick count below).
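A back-of-the-envelope count makes the gain clear (assuming sets of size 3 and four sets per survey, as in the design described below): picking only the best pins down 2 pairwise comparisons per set, while picking both best and worst recovers the full ranking of the set, i.e., all 3 pairs.

```python
# Back-of-the-envelope count of pairwise comparisons recovered per survey.
from math import comb

set_size = 3
sets_per_survey = 4

best_only = set_size - 1            # favorite beats each unchosen chip: 2 pairs
best_and_worst = comb(set_size, 2)  # full ranking of the set: all 3 pairs

print(f"pairs per set, best only:      {best_only}")
print(f"pairs per set, best and worst: {best_and_worst}")
print(f"pairs per survey (4 sets):     {sets_per_survey * best_and_worst}"
      f" of {comb(10, 2)} possible")
```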

OK, but can you still analyze it?

Analysis is going to require some extra work, but it’s certainly doable within JMP. Melinda will go through the analysis next time (sorry, you’ll still have to wait for the results – no hints yet!).

The design

We can use the Custom Designer to create a design in much the same way that we did for the chocolate experiment. We decided that each taster would get four choice sets and that we would create four different surveys.

1. Add a false very-hard-to-change (VHTC) factor and a false hard-to-change (HTC) factor to set up random blocks corresponding to surveys and choice sets, along with a 10-level categorical factor for flavor.


2. Remove the false VHTC and HTC effects from the model.


3. Set the number of whole plots to the total number of surveys (4), the number of subplots to the total number of choice sets (16 = 4*4), and the run size to the number of choice sets times the number of chip flavors per choice set (48 = 16*3). A rough sketch of the resulting layout appears below.
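For readers without JMP handy, here is a rough sketch of the layout these settings produce (a naive random assignment for illustration only; the Custom Designer optimizes which flavors go into which set):

```python
# Rough sketch of the survey / choice set / flavor layout: 4 surveys x 4 sets
# x 3 chips = 48 runs. Random assignment shown here; JMP optimizes it.
import random

random.seed(1)
flavors = [
    "New York Reuben", "Southern Biscuits and Gravy", "West Coast Truffle Fries",
    "Greektown Gyro", "Ketchup", "All Dressed", "Dill Pickle",
    "Southern Heat Barbecue", "Barbecue", "Sour Cream and Onion",
]

rows = []
for survey in range(1, 5):                        # 4 surveys (whole plots)
    for cs in range(1, 5):                        # 4 choice sets each (16 subplots)
        for flavor in random.sample(flavors, 3):  # 3 chips per set (48 runs)
            rows.append((survey, (survey - 1) * 4 + cs, flavor))

print(f"{len(rows)} runs")
for survey, choice_set, flavor in rows[:3]:       # peek at the first choice set
    print(survey, choice_set, flavor)
```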


Final thoughts

Melinda and I have completed this experiment, but I don't know the results yet. The choice sets of size 3 did seem to be a good choice in this case, as the tasters found it reasonably easy to rank each set they were presented with. I didn't get the sense that there was a clear-cut winner as the results were being collected, but there's one particular flavor that I suspect will not fare very well. Thanks for reading!
