Learning about new self-tracking technology at QS15

My talk at the opening plenary at QS15 included this and many other visualizations. (Photo used with permission of @whatify on Twitter.)

I recently attended the QS15 Quantified Self Conference and Expo in San Francisco. This three-day event brought self-tracking technology providers together with users who are actively collecting and analyzing their own data. The conference is organized by QS Labs, a Bay Area company founded by Kevin Kelly and Gary Wolf. These two Wired magazine editors collaborated on early work on the Quantified Self and organized the first QS meetup group. Today, hundreds of QS groups meet around the world.

Even if the concept of the quantified self (QS) is new to you, you probably know someone who uses a Fitbit, an Apple Watch or a smartphone to track their steps, or perhaps an app like MyFitnessPal to track their daily eating habits. If you’ve read earlier entries in my Fitness and Food blog series, then you know step tracking and meal logging are two of the daily practices I have adopted to help me understand and manage my own weight fluctuations over time.

Like many quantified selfers, I collect data on a biomarker that is of great personal interest to me because it has been a trouble spot throughout my adult life. But weight is only one of many outcomes being measured by those in the QS community. The interests of this group include a variety of health-related measures, as well as life logging, dream tracking, assessing personal productivity and cognitive function, and much more -- as you can see from the conference’s packed agenda of show-and-tell sessions, small group discussions, and toolmaker talks. I had a chance to study others’ data collection efforts and their visualization and analysis approaches, and I even identified several new metrics I want to track. I tried out various new tracking technologies in the Expo Hall and have already adopted a few of them. If only my QS budget were unlimited...sigh!

Discovering the QS community
In retrospect, it seems obvious that my journey to improve my health metrics would eventually lead me to connect with the vibrant and varied QS community. When I first learned about QS in 2010, I was already well aware of the value of self-tracking, having collected years of diet and exercise data in notebooks. Between mid-2009 and late 2010, while losing 45 pounds in preparation for a second pregnancy, I relied on my daily records to motivate and educate myself about what was and wasn’t working for me. When I saw Gary Wolf’s Quantified Self TED Talk in 2010, I was thrilled to hear about new devices and apps that could help me move aspects of my monitoring process off the page and into smart phone apps and web sites with data export capabilities. Around that time, I began collecting activity and food logging data with a BodyMedia FIT armband, which was the basis of the e-poster I presented at the last JMP Discovery Summit.

Although I collected data with my armband from December 2010 until March 2015, my engagement with the QS community started much more recently. My experience with my own data was largely solitary until 2014, when I finally tackled my data export/import challenges and got my data into JMP. Later that year, my dad saw a NOVA special on QS and sent me the link to Gary’s 2010 talk. Déjà vu hit me the moment I began to watch it, and I realized I had seen it long before. At the time, I found it amusing that my dad was so relieved that there was a whole movement of people just as obsessed with their data and devices as I was with mine! I posted a link to my Discovery Summit project on the QS forum, which helped me connect with QS Labs program director Ernesto Ramirez for a podcast and eventually resulted in an invitation to speak at the conference.

Using JMP to visualize -- and understand -- the quantified self
I had the chance to tell Gary Wolf the story of how his 2010 talk inspired me just before I stepped on stage to open the conference with my short show-and-tell talk. Having the opportunity as a total QS newbie to share what I did, how I did it, and what I learned in the opening plenary session was an experience I will never forget. Steven Jonas from QS Labs reviewed my talk with me before the conference and shared his thoughtful perspective on my slides. Steven suggested that I put the visualizations of my data front and center, which I did.

Stepping on stage with the full power of JMP -- especially Xan Gregg’s Graph Builder platform -- behind me felt pretty amazing. Besides using that platform daily to explore my work-related and personal data sets, I have often relied on Xan’s expertise to suggest improvements and simplifications to my graphs. Xan (who will be a keynote speaker at this year’s Discovery Summit) often posts graph makeovers here on the blog, and in fact, I’ve shown “before and after” versions of graphs in my posts that Xan helped me improve. Graph Builder was a popular topic in the informal “office hours” session I hosted at QS15, where conference attendees dropped by to see my projects and ask questions about JMP. I hope I have the chance to return to the conference in the future to share the insights I know I will gain over my next year of data collection and analysis.

My favorite new devices
In keeping with the technology focus of the QS15 conference, I thought it would be fun to highlight what I have learned from exploring data from several new tracking devices I've adopted recently. I hope my coming posts will give you some ideas about new things to try with your own data tables!

  • Skulpt Aim. The Aim is a handheld device just a little thicker than an iPhone that you can use to assess body fat percentage and muscle quality for a variety of distinct muscle areas in your body. I am not exaggerating when I say that I am obsessed with the Aim. I think it has the potential to be the most useful device I have encountered yet for tracking incremental progress from my strength training workouts and for assessing the impact of targeted weight training and of rest weeks when I take a break from my training sessions.
  • Fitbit Charge HR. I acquired this activity tracker about four months ago. Unlike with my BodyMedia armband, I have worn the Charge HR very consistently during the past few summer months. I have seen a few things I expected in the data from this new device, but I’ve also seen some surprises that are worth considering. I’ll share some of those in a later post. (By the way, if you are a Fitbit user who didn’t want to pay for your data export when it was a premium feature, Fitbit recently made export free to all users. At the moment, you can only export one month at a time, so I still prefer to use Eric Jain’s Zenobase tool to retrieve and store my BodyMedia and Fitbit data. I had the pleasure of meeting Eric at QS15, where he hosted a session about his web-based software.)
  • Push Strength. This is an armband that I use to monitor my weight training workouts. Fortunately, there are no tan lines to worry about with this one since I only wear it indoors during weight training workouts. The Push band complements the information I already have about the weights, sets and reps completed during past workout sessions, providing me with rep-level data about the velocity, power and timing of my reps (among other metrics). The Push app automatically summarizes total workout length and total active time during a workout session. I'll be presenting on my weight training data at JMP Discovery Summit in San Diego in September, and you can get a sneak peek at that material in my past blog posts on the topic. If you come to my talk, you will see how to apply the approaches I used to drill into my weight training workout data to your own custom maps. To get even more details about this feature while at the conference, be sure to stop by developer Dan Schikore’s poster on selection filters, introduced in JMP 12 to allow you to use graphs as data filters.
  • Polar H7. I picked up this relatively inexpensive Bluetooth-enabled heart rate monitor strap after learning about heart rate variability studies at the conference. As I do with many of my devices when I first acquire them, I've been collecting data with apps over several weeks to get a sense of what my normal variability patterns are.

I hope you will join me for later posts about my data, and come find me if you’ll be at Discovery Summit in San Diego this September!

Using data points to fill a graphic with color

When dealing with graphs and plots, we will very likely need to fill part of a graph with color to highlight an area or distinguish it from other shapes. You may know how to shade regular shapes, but what about irregular polygons and contours? You can do this easily for any graphical shape using Polygon(). We just need the data points!

Above is a graph I was working on for patient recruitment prediction (Look for a blog post about this soon). The dash-dot lines are 2.5th and 97.5th percentiles of 10,000 simulations. The solid red line is the mean of the simulated values. Filling this confidence interval with color would help people understand the scope of the enrollment prediction, so I explored how to do that. In the beginning, I specified the three vertices of the triangle and got this:







Hmm…not quite what I expected. What’s the problem? Apparently this graphic is not exactly a triangle, although it sure looks like one. The bottom part is wider, and therefore the triangular shading could not cover the entire confidence interval band. How can we fix this?

To fill a graphic with color, you first need to know the boundary of your shape and the data points that make up the boundary. In this case, my boundaries are three sides of the “triangle,” and the data points are the 2.5th percentiles and 97.5th percentiles. Note that I do not need the third side, the horizontal line segment, because once I construct the two sides, JMP automatically connects the two end points with a straight line. But if you have a segment that's not straight, you would need data points to tell JMP where the shading stops.

Next, in JSL, we pull out the data points from the data set, save the data in columns and sort them.

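// Get a reference to the open data table that holds the simulation summary.
ds = Data Table( "graphbuilder" );
// Lower boundary: subset the 2.5th percentile rows and sort by ascending time.
ds << Select Where( :data == "Predicted 2.5%" );
dslow = ds << Subset(
	Output Table Name( "Lower Percentile" ),
	Selected Rows( 1 )
);
dslow_sort = dslow << Sort( By( :time ), Order( Ascending ) );
// Upper boundary: subset the 97.5th percentile rows and sort by descending time.
ds << Select Where( :data == "Predicted 97.5%" );
dsupp = ds << Subset(
	Output Table Name( "Upper Percentile" ),
	Selected Rows( 1 )
);
dsupp_sort = dsupp << Sort( By( :time ), Order( Descending ) );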

The two subsets, dslow and dsupp, contain data points of our boundary, the two dash-dot lines. Note that I sorted dslow by ascending order but dsupp by descending order since JMP draws the boundary with direction. I chose to draw the CI band clockwise: starting from the intersection point of the blue line and the red lines, going up following the 2.5th percentile, and coming back following the 97.5th percentile from the horizontal side. Thus, the data points of the 97.5th percentile need to be in descending order.
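The ordering rule is independent of JMP, so here is a minimal sketch of it in Python rather than JSL. The boundary points are made up for illustration and are not the data behind the graph; `band_vertices` and `shoelace_area` are hypothetical helper names. The sketch walks the lower curve by ascending time and the upper curve by descending time, then uses the shoelace formula to show that this ordering encloses the whole band, while sorting both curves the same way does not.

```python
# Hypothetical boundary points (time, value); not the data from the post.
lower = [(0, 0), (1, 2), (2, 5), (3, 9)]     # 2.5th percentile
upper = [(0, 0), (1, 4), (2, 9), (3, 15)]    # 97.5th percentile


def band_vertices(lower_pts, upper_pts):
    """Walk the lower curve left to right, then the upper curve right to left."""
    lower_sorted = sorted(lower_pts)                # ascending time
    upper_sorted = sorted(upper_pts, reverse=True)  # descending time
    return lower_sorted + upper_sorted


def shoelace_area(vertices):
    """Absolute area enclosed by a closed polygon given as ordered vertices."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the outline
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


ring = band_vertices(lower, upper)
both_ascending = sorted(lower) + sorted(upper)  # outline doubles back on itself

print(shoelace_area(ring))            # area of the full band
print(shoelace_area(both_ascending))  # smaller: part of the band is lost
```

With these made-up points the gap between the curves is 0, 2, 4 and 6 at the four time steps, so the band area by the trapezoid rule is 9, which is what the correctly ordered outline encloses.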

Now that we’ve got the data points sorted, we need to put them together in a matrix named “all”. The x values in the matrix are stored in “xall” and the y values in “yall”. Here, time is on my x-axis and number is on my y-axis.

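// Pull the x (time) and y (number) columns out of each sorted table as matrices.
col = Column( dslow_sort, "time" );
tl = col << Get As Matrix;
col = Column( dslow_sort, "number" );
nl = col << Get As Matrix;
col = Column( dsupp_sort, "time" );
tu = col << Get As Matrix;
col = Column( dsupp_sort, "number" );
nu = col << Get As Matrix;
// Join time and number side by side (||), then stack lower on top of upper (|/).
lower = tl || nl;
upper = tu || nu;
all = lower |/ upper;
// A row index of 0 selects every row, so these are the full x and y columns.
xall = all[0, 1];
yall = all[0, 2];
// The drawing commands below belong inside a graphics script.
Pen Color( "red" );
Fill Color( "red" );
Transparency( 0.2 );
Polygon( xall, yall );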

The final step is easy: Pass the x and y matrices to Polygon() and fill in the color. Use the code from the second box above in your Graph Builder statement or in a new window. You should now have a triangular band perfectly shaded in red with some degree of transparency!

Final thoughts: You can fill an area of any shape with color as long as you have the data points. Below is another example of graphic script shading, which demonstrates just one of the many variations you can create by changing the options.


Have fun shading!

The QbD Column: A QbD fractional factorial experiment

The first two posts in this series described the principles and methods of Quality by Design (QbD) in the pharmaceutical industry. The focus now shifts to the role of experimental design in QbD.

Quality by Design in the pharmaceutical industry is a systematic approach to development of drug products and drug manufacturing processes. Under QbD, statistically designed experiments are used to efficiently and effectively investigate how process and product factors affect critical quality attributes. They lead to determination of a “design space,” a collection of production conditions that are demonstrated to provide a quality product. A company making a QbD filing gets regulatory relief, meaning that changes in set-up conditions within the design space are allowed without needing pre-approval. The statement often heard is that the QbD design space is about moving from “tell and do” to “do and tell.” As long as changes are within the design space, regulatory agencies need only be informed about them.

Study of nanosuspensions: An example
In this post, we look at considerations in planning a statistically designed experiment, collecting the data, carrying out the statistical analysis, and drawing practical conclusions in the QbD context. The example is based on Verma et al. (2009).

The goal of the experiment is to explore the process of preparing nanosuspensions, a popular formulation for water-insoluble drugs. Nanosuspensions involve colloidal dispersions of discrete drug particles, which are stabilized with polymers and/or surfactants, such as DOWFAX. Nanosuspensions achieve improved bioavailability by using small particles, which increases the dissolution rate for drugs with poor solubility. After beginning with larger particles, the process then uses milling to reduce their size. The current study examines the use of microfluidization at the milling stage.

What were the CQAs?
Verma et al. studied several critical quality attributes (CQAs) in the experiment: the mean size of the particles at the end of the milling stage and after four weeks of storage; and the zeta potential, which serves as an indicator of stability by measuring the electrostatic or charge repulsion between particles. Verma et al. provided additional data that track the particle size distribution throughout the milling process, as well as both particle size and zeta potential during storage. We will focus here on the outcomes obtained at the end of milling and storage. In a subsequent post, we will describe an analysis that takes into account the trajectories of these outcomes over time.

Although Verma et al. did not state specific target values for these CQAs, they wrote that mean particle size should be as small as possible and zeta potentials should be as far from 0 as possible (indicating greater stability).

What were the process factors?
The team chose to include five different factors in the study. The factors are listed in the table below, along with their experimental settings. The first factor, indomethacin, is the drug that was prepared by nanosuspension. As noted, there were three different concentrations of the drug.

What was the experimental design?
The experiment was planned as a two-level fractional factorial with center points. The fractional factorial was a 2^(5-1) design that used the extreme levels of each of the quantitative factors. Six center points were added. The 2^(5-1) design permits estimation of all the main effects and all the two-factor interactions.

The center points make several useful contributions to the design. First, they can be used to compute estimates of experimental variation that do not depend on choosing some model to fit to the data. Second, they provide some check as to whether there is a need for pure quadratic terms in relating the CQAs to the factors. If so, then they help us to remove bias from our estimates near the middle of the design region. And if not, then they lead to better variance properties for our predictions in the center of the region. Figure 1 shows the JMP Variance Profiler for the design, assuming only main effects are important, and taking the center point for the numerical factors as the point of reference. If the six center points are deleted, the variance at the reference location increases by almost 40%, from 0.091σ² to 0.125σ². The design evaluation tool is very helpful here in deciding how many center points to include.
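The two variance figures can be checked by hand. The sketch below is plain Python, not JMP output, and rests on assumptions that are not in the source: the half fraction is built with the generator E = ABCD, the fifth column stands for the stabilizer coded as -1/+1, and the six center points are split three per stabilizer. Under a main-effects model these columns are mutually orthogonal, so the relative prediction variance x0'(X'X)^-1 x0 reduces to a sum of squared coordinates divided by column sums of squares.

```python
from itertools import product


def model_row(a, b, c, d, e):
    """Intercept plus five main effects, in coded units."""
    return [1, a, b, c, d, e]


# 16-run half fraction: full factorial in A-D, fifth column from E = A*B*C*D.
# The generator and the choice of E as the stabilizer are assumptions.
factorial = [model_row(a, b, c, d, a * b * c * d)
             for a, b, c, d in product([-1, 1], repeat=4)]

# Six center points: numeric factors at 0, three runs per stabilizer level.
centers = [model_row(0, 0, 0, 0, e) for e in (-1, 1) for _ in range(3)]


def is_orthogonal(rows):
    """True when every pair of model columns has a zero inner product."""
    ncol = len(rows[0])
    return all(sum(r[i] * r[j] for r in rows) == 0
               for i in range(ncol) for j in range(i + 1, ncol))


def relative_variance(rows, x0):
    """x0' (X'X)^-1 x0, valid only because the model columns are orthogonal."""
    assert is_orthogonal(rows)
    return sum(x0[j] ** 2 / sum(r[j] ** 2 for r in rows)
               for j in range(len(x0)))


# Reference point: numeric factors at their center, one stabilizer chosen.
x0 = model_row(0, 0, 0, 0, 1)
with_centers = relative_variance(factorial + centers, x0)
without_centers = relative_variance(factorial, x0)
print(round(with_centers, 3), round(without_centers, 3))
```

Under these assumptions the values are 2/22 and 2/16, which agree with the 0.091 and 0.125 quoted above; their ratio is 1.375.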

Figure 1: The Variance Profiler for the 22-run design with six center points.


Figure 2: Choosing a design with the Screening Design platform.

How can we generate the design?
Textbook examples typically show center points when all factors are numerical and can be set to an intermediate level. In this experiment, four factors are numerical, but one, the choice of stabilizer, is categorical. The natural choice in this setting is to use each stabilizer for half of the center points, which was the design we used here.

It is easy to construct the design in JMP with the Screening Design platform in DOE. Enter the four numerical factors with their extreme settings, and enter the stabilizer as a two-level qualitative factor. Choose the 16-run fractional factorial without blocking (which was not needed in this study). Then indicate that you want six center points. Figure 2 shows the final screen in this process. Clicking on Make Table produces the design that we used in this study.

What did the analysis find?
Our analysis includes all the main effects and two-factor interactions. How should we handle the center points? There are two possibilities. If we ignore the center points in specifying the model, JMP will recognize them and will include a Lack of Fit summary in the output. The summary compares the average at the center points to the average of the factorial points and also computes the “pure error” variance estimate from replicates at the center. For the Verma et al. experiment, there are really two center points, one for each stabilizer, and hence two center point averages. Consequently, the six center points provide 2 df for lack-of-fit (one for each stabilizer) and 4 df for pure error.

In the second approach, we begin by generating a new “center point” column in the data sheet that is equal to 1 for the center points and 0 for the factorial points. Including a main effect for the center point column and its interaction with stabilizer picks up the 2 df for lack of fit. The model now has exactly 4 df for error, corresponding to the pure error. The advantage of this method is that the p-values for the factorial effects will now be computed using the pure error variance estimate.
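The degrees-of-freedom bookkeeping behind both approaches can be written out as a short tally. This is only arithmetic on the run counts given above; the even three-and-three split of center points between DOWFAX and HPMC follows the design description, and the variable names are illustrative.

```python
from math import comb

n_factorial, n_center = 16, 6
n_runs = n_factorial + n_center

# Factorial model: intercept, 5 main effects, all two-factor interactions.
n_model_terms = 1 + 5 + comb(5, 2)

# Pure error comes from replicates within each distinct center condition.
center_replicates = {"DOWFAX": 3, "HPMC": 3}
df_pure_error = sum(n - 1 for n in center_replicates.values())

# First approach: center points left out of the model specification.
df_residual = n_runs - n_model_terms
df_lack_of_fit = df_residual - df_pure_error

# Second approach: add the center-point indicator and its interaction
# with stabilizer, which absorbs the lack-of-fit degrees of freedom.
df_error_second = n_runs - (n_model_terms + 2)

print(df_residual, df_lack_of_fit, df_pure_error, df_error_second)
```

The tally gives 6 residual df in the first approach, split into 2 for lack of fit and 4 for pure error, and exactly 4 error df in the second.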

Figures 3, 4 and 5 show the sorted effects for the factors and their interactions. The effects are given in terms of regression coefficients. Because the factors were entered here with real levels, not generically coded to -1 and +1, the size of the coefficients depends not just on the strength of the factor, but also on the spread of the levels. The spread also affects the standard error of the coefficients. Consequently, the JMP Factor summary sorts and graphically displays the strength of the effects in terms of their t-statistics.

Figure 3: Sorted parameter estimates for the mean particle size at the end of the milling phase.


Figure 4: Sorted parameter estimates for the mean particle size at the end of the storage phase.


Figure 5: Sorted parameter estimates for the zeta potential.

What can we conclude about particle size?
The mean particle size ranged from below 500 to almost 1,000 nanometers after milling. Typically, there was a modest increase in mean size after storage. Figure 6 shows the high correlation between the mean particle sizes before and after storage. Not surprisingly, the same factors show up as dominant in analyzing both of these responses. The strongest factor is processing pressure, with substantially lower mean particle size at high pressure.

The stabilizer is also important, with DOWFAX resulting in smaller particle sizes. Increasing the concentration of the stabilizer has a modest reduction effect, which is statistically significant after milling but at the border of significance (p=0.035) after storage. The concentration of indomethacin and the processing temperature have relatively small effects. Both reach statistical significance after milling, but they are very small compared to the effects of pressure and stabilizer.

The analysis clearly shows that increasing pressure and using the DOWFAX stabilizer are the keys to achieving small particle size. Increasing the concentration of the stabilizer reduced particle size after milling, but that effect was small after storage. Lower concentrations of indomethacin and higher temperatures also led to smaller particle sizes, but these effects were small by comparison to those of pressure and stabilizer.

The center point average is significantly higher than that at the factorial points, suggesting some nonlinearity. A first guess would be that pressure is the factor with the nonlinear effect, as it is clearly the strongest numerical factor. There are also significant center point by stabilizer interactions. Figure 7 shows a plot of mean particle size after milling against pressure, with points coded by stabilizer. The results with HPMC appear quite linear, but those for DOWFAX show clear curvature. This suggests that, with the DOWFAX stabilizer, the benefits from increasing pressure are not as pronounced once the pressure exceeds 14,000 psi.

Figure 6: Plot of mean particle size after 28 days of storage vs. mean particle size after 90 minutes of milling, before storage.


Figure 7: Plot of mean particle size after 90 minutes of milling vs. pressure, coded for stabilizer. Pressure is given in units of 1,000 psi.

What can we conclude about the zeta potential?
The zeta potentials ranged from -8 to -76. As all are negative, factor settings that lead to “more strongly negative” potentials are desired. The dominant factor affecting the zeta potential was the stabilizer, with DOWFAX leading to results lower, on average, by about 53 units than HPMC. There was also a modest effect of the stabilizer concentration. The effect of the concentration was limited to the formulations with DOWFAX, with only a small effect using HPMC. It is easy to see this change in the effect of the concentration using the Prediction profiler. Figure 8 shows the profiler for the “preferred conditions”: DOWFAX, high stabilizer concentration and high pressure, leaving indomethacin and temperature at their center levels. Sliding the vertical bar for stabilizer to the right end of the plot (for HPMC) shows that the effect of concentration on the zeta potential becomes much weaker.

Figure 8: Prediction Profiler at the preferred conditions for achieving small particle size and large absolute zeta potentials.

Recommendations for design space
The experiment showed that small mean particle size can be achieved by using high pressure and the DOWFAX stabilizer. There is a modest effect of stabilizer concentration, with higher concentrations giving smaller particle size. The primary effect on the zeta potential is the stabilizer; again, DOWFAX produced better results. With specification limits for these outcomes, the design space can then be determined much as we did in the second post of this series. The small effect of temperature implies that it can be set largely by other criteria, such as ease of operation or reduced cost, without harmful effects on the CQAs. The small effect of the indomethacin concentration implies that the findings are broadly applicable to the production of this drug.

Coming attractions
In the next post, we will show how to achieve a robust design set up, within a design space, using a stochastic emulator.

Verma, S., Lan, Y., Gokhale, R. and Burgess, D.J. (2009). Quality by design approach to understand the process of nanosuspension preparation. International Journal of Pharmaceutics, 377, 185-198.

About the Authors

This series is brought to you by members of the KPA Group: Ron Kenett, Anat Reiner-Benaim and David Steinberg.

Graph Makeover: Where same-sex couples live in the US

The following map appeared in an article titled "Where Same-Sex Couples Live" in the Upshot section of The New York Times shortly after the US Supreme Court ruled that the Constitution grants the right to same-sex marriage throughout the US.


The map coloring shows the proportion of same-sex couples in each county in 2010. The numbers are necessarily approximate; the Upshot uses Gary Gates’ adjustments of the raw American Community Survey (ACS) data to account for coding errors.

One feature of the map that struck me as remarkable is the amount of variability throughout the country. However, it also reminded me of Howard Wainer’s chapter on “The Most Dangerous Equation” in his book Picturing the Uncertain World. He calls it “de Moivre’s equation,” which dictates that smaller samples have increased variability. In particular, the standard deviation about the mean of a group of samples is inversely proportional to the square root of the sample size. He gives examples based on disease rates and school performance. In each case, smaller population samples yield the highest and the lowest rates.

It’s not hard to see why with an example from this data set. Douglas County, South Dakota, has one of the highest proportions of same-sex couples in the country at 17.4 per 1,000, while nearby Hanson County has the minimum of 0. Each of these counties has fewer than 1,500 households, and given the sampling rate of the ACS for South Dakota, we can estimate that fewer than 30 households were sampled in each county. So one same-sex couple in 30 respondents for Douglas County looks like a relatively large proportion even after Gates’ adjustments. Meanwhile 0 same-sex couples in 30 looks like none for all of Hanson County.
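To make the sample-size effect concrete, here is a small Python sketch of de Moivre's relationship for a proportion. The underlying rate of 5 per 1,000 is a made-up value for illustration, not a figure from the Upshot data, and `standard_error` is a hypothetical helper name. The sketch also shows how coarse the raw rate is when only about 30 households are sampled.

```python
from math import sqrt


def standard_error(p, n):
    """Standard error of an observed proportion; shrinks like 1/sqrt(n)."""
    return sqrt(p * (1 - p) / n)


p = 0.005  # hypothetical underlying rate: 5 couples per 1,000 households

for n in (30, 3_000, 300_000):
    se_per_1000 = 1000 * standard_error(p, n)
    print(f"n = {n:>7}: standard error = {se_per_1000:.2f} per 1,000")

# With 30 sampled households the raw rate can only be 0, 1/30, 2/30, ...
smallest_nonzero_per_1000 = 1000 * 1 / 30
print(f"smallest nonzero raw rate at n = 30: {smallest_nonzero_per_1000:.1f} per 1,000")
```

Each hundredfold increase in sample size cuts the standard error by a factor of ten, and a county with 30 respondents can only report a raw rate of zero or at least about 33 per 1,000, which is why the smallest counties land at both extremes.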

There are “small area estimation” techniques for dealing with some of these problems. For instance, averaging nearby counties together can help smooth the extremes at the expense of possibly losing information. Another technique is to combine successive years of data, but in this case, 2010 was the first year the survey asked about unmarried partners.

My interest, though, is in finding a way to better see “where same-sex couples live,” which is the title of the Upshot article. The text of that article is careful to compare only rates for large counties, but the map has no such qualifications. Can we show the uncertainty somehow?

Funnel Plot

One graphical technique for understanding proportions with different sample sizes is a funnel plot. A funnel plot is a scatterplot of the proportions versus their corresponding sample sizes. For small sample sizes, you expect more variation. Assuming all of the samples are from the same population, we can draw curves that correspond to where we expect most of the proportions to fall. Dots far outside the curves are likely outliers, possibly not really part of the same population after all.

Here’s a funnel plot of the same-sex couple proportions, with some of the points labeled. For counties containing large cities, I’ve used the city name as the label.


The orange line shows the (weighted) grand mean, and the curves show the confidence intervals based on de Moivre’s equation. With 3,100 counties, we’d expect a few just outside the 99.8% interval assuming a normal distribution. But instead, we have many such counties, some of them far outside the interval. This suggests there is something systematically different about them, not just common random variation. Looking at the labels, we can see many large cities and a few college towns.
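As a rough sketch of how such curves and the "expect a few" statement can be computed, the Python below uses a normal approximation around the grand mean. The grand mean value is hypothetical, the function names are illustrative, and this is not the script behind the plot.

```python
from math import sqrt
from statistics import NormalDist


def funnel_limits(p_bar, n, coverage):
    """Lower and upper funnel limits at sample size n (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    half_width = z * sqrt(p_bar * (1 - p_bar) / n)
    # A proportion cannot go below zero, so clip the lower limit.
    return max(0.0, p_bar - half_width), p_bar + half_width


def expected_outside(n_counties, coverage):
    """Counties expected outside the limits if all shared one true rate."""
    return n_counties * (1 - coverage)


p_bar = 0.005  # hypothetical grand mean, not taken from the data

for n in (30, 3_000, 300_000):
    low, high = funnel_limits(p_bar, n, 0.95)
    print(f"n = {n:>7}: 95% limits = {1000 * low:.2f} to {1000 * high:.2f} per 1,000")

print(expected_outside(3100, 0.998))  # about 6 counties by chance alone
```

With 3,100 counties and a 99.8% interval, only about six counties would fall outside by chance, so dozens of points far beyond the outer curve point to real differences.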

Z-Score Map

We can see that Douglas County has the sixth-highest proportion of same-sex couples, according to the adjusted data -- but it’s well below the 95% line, and so it's not that remarkable. How can we represent that on a map? One idea is to color each county by its distance above or below the mean relative to the curves. That is, we color it by a z-score, the number of standard errors above or below the mean. The inner curve represents a z-score of 1.96, and the outer curve represents a z-score of 3. Here is the resultant map.


I made a custom blue-gray-red color scale with extra dark colors at the extremes and mapped to the z-score range -4 to 4, based on the mean of the non-outliers. All extreme outliers show up as the same dark red or blue, which loses some information at the edges so the middle doesn't get washed out. Of course, the proportion can’t go below zero, so only counties with large sample populations could show up as low extremes, but none do.
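The coloring rule can be sketched in a few lines of Python. The reference mean and the use of roughly 30 sampled households as the sample size are assumptions for illustration; the source does not state which sample size or mean value was used, and `z_score` and `clamp` are hypothetical helper names.

```python
from math import sqrt


def z_score(p, n, p_bar):
    """Number of standard errors that proportion p lies from the mean p_bar."""
    return (p - p_bar) / sqrt(p_bar * (1 - p_bar) / n)


def clamp(z, low=-4.0, high=4.0):
    """Pin extreme z-scores to the ends of the color scale."""
    return max(low, min(high, z))


p_bar = 0.005  # hypothetical reference mean

# A small county with a high raw rate (17.4 per 1,000, about 30 respondents).
small_county = z_score(0.0174, 30, p_bar)

# The same raw rate in a county with 30,000 respondents is far more extreme.
large_county = z_score(0.0174, 30_000, p_bar)

print(round(small_county, 2), clamp(small_county) == small_county)
print(round(large_county, 2), clamp(large_county))
```

Under these assumed inputs the small county's z-score stays below 1.96 and keeps a neutral color, while the same rate in a large county is clamped to the darkest red at 4.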

From this map, we can see the following:

  1. There are some extreme high outliers, mostly around big cities.
  2. If you know your US geography, you can notice that some smaller cities like Madison, Wisconsin, and Asheville, North Carolina, have high z-scores. (See my JMP table for an interactive version with hover labels.)
  3. Most of the country is grayish and in the range of unremarkable variation, neither obviously higher nor lower than the mean. South Dakota certainly has less extreme variation than before.
  4. We can still see a cluster of high values in New England though it's more concentrated than before.
  5. The cluster of higher values in northern Wisconsin is still there but less pronounced.

The map also reminds us of some shortcomings of choropleth maps in general. In addition to the inaccuracy of color, the areas are bound to political borders and have irregular sizes. Some counties, such as San Francisco in California, are so small that we can barely see them, and others, such as those in the Southwest, have big areas that are dominated by localized populations.

Extreme Values

If we really want to know where the same-sex couples live, we might pair the map with a chart of the highest- and lowest-scoring counties. Here are the counties with a z-score above 4. There are none below -4 or even -3; Utah County, Utah, has the lowest z-score at -2.98.



This has been a challenging data set to work with. While it’s interesting and potentially insightful, we have to remember that the sampling rates are low in places and the proportions are small enough that they can be affected by coding errors and other behaviors. Furthermore, proportions don't behave quite like "regular" numbers, but the Central Limit Theorem and the large number of counties give us some comfort in using a normal distribution. With cleaner, integer data, we could have used a binomial distribution for the confidence curves, as in Rick Wicklin's post on funnel plots.

My data file with graph scripts is available in the JMP User Community, and an add-in for making funnel plots in JMP will be available soon.

Why it's important to brainstorm factors and levels in a designed experiment

The best time to plan an experiment is after you’ve done it – R.A. Fisher

If you’ve read through my previous blog posts, you know that I usually mention issues discovered during an experiment that I would change if I were to do the experiment again, or things to consider in subsequent experiments. While I sometimes mention the struggles with choosing an appropriate response, I don’t typically dwell on the importance of choosing the factors and levels. However, choosing ranges poorly and neglecting to consider some factors can diminish the value of the results from an experiment.

three customized diecast cars

I could have gotten more bang for my buck in my designed experiment had I brainstormed the factors and levels with a colleague first. (Photo courtesy of Caroll Co)

A recent experiment I blogged about involved dyeing toy cars based on a number of suggestions I found online. After the first blog entry was posted, there was a knock on my office door from my colleague Lou Valente. He had some ideas for how I might modify the design through different factor ranges and additional factors.

Unfortunately, the experiment was already complete and the data collected, but the experience was a useful reminder of the value in soliciting feedback from different members of a team, particularly when you have access to a domain expert. Lou’s chemical manufacturing experience and passion for DOE gave him insights that escaped my searches. A short discussion with Lou provided far more ideas and understanding of the situation than the few hours I spent searching online.

It can also be useful to have a fresh set of eyes of someone without the expertise, since they may have ideas that wouldn’t even occur to an expert. I won’t delve into the ideas that Lou and I talked about right now, but you can expect to see some of the ideas reflected in a future blog post. I have already been purchasing extra cars for the next experiment.

Fortunately, we did have some positive results in the toy car experiment. Even if the results were not ideal, experimentation is a sequential process. A designed experiment leaves you with more knowledge about the system than you had before, and almost always provides directions to look to for the next experiment.

I was lucky in this case that my valuable reminder wasn’t all that expensive -- the experiment didn’t cost much outside of a few hours of time and some toy cars and fabric dye. But I would have gotten a better bang for my buck if I had talked with Lou first. I’m curious to hear comments as to how you like to choose factors and levels when you experiment.

Who would have guessed that information you gather online isn’t always reliable?

Creating a JMP 12 Recode script from a Lookup Table containing original and new values

I recently used a JMP add-in I wrote to import my complete set of BodyMedia FIT food log data files, including data from Dec. 21, 2010, through the last day I logged my meals in that software on March 29, 2015. My final data table contained 39,942 rows of food item names. When combined with the 60 days (1,551 rows) of data from the MyFitnessPal food log I have been keeping since I switched devices from BodyMedia to a Fitbit Charge HR late last March, I have nearly 41,000 rows of food log data!

One essential step in my data preparation process for this data table has been to clean up the food item names by consolidating similar items under a single value. Without this cleanup step, I end up with lots of unique item names that really represent the same food item. For example, I ate a variety of different dark chocolates, and indicated the correct brand names where available, but often had to substitute as many specific items were not available in the BodyMedia database. This made exact names even less meaningful, so I felt it made sense to aggregate all varieties under a single name (“Candy Bar, Dark Chocolate”).

I cleaned up the food log data table that I had presented at Discovery Summit last fall using the Recode platform in an early adopter install of JMP 12. At the time, Recode lacked the new Save/Load Script options that I now rely on for all my data cleanup projects. Instead, I recoded my items and created a lookup table that listed the original and recoded names for each unique item that appeared in my food log. I updated this lookup table each time I added new items to my food log. A section of my lookup table is shown below.

Lookup table

Before I left to attend the QS15 conference in San Francisco last month, I took advantage of JMP formula columns to create a script that could be reloaded into the Recode dialog to recapture my item groupings so that I could update them. This made it simple to recode new food item names and save the updated script in case I needed to tweak my work in the future.

Since I found this script creation trick so useful, I thought I would share it. If you have used a similar lookup table approach in the past, you may be thinking it would be a lot of work to recreate your approach in Recode. Using this example, you too can transition from a lookup table to a JMP 12 Recode script that you can reload in the Recode dialog and update to accommodate new data. In the end, you can easily recreate a new version of your lookup table from your final data table containing the original and recoded values.

I needed to structure my script to be identical to a standard Recode script, so I opened one I had saved for a related project. To save a sample Recode script, you can create some groups in an open Recode dialog window, then click on the platform’s red triangle and choose Script > Save to File.

I opened the JSL file containing the Recode script, and observed that it began with a call to begin updating the table, and a set of match statements that paired the original item name with the recoded name.

Top of script

At the end of the script, it included the column name and end to the data update.

End of script

To fill in the list of paired original and recoded names that fell between the beginning and ending sections, I created a new formula column in my lookup table containing quoted versions of the original item name and the recoded item name.

Recode Script Syntax

I copied and pasted the formula column into a script window, and then added the correct statements and variable names at the top and bottom of the script.

When I reloaded it in the Recode dialog, my script created groupings like this:

Recode reloaded script2

I could then add to existing groups, make new ones or edit the representative group names. I used a similar approach to create a version of my script for recoding items into food groupings, since I had included food groups in my lookup table and wanted to group my new items.

One gotcha I encountered was that one food item name included a double quote character (“) that I had to replace so that it didn’t interfere with the quoted strings of the item names.
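
If your lookup table lives outside JMP, the formula-column step and the quote gotcha can be sketched in a few lines of Python. This is only an illustration: the pair layout (`"original", "recoded",`) is my assumption about what sits between the top and bottom of a saved Recode script, so check it against a script you have saved yourself. Apart from “Candy Bar, Dark Chocolate”, the item names are made up.

```python
def quote_item(text, replacement="'"):
    """Wrap text in double quotes, swapping any embedded double quote for a replacement."""
    return '"' + text.replace('"', replacement) + '"'


def recode_pairs(lookup):
    """Return one '"original", "recoded",' line per (original, recoded) row."""
    return [f"{quote_item(orig)}, {quote_item(new)}," for orig, new in lookup]


# Hypothetical lookup rows; the second original name contains a double quote.
lookup = [
    ("Dark Chocolate Bar, 70% Cacao", "Candy Bar, Dark Chocolate"),
    ('Pizza, 12" Cheese', "Pizza, Cheese"),
]

for line in recode_pairs(lookup):
    print(line)
```

Replacing the stray double quote before wrapping the names keeps every generated line at exactly two quoted strings.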

I hope you'll try out the JMP 12 Recode platform if you haven't already. Whether you are starting a new project or converting an old one to use the platform, I think you will be pleased!

How to stack data for a Oneway analysis

The data you want to import into JMP often requires some manipulation before it’s ready to be analyzed in JMP. Sometimes data is arranged so that a row contains information for multiple observations. To prepare your data for analysis, you must restructure it so that each row of the JMP data table contains information for a single observation. In this example, you will see how to restructure data by stacking and recoding a data table in JMP. After the data is stacked, we can perform a Oneway analysis.

The data used in this example is called Fill Weights.xlsx, which is located in the Samples/Import Data folder installed with JMP. This data represents the weights of cereal boxes produced on three production lines. The goal is to stack the data in order to compare the results of the production lines and see whether they are producing approximately the same mean fill weight. Ideally, the mean fill weight of each production line will be close to the target fill weight.

In Fill Weights.xlsx, the production lines are arranged in three sets of columns. In your JMP data table, you need to stack the data from the three production lines into a single set of columns. This way, each row represents the data for a single part.

The figure below shows the initial format of the data in Excel. The weights of cereal boxes are randomly sampled from three different production lines.

Figure 1.1 Excel Spreadsheet - Unstacked

The ID columns contain an identifier for each cereal box that was measured. The Line columns contain the weights (in ounces) for boxes sampled from the corresponding production line.

The target fill weight for the boxes is 12.5 ounces. Although you are interested in whether the three production lines meet the target, initially you want to see whether the three lines achieve the same mean fill weight. After the data is set up properly, you can conduct a Oneway analysis to test for differences among the mean fill weights.

Import the Data

To get started, first you must import the data. Select File > Open in JMP and select Fill Weights.xlsx from the Samples/Import Data folder.

In the Excel Import Wizard preview, row 1 contains information about the table, and row 2 is blank. The column header information starts on row 3. Also, rows 3 and 4 both contain column header information. Change the settings in the Excel Import Wizard so that the column headers start on row 3 and the number of rows with column headers is 2.

Here’s what the data table looks like once you’re finished editing in the Excel Import Wizard:

Figure 1.2 Imported Data Table

The data is placed in seven rows, and multiple IDs appear in each row. For each of the three lines, there is an ID and Weight column, giving a total of six columns.

Notice that the “Weights” part of the ID column name is unnecessary and misleading. You could rename the columns now, but it will be more efficient to rename the columns after you stack the data.

Stack the Data

Reshape the data so that each row in the JMP data table reflects only a single observation. This requires you to stack the cereal box IDs, the line identifiers and the weights into columns.

To do this, select Tables > Stack to place one observation in each row of a new data table. Because you are stacking two series, ID and Line, this is a multiple series stack. In the Stack window, select the Eliminate Missing Rows option to get rid of any rows with missing data. This is the completed Stack window:

Stack Window

The stacked data table contains columns labeled Data and Data 2. These columns contain the ID and Weight data. Delete the Label column since the entries were the column headings for the box IDs, which you don’t need in your table.

To make the data table more understandable, rename each column by double-clicking on the column header. In this example, the columns are renamed as follows:

New column headers

As mentioned previously, you can exclude the “Weights” part of the Line column to make the table more readable. Click the Line column header to select the column and select Cols > Recode.

Change the values in the New Values column to match those in the figure below.

Recode columns

After recoding and selecting Done > In Place, your new data table is now properly structured to analyze in JMP. Now, each row contains data for a single cereal box. The first column gives the box ID, the second gives the production line, and the third gives the weight of the box.

Completed stacked data table
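
If it helps to see the reshaping logic spelled out, here is a small Python sketch of the same idea: take rows that hold one box per production line and produce one row per box. The column names and weights below are hypothetical stand-ins, not the contents of Fill Weights.xlsx, and JMP's Stack command does all of this for you.

```python
def stack(rows, lines=("A", "B", "C")):
    """Turn wide rows (one ID/weight pair per line) into one row per box."""
    stacked = []
    for row in rows:
        for line in lines:
            box_id = row.get(f"ID {line}")
            weight = row.get(f"Weights {line}")
            if box_id is None or weight is None:
                continue  # skip empty cells, like the Eliminate Missing Rows option
            stacked.append({"ID": box_id, "Line": line, "Weight": weight})
    return stacked


# Hypothetical wide data: the second row has no box for Line C.
wide = [
    {"ID A": 1, "Weights A": 12.4, "ID B": 8, "Weights B": 12.7,
     "ID C": 15, "Weights C": 12.3},
    {"ID A": 2, "Weights A": 12.5, "ID B": 9, "Weights B": 12.6,
     "ID C": None, "Weights C": None},
]

for record in stack(wide):
    print(record)
```

Writing the line letter directly into the Line column plays the role of the Recode step, since the label no longer carries the extra “Weights” text.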

Conduct the Oneway Analysis

Now that your data is stacked, we can conduct a Oneway Analysis of Variance to test for differences in the mean fill weights among the three production lines.

To do this, select Analyze > Fit Y by X and assign Weight to Y, Response and Line to X, Factor. Once the plot is created, select Means/Anova from the red triangle menu.
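
For anyone curious about the arithmetic behind a one-way ANOVA, the textbook F ratio can be sketched in plain Python. This is not JMP's implementation, and the weights are hypothetical rather than the values in Fill Weights.xlsx.

```python
from statistics import mean


def oneway_f(groups):
    """Return (F ratio, df between, df within) for a list of groups of numbers."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical fill weights (ounces) for three production lines.
line_a = [12.4, 12.5, 12.6, 12.5]
line_b = [12.7, 12.6, 12.8, 12.7]
line_c = [12.3, 12.2, 12.4, 12.3]

print(oneway_f([line_a, line_b, line_c]))
```

A large F ratio means the line means differ by more than the box-to-box variation within each line would explain.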

The mean diamonds in the plot show 95% confidence intervals for the production line means. The points that fall outside the mean diamonds are not outliers. To see this, add box plots to the plot. From the red triangle menu, select Display Options > Box Plots.

Box plots

Notice all points fall within the box plot boundaries; therefore, they aren’t outliers.

Let’s look at the All Pairs, Tukey HSD comparison results. From the red triangle menu, select Compare Means > All Pairs, Tukey HSD. In the plot, click on the comparison circle for Line C. Here are the results:

Weight by line

In the Analysis of Variance report, the p-value of 0.0102 provides evidence that the means are not all equal. Compare the group means visually by examining the intersection of the comparison circles. The outside angle of intersection tells you whether the group means are significantly different. If the intersection angle is close to 90 degrees, you can verify whether the means are significantly different by clicking on the comparison circle to select it.

Groups that are different from the selected group appear as thick gray circles. Notice Line C is selected and appears red (in JMP default colors), and Line B appears as thick gray. This means Line B is not in the same group as Line C, so their means are significantly different: the mean for Line C differs from the mean for Line B at the 0.05 significance level. Lines A and B do not show a statistically significant difference.

In addition, the mean diamonds shown in the plot span 95% confidence intervals for the means. The numeric bounds for the 95% confidence intervals are given in the Means for Oneway ANOVA report. The plot indicates that the confidence intervals for Lines B and C do not contain the target fill weight of 12.5: Line B appears to overfill and Line C appears to underfill. For these two production lines, the underlying causes that result in off-target fill weights should be addressed. Perhaps equipment needs replacing, or maybe the lines need adjusting.

With data structured similarly to the data used in this example, whatever your case might be, stacking it for a Oneway analysis is a great way to compare your results.

Toy cars and DOE: The results

13 diecast cars of different colors

Find out what happens when my father and I use DOE to figure out a better way to dye toy cars. (Photo courtesy of Caroll Co)

Last time, I gave a Father’s Day tale of a father and son’s quest in dyeing toy cars. This time, I’ll share our results, but first remind you of the factors we studied:

  • Car: A/B/C/D
  • Dye type: Solid/liquid
  • Dye amount: low/high (2 Tbsp liquid/4 Tbsp liquid per half cup, or 1 tsp dry/2 tsp dry per half cup)
  • Length of time: 15 mins/30 mins
  • Dye color: red/blue/yellow
  • Vinegar: yes/no

The Response

When we were discussing this experiment, the best way to measure how “well” a car was dyed wasn’t obvious. One thought was to photograph the cars and measure the change on the RGB scale, but this would require a lot of work with photography, and we usually had some sense as to the goodness of the coloring just by looking at a car. With a subjective response, the cars can all be compared side by side so that each is judged relative to the others. In the end, we had three people give a forced ranking of the 17 cars, with 1 being the best and 17 the worst, where the basis of comparison was how close the car was to the advertised color. The final response was the average of these ranks.
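
The averaging step is simple enough to sketch in Python. The judges, car labels and ranks below are invented, and I use four cars instead of 17 to keep it short.

```python
from statistics import mean


def average_ranks(rankings):
    """rankings maps judge -> {car: rank}; returns {car: mean rank across judges}."""
    cars = next(iter(rankings.values())).keys()
    return {car: mean(ranks[car] for ranks in rankings.values()) for car in cars}


# Hypothetical forced rankings from three judges (1 = best).
rankings = {
    "judge 1": {"car 1": 1, "car 2": 2, "car 3": 3, "car 4": 4},
    "judge 2": {"car 1": 2, "car 2": 1, "car 3": 3, "car 4": 4},
    "judge 3": {"car 1": 1, "car 2": 3, "car 3": 2, "car 4": 4},
}

response = average_ranks(rankings)
for car, avg in sorted(response.items(), key=lambda item: item[1]):
    print(car, round(avg, 2))
```

Averaging across judges smooths out individual disagreements, though the response is still a rank, so a lower value means a better-looking car.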

The Cars

In setting up the analysis, it also became apparent that car should be a random effect instead of a fixed effect. If we dye cars in the future, we will most likely not be using the same castings used in this experiment, and we want to find a dyeing method that we can use in the future as new castings are released. When I go into the Fit Model platform, I can do this by selecting Attributes -> Random Effect with Car selected in the Model Effects list.

The Results

It turned out that it was enough to fit the main effects model for this data. I tried adding interactions, but nothing came up as warranting further investigation. The amount of dye and length of time didn’t seem to make much of a difference, but using the solid dye showed a noticeable improvement, and the different dye colors had varying effectiveness. In particular, we found that the blue dye was the most effective, followed by yellow, and that it was difficult to get good coloring with red.


Final Thoughts

My father and I were both surprised at the results, in that many others dyeing these cars have been moving toward the liquid form of the dye. Admittedly, this is in part due to convenience, as mixing the liquid form is much more forgiving with splatters during mixing (a word to the wise – if you try this at home, use lots of newspaper). I think there’s still room for improvement, so I’ve started to pick up some cars for a future experiment. Any readers have experience with their own dyeing of objects? Please leave me a comment and let me know your thoughts. Thanks for reading!

Seeing differently with Beau Lotto

After a photo of “the Dress” went viral on the Internet this past February, JMP developer John Ponte wrote an entertaining and informative blog post, What color is The Dress? JMP can tell you!

We shared this blog post with Beau Lotto, neuroscientist, human perception researcher, and Director of Change Lab at University College London, before his keynote speech at our first European Discovery Summit this past March.

The Dress is still getting attention. The July/August issue of Scientific American Mind features the article Unraveling “The Dress”, in which a popular color perception example created by Beau Lotto and Dale Purves is shown to illustrate the importance of context in color perception. An interactive version of that image is available on lottolab.org.

While there is no mention of The Dress in Beau Lotto’s Discovery Summit plenary speech (until the Q&A), you can see and hear about his fascinating research on perception and innovation if you tune in to the July 15 Analytically Speaking webcast. As a result of watching his talk, perhaps you will:

  • See science as play and a way of being.
  • Welcome uncertainty (everything interesting begins with doubt).
  • Learn that innovation has two parts….

His talk is highly enjoyable for anyone who is fascinated by the workings of our brains (as I am). If you can’t catch the premiere of this webcast, you can always view it from the archive once it's available there, usually by the following day.

How to create an axis break in JMP

I’ve been asked three times this year about how to make a graph in JMP with an axis break. Before I show how, I want to ask “Why?” The obvious answer to “Why?” is “to show items with very different values in one graph,” but that’s a little unsatisfying. I want to know why they need to be in one graph. The advantage of a graphical representation of data over a text representation is that we can judge values based on graphical properties like position, length and slope. However, once we break the scale, those properties are no longer as comparable. We effectively have two separate graphs after all – which is actually how we can make such views in JMP.

Related to my “Why?” inquiry, I’ve had a difficult time finding a compelling real-world example to illustrate an axis break, so I made some hypothetical data. Say we have timing values for a series of 100 runs of some process. Usually, the process takes a few seconds per run. But sometimes there’s a glitch, and it takes several minutes. Here’s the data on one graph (all on one y scale).


We can see where the glitches are, but we can’t see any of the variation in the normal non-glitch runs. Some would also object to the “wasted” space in the middle of the graph. However, those aren’t necessarily bad attributes. The non-glitch variation is lost because it’s insignificant compared to the glitch times, and the space works to show the difference. Nonetheless, if our audience already understands those features of the data, we can break the graph in two to show both subsets on more natural scales.


Now we can see that the non-glitch times are increasing on some curve. The “trick” in Graph Builder is to add the variable to be split to the graph twice in two different axis slots. Then we can adjust the axes independently, perhaps even making one of them a log axis. The Graph Spacing menu command adds the spacer between the graphs to emphasize the break. It’s easier to show than explain, so here’s an animated GIF of those steps.
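
The same "two graphs, two scales" idea can be sketched in Python, independent of any plotting tool. The timing values and the 60-second threshold below are hypothetical; the point is only that each subset gets its own axis range.

```python
def split_for_axis_break(values, threshold):
    """Split run times into (run number, value) points below and at/above the threshold."""
    points = list(enumerate(values, start=1))
    low = [(run, v) for run, v in points if v < threshold]
    high = [(run, v) for run, v in points if v >= threshold]
    return low, high


def axis_range(points, pad=0.05):
    """Return (min, max) for an axis that fits the points with a little padding."""
    ys = [v for _, v in points]
    span = (max(ys) - min(ys)) or 1.0
    return min(ys) - pad * span, max(ys) + pad * span


# Hypothetical run times in seconds: mostly a few seconds, with two glitches.
times = [2.1, 2.3, 240.0, 2.6, 2.8, 310.0, 3.1]
normal_runs, glitches = split_for_axis_break(times, threshold=60)

print("lower panel:", axis_range(normal_runs))
print("upper panel:", axis_range(glitches))
```

Because the two ranges do not overlap, positions in one panel cannot be compared directly with positions in the other, which is the trade-off discussed above.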


I skimmed a few journals looking for examples of broken axes. Here’s an example of a pattern I saw a few times for drug treatment studies where the short-term and long-term responses are both interesting. This graph is from Annals of Internal Medicine and shows two different groups’ responses to an HIV treatment.

HIV Treatment 1

Each side of the axis break uses different units of time, which fits perfectly with the idea that there are really two separate axes. One thing that bothers me about this graph, though, is the connection of the lines across the gap. Notice the difference in my JMP version:

HIV Treatment 2

With different x scales, the slopes should be different. That is, the change per week (slope on the left) should be flatter than the change per year (slope on the right) for the same transition. Fortunately, Graph Builder takes care of this for you, but it’s something to be aware of when you’re reading these kinds of graphs in the wild.
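
A quick bit of arithmetic shows why the slopes cannot match. This Python sketch assumes a hypothetical change of 10 units that takes one year; nothing here comes from the HIV study itself.

```python
WEEKS_PER_YEAR = 52


def slope(delta_y, delta_x):
    """Change in the response per unit of x."""
    return delta_y / delta_x


change = 10.0  # hypothetical change in the response over one year

per_year = slope(change, 1)               # x measured in years
per_week = slope(change, WEEKS_PER_YEAR)  # same change, x measured in weeks

print(per_year, round(per_week, 3))
```

The same transition is 52 times flatter when expressed per week, so a line that keeps one slope across the break is misleading on at least one side.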

The broken line from the HIV study is an example of how an axis break can distort the information encoded by the graphic element. A more serious distortion occurs when bar charts are split by a scale break since the bars can no longer do their job of representing values with length. I’m not even going to show a picture of that. Never use a scale break with a bar chart.

When making graphs with scale breaks, make sure each part works on its own, because perceptually they really are separate graphs.
