Top JMP add-ins of 2015

It's nearly the end of the year, and we are taking a look back at the activity in the JMP User Community. Last time, I shared the top content among Discussions posts. Today, we have a list of the most popular JMP add-ins, courtesy of community manager Stan Koprowski.


Never used a JMP add-in? An add-in extends JMP by adding new features, functions and applications. It's a JSL script that you can run from the JMP Add-Ins menu. You can share the add-in with other JMP users, and that's what we are looking at here -- those add-ins shared in the File Exchange in the User Community.

The most popular JMP add-ins, in order of number of downloads, are below. Maybe you will find that you need some of these!

Top 10 Most-Downloaded Add-Ins

  1. Full Factorial Repeated Measures ANOVA Add-In
  2. Statistical Dot Plots for JMP 10
  3. Venn Diagram
  4. Custom Map Creator
  5. Interactive Binning (V2)
  6. Text to Columns, Version 2
  7. Capability animation: standard deviation impact
  8. HTML5 Auto-Publishing
  9. Anderson-Darling Normality Test
  10. Method Comparison

While all of these add-ins were submitted by my current or former co-workers, I wanted to give some attention to the top add-ins submitted by those outside our company. A special thanks to you for contributing these add-ins to the User Community!

Top 10 Most-Downloaded Add-Ins by Non-Employees

  1. Add-In: Spatial Data Analysis
  2. JMP Addins for Frequency Analysis and Dual Seasonality Time Series Analysis
  3. ROC-Curve & partial Area Under the Curve Analysis
  4. Model Diagnostics and Transformations
  5. Scoping Design DoE JMP Add-In
  6. Window Manager
  7. Add-In to Explore the Implications of Measurement System Quality (ICC or Intra Class Correlation) and Process Capability (Cp)
  8. Icon Names Addin
  9. Univariate Binning using the Normal Mixtures Distribution
  10. Capability Analysis JMP Add-In

I look forward to seeing what new add-ins the JMP users share in 2016!


Top Discussions of 2015 in JMP User Community

The Discussions section of the JMP User Community is a busy place. But that isn't really a surprise because the User Community is the largest online gathering of JMP users worldwide.

I wanted to know what the most popular content in Discussions was this year, and Stan Koprowski, who is one of the User Community managers, ran some queries and found out for me.

Actually, he found out on two measures of popularity: most viewed and most liked. So I thought I'd share both lists with you. (You'll see some overlap between the two lists.) I'm hoping that at least some of the content on these lists will turn out to be helpful to you.

One of the things I found interesting as I looked at these particular Discussions was how much conversation the posts generated. For example, one post resulted in 20 follow-up questions and answers, and another had 17.

You'll notice that many of the posts on these lists have been marked "Answered" and some answers marked as "Helpful," which is great. However, many still have "Unanswered" status, although various answers have been provided. Feel free to add to the "Unanswered" posts and help close those out as "Answered."

Did your post make it on the lists? Or did you provide an answer to any of these?

Top 10 Most Liked Discussions in 2015

  1. What did you learn at Discovery Summit in San Diego?
  2. How to convert char to num
  3. Anyone know how to create divergent bar chart for Likert scale data?
  4. Control Chart & 3 Sigma
  5. How can modify the below 3 factors of 3 levels RSM script to 3 factors of 4 levels?
  6. Selecting cells with empty data
  7. Is there a way to automate/script the Winsor process of outlier filtering?
  8. Overlay Graphs
  9. How to convert character formats to date format
  10. (Four-way tie) Which Control Chart to use?
    Adding "Name contains" search bar to Column Dialog form in JSL
    DOE Vocabulary Clarification
    JMP formula to transfer format

Top 10 Most Viewed Discussions in 2015

  1. Looking for talent for the JMP Scripting Forum at JMP Discovery 2015
  2. Control Chart & 3 Sigma
  3. Wafer map
  4. Heat map, rows grouping
  5. How to convert column value labels to values?
  6. Update on JMP Scripting Forum at Discovery 2015
  7. Creating a Cumulative Summary Column for different rows
  8. New User Welcome Kit Discussion
  9. Using JMP to compare slopes and intercepts of multiple linear fits
  10. Which Control Chart to use?

Thanks for taking part, and I hope to see your posts and answers in the User Community next year!


New activities and videos to enhance your JMP skills

Since it debuted earlier this year, the New User Welcome Kit has become one of the most fun and efficient ways to learn how to use JMP. Now it’s been updated with six new hands-on activities that let you apply your skills within JMP itself. The new activities include:

  • Using Graph Builder With a Local Data Filter
  • Multivariate Analysis Using Fit Model Nominal Logistic
  • Time Series Analysis Using Control Chart Builder

So whether you’ve already tried the kit or not, whether you are a beginning or intermediate user, we think you will improve your JMP skills by completing the six new activities.

The updated Welcome Kit also features three new videos, including one focused on Global and Local Data Filters in the Analyze section. It also includes a handy glossary of JMP terms and platforms.

So check out the new activities at the end of the Welcome Kit (in Practice section 11) and take your JMP skills to the next level.


Analyzing a 4-factor definitive screening design with diecast cars data

When I created the four-factor definitive screening design discussed in my previous blog post, I was excited to try out the new technique that Bradley Jones presented at the JMP Discovery Summit. Looking at the dyed cars, I noticed some promising results and a wide array of colors.

The new technique involves fitting a main effects model and using the “fake” factors (factors used to construct the design but not actually varied; in this case there were two, since the design was based on a six-factor definitive screening design) to provide an estimate of the pure error. A more detailed analysis using this method will be saved for another day, since I had only one significant effect: heat setting. However, it seemed like there was too much noise based on my past experiments, and the vinegar that was so promising in the last experiment was no longer significant. I took a quick look with Graph Builder and wasn’t surprised that there was no main effect for vinegar:


This becomes even more pronounced when looking at the residuals (from fitting the heat setting and block) vs. vinegar:


Typically, when I’m fitting a model from a designed experiment, I follow the principle of effect heredity, which means I won’t add a second-order term unless its main effect component(s) are significant. However, that doesn’t seem to hold for these data. If you look at rating vs. heat setting and overlay vinegar, there also appears to be an interaction between heat setting and vinegar amount:


I did some further investigation with other effects, particularly time, because with the heat setting so significant (and suggesting high heat), time should be factored in. After all the models I fit, time was significant in the final model. I also kept the quadratic effects for time and vinegar, as they were marginally significant:


You’ll notice that time shows up as significant in the final model. The estimate itself doesn’t change from the main-effects-only model, since main effects are orthogonal to each other and to all second-order effects. However, the standard error is smaller because variation in rating that was previously attributed to error is now explained by the second-order terms in the model.

Confirming the experiment

Now that I had the final model, I needed to see how well it works. Using the Maximize Desirability option from the red triangle menu in the Profiler, it looks like 50% vinegar for 23 minutes on heat setting 3 is best.


It turns out I can dye multiple cars within a batch of liquid (hmm … that sounds like the makings of a split-plot design in the future). I used 50% vinegar with heat setting 3 and the lowest dye amount for three cars, taking them out at 10, 20 and 30 minutes. The thought was that 20 minutes should be ideal. From left to right, the cars here are undyed, 30 minutes, 20 minutes and 10 minutes:

I was expecting the 20- and 30-minute cars to look good, and I was happy with the results. In fact, I prefer the 20-minute car, as 30 minutes on high heat made the plastic in the car start to melt.

Final thoughts

While the principle of effect heredity is based on empirical studies, sometimes it doesn’t hold, and you have to start investigating whether you’ve missed some second-order effects. The definitive screening design worked out nicely in this case. That's because, unlike just using center points, not only could I detect the possibility of quadratic effects, but I could also estimate them.

If you compare the results for the red cars in the first experiment, it’s incredible to see the difference after two follow-up experiments. I have a much better sense of how to get other colors as well, especially with the Profiler. For example, if I want medium colors, I can use heat with no vinegar, while the lighter colors involve no heat with some vinegar. (If you missed any of these experiments, check out the whole series on dyeing diecast cars.)

I'm still looking for another definitive screening design to try the new analysis technique on. Any suggestions? Thanks for reading!


Visualizing holiday food log patterns

If you read my last post, then you know that I’m giving myself the gift of data this holiday season! For me, collecting data on my diet and fitness habits is a gift that just keeps on giving. Although I may not look at all my data sets on a daily basis, the information is there when I need to use it to help me understand my patterns.

Collecting food data isn’t just a holiday habit for me, however. I have been logging the foods I eat at each meal year-round for nearly five years, using BodyMedia’s app for most of that time before adopting MyFitnessPal earlier this year. I presented a poster at JMP Discovery 2014, which described how I imported my BodyMedia food log data into JMP by saving multiple PDFs as text, concatenating them, and using regular expressions to parse details into columns in a JMP table. MyFitnessPal saved me a few steps by producing report output that can be saved as a single text file, but I still needed to use regex parsing to create a usable JMP table.
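The regex step can be sketched in Python (the same logic applies whether you script it in JSL or elsewhere). The line format below is purely hypothetical, since the actual BodyMedia and MyFitnessPal report layouts differ:

```python
import re

# Hypothetical food-log line format for illustration only:
# "<meal>: <item>, <calories> cal"
log_text = """Breakfast: Oatmeal with raisins, 220 cal
Lunch: Turkey sandwich, 340 cal
Dinner: Pumpkin cheesecake, 450 cal"""

# Named groups become the column names of the resulting table
pattern = re.compile(r"^(?P<meal>\w+):\s+(?P<item>.+),\s+(?P<calories>\d+) cal$")

rows = [m.groupdict() for m in map(pattern.match, log_text.splitlines()) if m]
for row in rows:
    row["calories"] = int(row["calories"])  # numeric column for analysis

print(rows[0])  # {'meal': 'Breakfast', 'item': 'Oatmeal with raisins', 'calories': 220}
```

Each matched line becomes one row, and each named group becomes one column, which is exactly the shape a JMP data table needs.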

I processed the text derived from my BodyMedia and MyFitnessPal food logs using parallel but separate processes that converged at the very end in a single unified table.


The big challenge: Combining data from two sources

The major challenge I recently solved was unifying these two food log data tables, and my experience solving this is relevant to anyone who has to combine data from different sources.

Minor differences like column names and format types were easy to adjust, and it didn't matter that each table contained a few unique columns -- those simply came through with missing values for rows that didn't have them. The main problem was that so few item names were the same in both databases that I had to clean up the two tables using parallel but separate processes.

I already knew that my BodyMedia food log contained many instances of redundant names that referred to essentially the same food item, and this made it much harder for me to aggregate data across those items to look for patterns. I used the JMP 12 Recode platform to group similar item names into a set of cleaned item names, and described this process in an earlier blog post.

MyFitnessPal includes even more brand names and user-supplied items than BodyMedia, further magnifying this naming redundancy problem. When I switched to using MyFitnessPal to log my foods, I had tried to reuse my BodyMedia script to recode my MyFitnessPal item names. But with so few shared item names between the databases, it made more sense to create a completely separate Recode script for the MyFitnessPal project.

Consolidating food names

Now, my desire to combine my two data tables to examine my holiday eating patterns became a good motivation for discovering the most efficient way to take the cleaned item names from both tables and combine them into a single, consolidated set of item names. Of course, at the end of the exercise, I wanted to be able to save a Recode script that referred back to the original item names so I could continue to update that script as I added new items to my MyFitnessPal log.

After tweaking column names and formats for consistency, I concatenated the tables, chose the Cleaned Item Name column, and opened the Recode platform. Since I had used similar patterns to rename items in both tables, many items showed up near one another. I worked through the rest of the items, grouping similar values and making heavy use of the Recode filter field to find subgroups of related items. I recoded the full set of cleaned items names into a new column I called Consolidated Item Name.

I consolidated similar item names from my two food log data tables.


To save a Recode script that I could reload into the dialog for use with new MyFitnessPal data, I used a formula column trick I shared in a previous blog post. It's a trick I discovered when I needed to create a Recode script “after the fact” to capture work I did in an early version of the JMP 12 platform before it supported saving and reloading a script into the dialog. Now, I used the same formula column approach (which makes use of the character Concat function) to create the "guts" for a Recode script to relate the original database item name to its final, consolidated name. After adding the appropriate header and footer to the script, I can reload my original item names and consolidated names into the Recode dialog and update the list when I import new food item names from MyFitnessPal.
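Outside of JMP, the consolidation step amounts to applying a lookup from each cleaned name to its consolidated name, and the Concat-formula trick amounts to gluing those pairs into script text. Here is a minimal Python sketch with hypothetical item names (the real mapping lives in the Recode dialog):

```python
# Hypothetical original -> consolidated item names, for illustration only
recode_map = {
    "Cheesecake, plain": "Cheesecake",
    "Plain cheesecake slice": "Cheesecake",
    "Pumpkin cheesecake": "Pumpkin Cheesecake",
}

# Applying the recode: unknown names pass through unchanged
items = ["Cheesecake, plain", "Pumpkin cheesecake", "Plain cheesecake slice"]
consolidated = [recode_map.get(name, name) for name in items]
print(consolidated)  # ['Cheesecake', 'Pumpkin Cheesecake', 'Cheesecake']

# The "guts" of a Recode script: one mapping line per original name,
# built by string concatenation (the same idea as the Concat formula trick)
script_lines = ['"' + old + '" = "' + new + '",' for old, new in recode_map.items()]
print(script_lines[0])
```

New item names imported later simply become new entries in the mapping, which is why keeping the script reloadable pays off.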

Assigning higher-level categories

With my cleaned and complete data table in hand, I was now able to update a second data cleaning script I maintain to assign my food items into primary food categories. I have found that placing individual items into higher-level categories can be quite useful for comparing related item sets over time. I created my own classification system using groupings that made sense to me.

For example, I have a food category called ChocolateCandy, which includes items containing dark, milk and white chocolate. I rarely eat non-chocolate candy, so a more generic category name like Candy didn't work for me. When looking at major trends over time, I'm not going to split hairs about whether it was a milk chocolate or dark chocolate truffle that I ate, but I'd like to be able to determine what proportion of my treat calories belongs to the whole category.

For this blog post, I created a subset table containing just meals from a handful of selected holidays from the past five years. I used Recode to aggregate those chosen dates into a new category column called Holiday, and added a Value Ordering column property to it to lock these holidays in sequential order: New Year’s Day, my birthday, Thanksgiving, Christmas Eve, Christmas Day and New Year’s Eve.

This Graph Builder heat map shows the number of calories I logged on each holiday, which varied pretty widely. When I was actively losing weight post-baby, I ate less, and in maintenance, I tend to eat more.

This Graph Builder heat map shows that the amount of calories I ate on holiday varied, ranging from a low of 1396 (light yellow) to a high of 3580 (dark red). Days with no food log data show as white squares.


Previously, I shared treemap visualizations of my food log data aggregated over days, weeks, months or years, but I find treemaps to also be useful for visualizing patterns in daily data in a compact way. I created a treemap in Graph Builder to show the calorie breakdown for each holiday meal during the four years where I had complete data for all six holidays.

All of this data was collected in the BodyMedia app, which provides six potential slots for logging foods: Breakfast, AM Snack, Lunch, PM Snack, Dinner and Late Snack. I used Year as the X grouping variable and Holiday as the Y grouping variable, and colored squares by meal. Since I assigned Calories as a Y variable, each holiday's section of the treemap is sized in proportion to the total calories I ate; the largest section corresponds to the highest-calorie holiday, which you can see was Christmas 2012.

Meals vs. total daily calories

I usually log foods into meal slots by the time of day when I ate them. For example, a late breakfast or lunch would likely be logged into a snack slot. You can see this effect in the timing of some of my holiday meals. On a non-holiday, I usually log breakfast, lunch and dinner as larger meals with smaller snacks in the morning and afternoon. On major holidays, the big meal of the day sometimes falls in the afternoon snack timeframe (for example, Christmas 2012, New Year’s Eve 2011 and 2014). In those cases, I usually logged a rather substantial dessert in the dinner time slot!

Calories logged by meal depended on the timing of the big holiday meal.


Given this variability in meal timing, I actually found it more helpful to remove the meal grouping variable and look at total daily calories grouped by food category. Grouping similar items gave me a better feel for the types of foods that tend to dominate my holiday eating. It is not at all surprising that these include lots of desserts, chocolate and caloric drinks. Holidays are definitely all about social eating for me and my family!

Desserts and treats definitely dominate my holiday eating habits!


Those special holiday treats

Like many people, I look forward to certain special holiday foods. I thought it would be interesting to look across my whole data set and contrast the patterns of holiday-only items with treats I eat more regularly. For example, cheesecake is one of my favorite desserts, and I eat plain cheesecake throughout the year. The bar graph below shows my monthly calorie totals, filtered to include a select few food items. In contrast, pumpkin cheesecake is a seasonal treat, usually restricted to the holiday season --  that is, unless I happen to stash an extra one in the freezer like I did last year! Also, a dish called Cranberry Delight that I make only on holidays usually only appears in my food log between Thanksgiving and Christmas, although it did make an appearance at Easter dinner in 2013.

Pumpkin cheesecake and cranberry delight are rare holiday season treats compared to plain cheesecake!


So what was the biggest challenge of this blog post? You might think that it was the data collection, the recoding challenges or tweaking my visualizations to show my holiday eating patterns in the best possible light, but all those guesses are wrong. Actually, the toughest challenge was leaving the many JMP 13 updates behind and returning to JMP 12.2 to create my graphs! In fact, I blame the ongoing development of JMP 13 for making me a much less productive blogger in the second half of this year. I’ve been using development versions of JMP 13 for nearly a year now for all my work and personal projects. I am literally biting my tongue right now wanting to tell you about all the amazing new features that are coming! But you’ll just have to be patient until JMP 13 launches to hear about the new and improved additions that I can’t live without. In the meantime, happy holidays!


5 more things you don't know about JMP

Previously, I shared a list of things you probably didn't know about JMP. Maybe you already knew some of them. Maybe you learned a few new things; I hope you did.

Well, now here are five more things you probably don't know about JMP.

1. Row State columns

There are two well-known data types for JMP columns: Numeric and Character. There's an oft-ignored third data type that's been in JMP since the beginning: Row State. This data type stores color, marker and other row state information for each row.

So, if you spend time coloring, marking, selecting, excluding rows and want to make sure it's saved, create a Row State column to hold on to it. Then you can color, mark, exclude, etc., all you want, and you can always get back to the stored states.

Check out this video on this subject:

2. Date/Time/Datetime columns should be Numeric

Computers generally store dates as a number of time units from an epoch. JMP stores date and datetime values as the number of seconds since 12:00 AM UTC, January 1, 1904. This is important to consider when you do math using the columns. For example, if you subtract one date value from another, the result will be the number of seconds between them.

Admittedly, the number of seconds between two dates can't be interpreted intuitively, so you'll need to divide the result by the number of seconds in the time unit you're more interested in. As an example, you divide by 86,400 seconds – the number of seconds in 24 hours – to get the number of days. JMP has a number of functions to give you these divisors: In Minutes(), In Hours(), In Days(), In Weeks() and In Years(). The numeric argument specifies the number of time units named by the function.

The Date Increment() and Date Difference() functions will do this extra bit of math for you automatically if you want these simple calculations.
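Here is a minimal Python sketch of the same epoch arithmetic. The 1904 epoch matches the one the post describes, but `to_jmp_seconds` is an illustrative helper, not a JMP function:

```python
from datetime import datetime

EPOCH = datetime(1904, 1, 1)  # midnight, January 1, 1904

def to_jmp_seconds(dt):
    """Represent a datetime as seconds since the 1904 epoch."""
    return (dt - EPOCH).total_seconds()

start = to_jmp_seconds(datetime(2015, 12, 1))
end = to_jmp_seconds(datetime(2015, 12, 25))

diff_seconds = end - start          # subtracting two dates gives seconds
diff_days = diff_seconds / 86400    # 86,400 seconds in 24 hours
diff_weeks = diff_seconds / (86400 * 7)

print(diff_days)   # 24.0
print(diff_weeks)
```

The divide-by-86,400 step is exactly what the In Days() style of function saves you from writing by hand.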

If you have date or time values stored as character strings, these two videos will show you how to convert them to numeric values:


3. Value Ordering property

JMP normally orders categorical values alphabetically. So, Before, During and After will come out as After, Before, During. If you want your values in a different order, you can use the Value Ordering column property.

With this property, you can specify the order you want values to appear on axes and in reports.

Right-click at the top of a column and choose Column Properties -> Value Ordering. Then, arrange the values in the order you'd like them to appear.

4. Copy/Paste into most dialogs

Sometimes filling in fields in a dialog in JMP can be a pain. For example, the Value Labels property requires two field entries and a click for every value/label pair.

Consider a table of stock ticker symbols that you'd like to label with company names.


For each ticker symbol, you need to enter the Value and Label, and then click Add.

It's easier to get the list of symbols and company names in a data table, and then copy and paste the values.


Just make sure you've got a tab-separated list on your clipboard. Similarly, you can paste column names into the launch dialog in JMP.

5. Standardize Attributes

As data tables get larger, with more and more columns, you'll appreciate a way to change the data types, modeling types and properties for lots of columns all at once. That's what Cols -> Standardize Attributes… does.

Just select all the columns that need to be changed and then launch Standardize Attributes….


This makes it easy to change data that may have been imported as Character columns to Numeric. Easy peasy!

Bonus: Don't miss the Recode option at the top of the dialog. Clean up a bunch of columns all at once!

Editor's Note: A version of this post first appeared in the JMP User Community (where you can see four more things about JMP that you probably didn't know).


New book gift ideas for analytical friends and family


Need holiday gift ideas for the quantitative-minded people in your life -- or need a good read for yourself?

Having polled some of my erudite friends and colleagues as well as having finished some noteworthy books these last several months, I developed a book list that may help with your holiday shopping (even if it’s for yourself).

In preparing this list, I thought it might be useful to include relevant links to other content you may find worthwhile — related to the book and/or the author. In addition, you may want to consult two previous posts that have more recommendations, one from last year and one from 2011. Also, the recommended reading page of the Analytically Speaking webcast series (now in its fourth year) is updated regularly. Several subject matter experts featured in this webcast series — many of whom are authors themselves — also share books they recommend.

For the generally curious:

Here's what's on my ebook reader.


For those interested in Statistics and Visualization:


For those interested in science, nature and medicine:


And there are many more coming out early next year that promise to be good reads — a few we can anticipate: Truth or Truthiness: Distinguishing Fact from Fiction by Learning to Think Like a Data Scientist, by Howard Wainer, and The Age of Em: Work, Love and Life when Robots Rule the Earth, by Robin Hanson.

If you have more recommendations, please add them in the comments section. Happy reading!


Language options in JMP User Community

Real-time translation of content in the online JMP User Community is now available in multiple languages: English, French, German, Italian, Japanese, Korean, Simplified Chinese and Spanish.

For example, you can post to a Discussion in Japanese, and another user who speaks Spanish can read that post and respond in Spanish. You will view the response to your post in Japanese.

Please note that the content in the User Community displays in the language that you have set for your Internet browser. To see content in a specific language, you will need to change your browser setting to display content in that language and refresh the browser. If you use multiple browsers (such as Chrome or Firefox) to take part in the User Community, you must change the setting in each browser.

Your current translation is shown in the User Community on the right side of the top navigation bar (see image at right).

We hope this new translation feature will mean that more people can participate more fully in the community. For more information on language support, read the announcement in the User Community.


A 4-factor definitive screening design as a response surface design alternative

I’ve been a huge fan of definitive screening designs from the moment I first read about them. I’ve also been excited by Bradley Jones' new approach for analyzing definitive screening designs.

As my second experiment with dyeing toy cars had promising results, it seemed like the right time to better explore the factors I had narrowed down – and a definitive screening design is a great way to do it.

For the purposes of rating the dyeing, I decided to stick with one color this time. With multiple colors, it was sometimes difficult to determine when one color was better than the other. I was ambitious with this experiment – the color used for dyeing was red, which was the problematic color in the first experiment. I knew that I wanted to use additional heat in this experiment based on the previous results, so the factors I was interested in were as follows:

  • Amount of vinegar: 0%-50%
  • Dye amount: 1 tsp – 2 tsp per cup of liquid
  • Time in liquid: 10 minutes – 30 minutes
  • Heat setting: 1 – 3, based on the ticks on the knob for the burner on the stove

There’s certainly the possibility of active quadratic terms and interactions in this design space.

Why use a definitive screening design?

  • The 13-run default definitive screening design for four factors is based on six factors – those two unused “fake” factors can give an estimate of the pure error assuming third-order and higher interactions are negligible.
  • The smallest design from the response surface design platform is 27 runs, more than doubling the amount of resources I need.
  • Fitting the full RSM model in Custom Design will require 15 runs, and the model terms will have some non-zero correlations among them, which will affect model selection. I still want to find the most important effects and not fit the full RSM model, so my preference is to use the orthogonality of effects that comes from the definitive screening design.

What orthogonality of effects?

If you’re still new to definitive screening designs, I highly recommend that you start with Bradley Jones' first blog entry and original paper, but to summarize what I get from the 13-run definitive screening design:

  • Main effects are orthogonal to each other.
  • Main effects are orthogonal to two-factor interactions and quadratic effects.
  • The model including all main effects and quadratic effects is estimable.
  • None of the second-order effects are fully confounded.
  • For the 13-run design, I can fit the full RSM model for any three factors.
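These properties can be checked numerically. Below is a Python sketch (not JSL, and not JMP's own generator) that builds a 13-run, six-factor definitive screening design from one standard conference-matrix construction (the Paley construction over GF(5)) and verifies the first two bullets; JMP may use a different but equivalent matrix:

```python
# Quadratic character mod 5 (the residues are 1 and 4)
def chi(a):
    a %= 5
    if a == 0:
        return 0
    return 1 if a in (1, 4) else -1

# Order-6 conference matrix: a border of ones around the Jacobsthal matrix
n = 6
C = [[0] * n for _ in range(n)]
for j in range(1, n):
    C[0][j] = C[j][0] = 1
for i in range(1, n):
    for j in range(1, n):
        C[i][j] = chi(j - i)

# DSD: each conference-matrix row paired with its fold-over, plus a center run
X = C + [[-x for x in row] for row in C] + [[0] * n]  # 13 runs, 6 factors

dot = lambda u, v: sum(a * b for a, b in zip(u, v))
col = lambda k: [row[k] for row in X]

# Main effects are orthogonal to each other...
for i in range(n):
    for j in range(i + 1, n):
        assert dot(col(i), col(j)) == 0
# ...and to every quadratic effect
for i in range(n):
    for j in range(n):
        assert dot(col(i), [x * x for x in col(j)]) == 0
print("13-run DSD: main effects orthogonal to main and quadratic effects")
```

The fold-over pairing is what makes the quadratic check succeed identically: every product of a main-effect column with a squared column cancels run by run.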

And now to create the design...

Once I had decided on using a definitive screening design, creating the design was simple. I went to DOE->Definitive Screening Design and entered my factors.

One last thing: I had two different cars to use that should be treated as blocks. Fortunately, the definitive screening design platform allows for the addition of blocks that are orthogonal to the main effects.

My final design setup looked like this:


Next time

I'll share my results with you in the next blog post, but I will say that it is possible to get the red to stick.

Thanks for reading!


A year’s worth of key takeaways from analytics experts

Last year, we gathered into a single video the highlights of almost three years' worth of interviews with analytics experts who had appeared in the Analytically Speaking webcast series. That "best of" episode was so well-received that we decided to do it again.

After reflecting on what our accomplished guests shared this past year, we noted some important themes. We’ve curated video clips for you on these topics:

  • Collaboration and what it takes to be an effective applied analyst
  • Innovation, seeing things differently and staying in flow
  • Ongoing education for a more numerate workforce
  • Cultivating an analytical culture
  • Communicating results and including stakeholders in the analytic journey

You will hear some engaging stories from their many and varied experiences — from academicians to engineers to rocket scientists. The guests all touch on the importance of curiosity, innovation, persuading skeptics to embrace analytics, the value of design of experiments and more. They are inspiring in their knowledge, wisdom and passion. We hope you will tune in to watch the premiere of this episode on Dec. 16.

There were so many valuable insights from these experts that it was difficult to leave some on the "cutting-room floor." If you want to hear more, you can watch the full previous episodes on demand at your convenience.
