Using JSL to import BodyMedia® FIT® activity monitor data into JMP

In an earlier blog post, I introduced the topic of my JMP Discovery Summit 2014 e-poster titled “Analysis of Personal Diet and Fitness Data With JMP” and shared my interests in quantified self (QS) analysis projects. For my poster project, I exported two different types of data files from the web-based Activity Manager software by BodyMedia® and wrote JMP Scripting Language (JSL) scripts to import, format and clean my data for further analysis and visualization in JMP. I hope you were able to join me to hear more at the conference in Cary, but if not, you can sign in to the JMP User Community and check out a PDF of my e-poster in the Discovery Summit 2014 section. (Membership in the User Community is free and is a great way to learn from JMP users around the world!)

Today, I’ll share how I used JSL to import and combine a set of multi-worksheet Excel files containing activity and calorie summary information. The Activity Manager software also exports PDF Food Log files, which I’ll cover in my next blog post. By the way, I have uploaded an add-in to the JMP File Exchange that you can use to import your own BodyMedia® Activity Summary and Food Log files into JMP. It also includes a bonus add-in that imports CSV files that you can download from the popular MyFitnessPal food logging website using a Chrome extension.

In early August, I exported nearly 50 Activity Summary and 50 Food Log files covering the time period from 12/20/10 to 7/28/14. Activity Summary data is saved in Excel workbooks that contain six different worksheets, but I was interested in importing data from just the first five: Summary, Activity, Meals, Sleep and Weight. You can see the details of the contents of the other worksheets in my e-poster.

Excel workbook

Most of my Activity Summary Excel files contained four weeks of data, although some covered fewer weeks. All files had column headers on line 5 and most had 28 lines of data on the first four tabs, though the number of rows on the fifth tab (Weight) varied. Before 2013, I entered weight measurements manually into the Activity Manager. In January 2013, I began to use a Withings Smart Body Analyzer scale, which uploads weight measurements wirelessly into its own web-based software and shares them automatically via a third-party integration with the Activity Manager.

I decided to script the import process after interactively importing my data files twice, about a year apart. Interactive import was more time-consuming the second time because I had even more files to work with. Since I accumulate new data every day, my number of files will always grow over time. After the second import, I formatted my columns and added formulas to the combined data table, only to compare it to the table from my first import and realize that I had forgotten several important steps.

While I backtracked to complete the steps I'd missed, I began to consider how scripting could make reprocessing all my files much simpler and faster. If the import process was easier, I would look at new data much more often. Fortunately, I found that even a novice scripter like me could write the JSL to import and combine my files.

I started with Source scripts (automatically added to JMP data tables during interactive import) and JSL examples from online documentation and discussion forums. I looped over all my files using a strategy patterned after a text file import example in a SESUG paper written by JMP developer Michael Hecht. As the longtime Mac expert at JMP, Michael is well-known for his high-quality and elegant code in all languages, so why not borrow from a master?
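To give a flavor of the approach, here is a simplified JSL sketch of that import loop. The folder path, worksheet names and worksheet-settings arguments are illustrative rather than the exact code from my add-in; the Source script that JMP attaches to a table after an interactive Excel import shows the precise option names for your version of JMP.

```
// Loop over a folder of Activity Summary workbooks and stack them
// into one table. Paths, sheet names and option names are illustrative.
folder = "~/BodyMedia Exports/";
files = Files In Directory( folder );
combined = Empty();

For( i = 1, i <= N Items( files ), i++,
	If( Ends With( files[i], ".xls" ) | Ends With( files[i], ".xlsx" ),
		dt = Open( folder || files[i],
			Worksheets( "Summary" ),  // repeat for Activity, Meals, Sleep, Weight
			Worksheet Settings( 1,
				Has Column Headers( 1 ),
				Headers Start on Row( 5 ),  // column headers are on line 5
				Data Starts on Row( 6 )
			)
		);
		If( Is Empty( combined ),
			combined = dt,  // first file becomes the base table
			combined << Concatenate( dt, Append to first table );
			Close( dt, No Save );
		);
	)
);
```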

I ran into some snags while working to combine my data into a single unified table. One that I had to handle prior to import was that the Date column on the Sleep tab contained an overnight range rather than a single day like all the other worksheets did, preventing me from merging it directly without a preprocessing step. For example, a value of 12/20/2012-12/21/2012 indicated the night of sleep that started on 12/20/12 and ended the morning of 12/21/12. I parsed out the second day’s date using the JMP Formula Editor and the JSL Word() character function.

Formula editor Date

After creating the formula interactively, I added code to my script to make these steps easily repeatable with help from the JSL Get Script command. Running MyColName << Get Script; on the Date column from the Sleep tab printed a key snippet of JSL to JMP’s log, which I added to my script to automate this step. In my final table, the hr:m value displayed in the Sleep column represented how much I had slept before awakening that morning.
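For anyone recreating this step, the resulting formula amounts to something like the snippet below. The table and column names are assumptions based on my files: Word() splits the text on the hyphen, and Informat() converts the second piece into a JMP date value.

```
// On the Sleep tab, keep only the wake-up date from values like
// "12/20/2012-12/21/2012". Assumes the Date column imported as character.
dtSleep = Current Data Table();
dtSleep << New Column( "Wake Date",
	Numeric,
	Continuous,
	Format( "m/d/y" ),
	Formula( Informat( Word( 2, :Date, "-" ), "m/d/y" ) )
);
```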

There was extra information in my input files that I didn’t want in my final JMP table. Calorie totals, averages and targets were summarized at the bottom of all of my worksheets (lines 36-39 in the picture of the table above). I added steps to my script to filter out these summary lines, the information above the column headers, and blank rows. In total, the version of my activity data table that I used for my Discovery Summit poster contained 1,316 rows of daily data on calories eaten and burned, activity measures, sleep and weight measurements.
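The filtering logic can be as simple as selecting and deleting every row that lacks a valid date, since the preamble lines, blank rows and bottom-of-sheet summaries all lack one. This is a sketch of the idea rather than my add-in's exact code, and the right test depends on how your Date column imported:

```
// Remove non-data rows: the preamble above the headers, blank rows, and
// the totals/averages/targets summarized at the bottom of each worksheet.
dt = Current Data Table();
dt << Select Where( Is Missing( :Date ) );
dt << Delete Rows();
```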

My data table wasn’t quite analysis-ready yet. The numeric columns containing durations and percentages were not formatted correctly on import, so I added formats, missing value codes and modeling types as column properties interactively. I used the Transform column feature on my Date column to quickly add new Date Time variables like Year, Month, Week and Day to my table, and then extended my script to automate those steps. I also added a new formula column to the table (Calories Burned-Calories Consumed) to represent my caloric deficit/excess for a given day.
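Those last additions look roughly like this in JSL. The column names are assumptions based on my files, and the Year() and Month() formulas mirror what the interactive Transform feature generates:

```
// Caloric deficit/excess for each day: positive means a deficit.
dt = Current Data Table();
dt << New Column( "Caloric Balance",
	Numeric,
	Continuous,
	Formula( :Name( "Calories Burned" ) - :Name( "Calories Consumed" ) )
);

// Calendar variables derived from the Date column.
dt << New Column( "Year", Numeric, Ordinal, Formula( Year( :Date ) ) );
dt << New Column( "Month", Numeric, Ordinal, Formula( Month( :Date ) ) );
```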

If you have BodyMedia® Activity Summary files saved in Excel format, you can download my add-in from the File Exchange to perform a point-and-click import of your own files into JMP. This add-in supports the Activity Summary file type I just described, BodyMedia® food log files saved as text, and also food log text files that you can export from the popular (and free) MyFitnessPal with the help of a Chrome extension.

Special thanks to JMP testing manager Audrey Shull and technical writer Melanie Drake for scripting suggestions and add-in testing help! Stay tuned for my next blog post, where I’ll describe how I automated the import of my BodyMedia® food log files using JSL.

Webcasts show how to build better statistical models

We have two upcoming webcasts on Building Better Models presented at times convenient for a UK audience:

These webcasts will help you understand techniques for predictive modelling. Today’s data-driven organisations find that they need a range of modelling techniques, such as bootstrap forest (a random-forest technique) and partial least squares (PLS), both of which are particularly suitable for variable reduction for numeric data with many correlated variables. For example, some organisations deal with a multitude of potential predictors of a response, sometimes numbering into the thousands. Bootstrap forest and PLS can help analysts separate the signal from the noise, and find the handful of important variables.

Other organisations deal with the problem of customer segmentation. They may need to employ techniques including cluster analysis, decision trees and principal component analysis (PCA). Decision trees are particularly good for variable selection. Using a variety of modelling techniques can result in a different selection of variables, which can provide useful insight into the hidden drivers of behaviour.

Consumer data is notoriously messy, with missing values, outliers and in some cases variables that are correlated. Missing values can be a real problem because the common regression techniques exclude incomplete rows when building the models. This "missingness" itself can be meaningful, so using informative missing techniques to understand its importance can help you create better models. Some techniques, such as bootstrap forest and generalised regression, handle messy data seamlessly.

A critical step in building better models is to use holdback techniques to build models that give good predictions for new data, as well as describe the data used to build the models. Holding back data to validate models helps to keep the model honest by avoiding overfitting and creating a more accurate model.

Analysts face a major hurdle in explaining their models to executives in a way that enables them to do "what if" or scenario analysis, thereby exploring decisions before committing to them. A powerful way to do this is by dynamically profiling the models. Once companies have selected the best model, they often want to deploy the models to score existing and new data so that different departments can take appropriate actions.

I hope you can join us for one of these live presentations where we will demonstrate how to use these predictive modelling techniques using case studies.

Getting started with risk-based monitoring

Our own Richard Zink has written extensively about the risk-based monitoring (RBM) capabilities in JMP Clinical, both on this blog and, of course, in his book.

Risk-based monitoring diagram

A risk-based monitoring process feeds data from study sites into a dashboard, which then alerts the sponsor to situations that need further investigation.

As a complement to the wealth of hands-on information that Richard has created, which primarily covers the mechanics of RBM with JMP software, we decided to publish a brief article on getting started with risk-based monitoring.

The article covers several aspects of RBM, including:

  • A basic definition.
  • Why a risk-based approach is better (hint: it's a lot cheaper).
  • Details about how the monitoring process might work in this new model.
  • An overview of the risk dashboard, a key piece of an RBM platform.
  • How to approach the transition to RBM and dealing with organizational change.

If nothing else, we hope the article facilitates some conversations in your organization as you transition to the new world of risk-based trial monitoring.

Read the guide to getting started with risk-based monitoring.

Take Coursera MOOC "Teaching Statistical Thinking"

Attention teachers of statistics: Three exceptional professors at Duke University have just launched the first installment of a Coursera MOOC on Teaching Statistics: "Teaching Statistical Thinking: Part 1 Descriptive Statistics." JMP is the featured software of this course and is used in their analysis modules.

This course is designed with high school teachers in mind, but we think it is useful for any teacher of descriptive statistics. The course is organized around "core principle videos" that discuss the statistical content, along with additional videos discussing resources, pedagogy, and the analysis of data using JMP. Here is the basic outline of the course (from the course syllabus):

This class is taught in three units over five weeks:

  • Getting started with data (1 week)
  • Single variable graphics and number summaries (1 week)
  • Graphics and number summaries describing the relationship between two variables (2 weeks)
  • Review (1 week)

For more information, or to join the course for free (it's not too late to join!), please visit the course site.

A simple designed experiment with multimillion-dollar results

On Saturday (or Sept. 34), we marked the 25th birthday of JMP, a product I have been using since version 2. Until 2006, I was a JMP customer, even attending the first Discovery Summit conference for JMP users back in 1996. This birthday has made me nostalgic, and I wanted to share a story from my years as a JMP customer working in the chemical industry.

In 1991 at my previous employer, I completed one of my first designed experiments using JMP. That experiment was the first of many major breakthroughs enabled by using JMP and the DOE approach to scientific experimentation.

JMP Distributions

The Distributions (rerun using JMP 11) from the first DOE I did using JMP in 1991. Note that the highest-yield product was obtained when the dibromo was high, which is nonintuitive.

Our research laboratory had identified a new-generation compound. This new material would provide substantial advantages over our previous material and would give us a significant edge over one of our key competitors.

The synthesis of this compound required the use of a brominated intermediate. Although this intermediate would provide the desired cycle time in the subsequent chemical step, it was not easy to isolate using any of the conventional manufacturing isolation techniques. In addition, even when we were able to isolate it, the purified brominated intermediate produced an undesirable dimer impurity in the subsequent chemical step. This impurity caused our customer to have major mechanical difficulties in their process. To avoid causing these problems for our customer, we had to incorporate an additional purification step to produce a dimer-free product. The purification was effective, but it required the use of a solvent that was not friendly to the environment.

We decided to set up a DOE investigation where the brominated intermediate would be carried on in the process “as is” without an isolation step. The crude intermediate reaction mixture would be subjected to the second step without prior isolation and purification. This would overcome the isolation difficulties of the brominated intermediate, afford cycle-time and environmental solvent savings, and generate knowledge about the sensitivity of the process conversion to the investigated inputs.

We explored three factors in this non-isolation design. They were residual water and amount of brominating agent in the first step, and the reactant amount in the second step of the conversion.  This seems like a simple design in retrospect, but let’s see the ramifications of this “simple” design. The results from this investigation were both unexpected and nonintuitive. As it turns out, we obtained the best overall conversion of the final product when the intermediate was at a suboptimal level. That was a big surprise. The Prediction Profiler from that experiment is shown below. As you can see, we got optimal results when we over-brominated the starting material by 10%! Not only did those settings optimize the overall conversion (>95%), but it also totally shut down the troublesome competitive dimer reaction and provided the desired product at the highest yield.

Prediction Profiler in JMP

The Prediction Profiler shows that we got optimal results when we over-brominated the intermediate by 10%.

When we initially set up the design, the primary goal was to understand the sensitivity of the non-isolation process to varying amounts of the inputs. We were hoping for a procedure to avoid isolating the brominated intermediate because of its undesirable physical characteristics.

What we obtained, however, was much better.  We delivered a process that had a higher yield and produced a dimer-free product that eliminated the need for the purification as well. This breakthrough allowed us to save millions of dollars each year and win corporate recognition for the environmental gains that resulted.


JMP 3D graph

Thanks to our DOE using JMP, we delivered a process that allowed us to save millions of dollars a year and make environmental gains.

So here’s to the 25th birthday of JMP! I have enjoyed the wonderful journey of the evolution of this product firsthand. (And as it so happens, Sept. 34 is my birthday, too!)

Statistical discovery with JMP at the 25-year point

For you, today is Oct. 4. At JMP, we call it Sept. 34. We had been determined to release the first version of JMP by the end of the third quarter of 1989. But, as it turned out, we needed a few extra days to make our own deadline. So we “extended” the quarter.

Today, JMP is 25 years old. This milestone led me to reflect on its startup, evolution, current state and future.

Most of the design principles adopted at startup remain our design principles today, though we have added a few new ones. As JMP grew from a very basic package to one with wider capabilities, we managed that growth carefully so that the product stayed agile and the product surface didn’t become too complex.

Along the way, important growth themes for JMP have included exploiting the user interface, addressing the specific needs of engineers, advancing the state of the art of fields such as experimental design, and addressing the new challenges of big data with new power in large memory and multiple cores. Central to all our work has been the graphical user interface (GUI), enabling interactive exploration, making discoveries and understanding what the data is saying.

In the last 25 years, the field of statistical methods has transformed from a small specialty into a key enabler both of advancing the technological and quality frontiers and of making better business decisions based on informative data.

Original Design Goals
When JMP was first released in October 1989, the original goals of JMP were:

  1. Find the best ways to exploit the emergence of the GUI point-and-click interface of the Macintosh.
  2. Find the best statistical graphics to go with each statistical method.
  3. Provide a new product that was relatively inexpensive and easy to use, one that would serve as an entry-level step toward SAS and as an option for those who didn’t need a product as big as SAS.

The SAS Perspective
In the late 1980s, SAS had just completed a huge project to port SAS to personal computers, which involved translating millions of lines of code from PL/I-G to C. We had to maintain complete compatibility with previous versions of SAS, which had a large installed base of SAS language programs. Even when we took tiny steps, such as trying to require quotes in title statements, we found that we had to backtrack to complete compatibility.

So JMP became an experiment to go where SAS wasn’t playing. We wanted to explore the design space with a new product [Blue Ocean strategy]. JMP needed to be as different as possible from SAS and still serve a viable market. SAS had a programming language, so we made JMP not have a language. SAS stored data by rows, so JMP stored it by column. SAS handled big data, so JMP was limited to in-memory and originally 32,767 (i.e., 2^15 - 1) rows. SAS was a very wide, powerful product, so JMP became a much smaller, focused product. SAS aimed to sell to business, IT and applications development, so JMP aimed to sell to scientists and engineers.

The Macintosh Perspective
In 1984, the Mac appeared. By the late 1980s, Apple was starting to gain traction in the industry with the GUI, and Microsoft was investing big and embracing it with its Windows project. Everyone believed that the industry was transforming again. We had just gone through huge transitions: from mainframe MVS to mainframe CMS, from mainframe to minicomputer, from printer output to graphics output, from editing terminals to full-screen interactive terminals, and then from minicomputers to personal computers. Another upheaval with the GUI was emerging.

Programming to the GUI was supposed to be different from programming for a language interface and static output. We couldn’t just port SAS to a GUI, though we could reuse the subroutine library from SAS.

So what feature set would be big enough to serve the small market of Mac users at the time? We picked a set of typical analytical routines and put them in a very contextual Analyze menu.

The Statistics Perspective
The statistical initiatives of the time included Tukey’s Exploratory Data Analysis and the emphasis on graphics, punctuated notably by the Anscombe Quartet published in 1973. It became important to look at your data in a graph.
The American Statistical Association started a Statistical Graphics section and started publishing the Journal of Computational and Graphical Statistics. This coincided with the availability first of bitmap graphics terminals and then of graphics-enabled personal computers, so graphics became easy as well as popular. I remember the excitement of reading about the Gabriel PCA Biplot, the scatterplot matrix, brushing, rotating, mosaic plots. Pioneers included John Tukey, John Hartigan, William S. Cleveland, Michael Friendly, Lee Wilkinson. The hunt was on to design a graph to go with each statistical method. This hunt led to such JMP innovations as the general hypothesis leverage plot, comparison circles, the 3D spinning biplot and many more.

The Evolution of JMP
So with fanfare at the SAS Users Group International (SUGI) meeting in spring of 1989, we unveiled JMP on the Macintosh. This was a very small product compared to the JMP of today, but it had a nice set of initial features all linked to interactive graphics. That was just the first of many steps.

  • Early on, we found that engineers became an important customer segment for us. We added control charts, elementary design of experiments (DOE), and survival features. The Profilers became extremely valuable additions to the fitting platforms. This was also when we started hearing about the Six Sigma movement becoming so important.
  • As Windows emerged as not only a competitor to the Mac but also as the dominant host platform, we ported JMP to Windows.
  • Modern DOE came to JMP when Bradley Jones joined our team. He pioneered a long series of innovations in design of experiments, and continues to do so.
  • We completely rewrote JMP in C++ and gave it a modern display interface and a scripting language (JSL) for the fourth version of JMP.
  • We invested in data visualization, adding Graph Builder, dynamic Bubble Plots, Data Filters and more.
  • We continued to improve the application development process, with a better editor, App Builder, the debugger and script profiler, JSL namespaces, and add-ins and the JMP File Exchange.
  • We exploited interfaces to SAS. At first, this meant just providing file import and export, but later we created increasingly sophisticated direct interfaces.
  • As our user base grew, we developed various specialty areas: reliability modeling, process quality and consumer research. We also continued our investment in design of experiments where we lead the industry.
  • For many years, JMP was used mainly in the US and Japan. Now use of JMP has spread to Europe and China, where we expect to grow fast.
  • We learned to adapt to big data, making JMP faster and more resilient. We added multithreading wherever it fit well.
  • We increased our involvement in customer engagements through users groups, online communities, customer care and more.

The Future
JMP has grown from a limited product with a limited audience to a mature product with a wide audience. As the awareness of the value of analytics advances, JMP is well-poised to grow quickly in the next few years.

JMP will continue to be a desktop product, as opposed to becoming a client-server or cloud product. But the desktop has never been more capable than it is now, and we expect the desktop to remain important for many more years.

Happy birthday to JMP! JMP is 25 years old now, very healthy and hardy and still growing fast. We thank all of our users for helping us make JMP into what it is today.

JMP 1.0 Crew

JMP 1.0 Crew - Young

JMP group picture in 2014

JMP 11.0 Crew - Grown Up

Design and Analysis of Experiments 2015 conference

DAE 2015 conference website

The DAE 2015 conference will take place March 4-6, 2015, at SAS headquarters in Cary, North Carolina.

A long, long time ago, I was a new statistics PhD student attending my first conference. To say I was intimidated is putting it mildly: Top researchers in experimental design and analysis from around the globe were scheduled to be at this conference. How would I be able to talk to these people whose papers I’d been reading, and would they want to talk to me?

Luckily for me, that conference was part of the Design and Analysis of Experiments conference series (DAE 2007 at the University of Memphis, to be exact). Not only does the DAE conference series attract some of the best senior researchers, but a main theme is also the support and encouragement of junior researchers, who are well-represented both in invited talks and overall attendance. If you're interested, I invite you to read more about the history of the conference.

Of course, talking to other researchers wasn’t so scary after all, but that realization was made so much easier because it was a DAE conference. Fate had a part to play at the DAE 2007 as well, as it was the first time I met Bradley Jones, who eventually brought me to JMP.

I’m excited to announce that DAE 2015 is going to be held at the Executive Briefing Center on the SAS world headquarters campus in Cary, NC, from March 4-6, 2015. We’ve got a fantastic lineup of invited talks organized, and registration is now open. You can read more about the conference and register on the conference website.

On the conference website, you’ll find that we’ve also opened up the poster submission form. The DAE conferences always have excellent posters, and next year's meeting should be no exception. If you have something you’ve been working on in the design and analysis of experiments, we would love to have you submit an abstract!

We hope to see you in March 2015! I especially encourage young researchers to attend and take part in the mentorship activities that will be revealed later. It's a great chance to meet the authors of all the papers you've been reading in your research and meet other junior researchers, all while getting to hear about new and exciting research.

Discovery Summit live blog: Jonah Berger

Jonah Berger, author of the book "Contagious," gives the final keynote speech of Discovery Summit 2014 in Cary, North Carolina.

View the live blog.

Discovery Summit 2014 is here

Discovery Summit 2014 is on!

We have the biggest crowd ever for the conference this year. If you're not among that crowd, you can follow some of what's going on here in Cary, North Carolina, via our live coverage page.

From this one page, you can see live tweeting (follow the hashtag #jmpcon on Twitter), photos of sessions and evening events, as well as a live blog of Jonah Berger's keynote speech on Thursday.

Coming to Discovery Summit? Get the mobile app

Discovery Summit 2014 starts on Monday, and there's lots going on and much to know about the conference. You can get all that info as well as interactive features in a free app for iOS and Android. For a quick overview of what the app offers, watch this video:


With the mobile app, you can get the latest agenda, messages from the conference planner, speaker information and map of the conference venue. You can also:

  • Build your own agenda.
  • Find sessions you are interested in, based on level and topic.
  • Learn about JMP developers and their expertise. You can identify the right person to talk to during Meet the Developers sessions.
  • Rate and comment on sessions that you attend. We really want everyone to do this.
  • Take notes on attendees and developers you meet and sessions you attend -- and then email your notes to yourself.
  • Find and message other attendees.
  • Create a public profile of yourself for other attendees. Click the My Account section in the app menu to add information, including a photo.
  • Earn badges by checking into sessions, and you may win a conference prize. Check the Info section in the app for Badge Game Rules.

The Discovery Summit 2014 mobile app is available for:

How to get started with the app
The app is password-protected so that only registered conference attendees can use it. After launching the app, click the Login button and then the "Email Password" button, entering the email address you used to register for the conference. You will receive an email with a link that lets you set a password. Click that link, set your password, then return to the app and log in with your email and new password. You will only have to do this once!

See you soon!
