Generating control limits using Control Chart Builder

You may recall my recent blog post, Control Limits and Specification Limits: Where do they come from and what are they? Now that we understand the difference between control limits and specification limits, let's focus on control limits and the stability of a process.

Here, I will use Control Chart Builder to create a control chart, describe how the control limits are calculated and discuss how these limits can be used to make decisions about stability. Read More »

Control limits and specification limits: Where do they come from and what are they?

After talking to customers when I worked in Technical Support and more recently at JMP Discovery Summit conferences and site visits, I realized that there is confusion about the difference between control limits and specification limits. While you may have heard of, or even used both of these in your work, they are quite different from each other. In this blog post, I will explain those differences and hint at how you can use each of these limits in JMP. Future blog posts will expand on these ideas. Read More »

Simulate Responses in JMP 13 is revamped to be more useful

The Simulate Responses feature found throughout the design of experiments (DOE) platforms has always been a useful tool for generating a set of responses according to a specified model. I frequently analyze the simulated responses in Fit Model (or another appropriate platform) as a way to check that the model is being fit as expected. Prior to JMP 13, Simulate Responses had limitations:

  • Simulation was limited to linear regression models with normal errors.
  • The ability to simulate responses was tied to the DOE window and the Simulate Responses window. If you closed either window, you would have to make a new data table to simulate responses again.
  • If you wanted to run a Monte Carlo simulation study using simulated responses (that is, simulating a large number of responses from the specified model and collecting results), there was no easy way to do so using the simulated responses from the DOE platform.

Simulate Responses in JMP 13

The look and feel of the Simulate Responses dialog remains the same in JMP 13. But to address the limitations I mentioned above, some new features have been added. That's the focus of the rest of this post. Read More »

The QbD Column: Applying QbD to make analytic methods robust

In our previous blog post, we wrote about using designed experiments to develop analytic methods. This post continues the discussion of analytic methods and shows how a new type of experimental design, the Definitive Screening Design[1] (DSD), can be used to assess and improve analytic methods.

We begin with a quick review of analytic methods and a brief summary of the experiment described in that previous blog post, and then show what is learned by using a DSD. Read More »

Using Virtual Join in JMP 13 to explore adverse events data

Virtually joining data tables is a new capability in JMP 13 that can save you space and memory, while increasing your productivity in analyzing your data from multiple tables.

This new feature can help with large data tables, and save you time in trying to figure out the best way to physically join them together. Virtual joins allow you to link tables, making variables available for use together in your analysis, graphs and many other platforms in JMP.

This blog post presents an example of using Virtual Join on data from a randomized controlled clinical trial of the drug nicardipine for treatment of patients with rare, life-threatening aneurysmal subarachnoid hemorrhage (SAH). This clinical trial was carried out from 1987 to 1989 on very sick patients experiencing bleeding between the brain and the tissues that cover it. Understandably, the patients experienced several adverse events while in the trial, and it is the job of the clinical trial’s medical monitor to look at the distribution of those events. Understanding the occurrence of adverse events across demographic and treatment groups, along with their severity and possible relationship to study drug administration, is a key part of meeting safety standards and keeping patients safe. Read More »

Interactive HTML: Profilers in 3 more platforms in JMP 13

In JMP 12, an interactive HTML Profiler was added, as I had previously blogged about. That change mainly updated the existing Flash functionality to HTML5 technology, making it available on mobile devices like an iPad, but it also introduced a few new features. Among these was the option of exporting the Fit Model Least Squares platform report as a whole with an interactive Profiler embedded within it.

After users got to try this tool, the response was overwhelmingly positive. They found it a great way to explore cross-sections of predicted responses across multiple factors with other people who don’t have JMP yet. However, the feedback was that users would like to see Profilers available in other platforms as well.

In JMP 13, three more platforms have embedded Profilers that are available in interactive HTML. Read More »

JMP User Community redesign and relaunch is underway

If you visited the JMP User Community recently, you probably noticed that things are different. All of the "Actions" options you typically use -- like Edit, Reply, Start a discussion, Write a document, Upload a file, etc. -- are not available.

That's because the Community is being redesigned and upgraded. All content is read-only until the relaunch of the Community. Read More »

The QbD Column: Is QbD applicable for developing analytic methods?

Development of measurement or analytic methods parallels the development of drug products. Understanding the process monitoring and control requirements drives the performance criteria for analytic methods, including the process critical quality attributes (CQAs) and specification limits. The characteristics of a drug substance that must be controlled to ensure safety or efficacy determine which CQAs are identified. The typical requirements of analytic methods include:

  • Precision: This requirement ensures that method variability is only a small proportion of the specification range (upper specification limit – lower specification limit). This is also called Gage Repeatability and Reproducibility (GR&R).
  • Selectivity: This determines which impurities to monitor at each production step and calls for designing methods that adequately discriminate the relative proportions of each impurity.
  • Sensitivity: To achieve effective process control, this requires methods that accurately reflect changes in CQAs that are important relative to the specification limits.
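To make the precision criterion concrete, a common way to express it is the precision-to-tolerance (P/T) ratio from gauge R&R practice. The sketch below uses made-up numbers, not values from the case study, and the 6-sigma multiplier and thresholds follow common gauge R&R conventions rather than anything stated in this post:

```python
# Hypothetical illustration of the precision criterion: the
# precision-to-tolerance (P/T) ratio compares measurement-system
# variation to the specification range. Common conventions treat
# P/T under ~10% as good and under ~30% as marginal.

def precision_to_tolerance(sigma_measurement, lsl, usl, k=6.0):
    """P/T ratio: fraction of the spec range consumed by k-sigma
    measurement variation (k=6 covers ~99.7% of a normal spread)."""
    return k * sigma_measurement / (usl - lsl)

# Made-up example: spec range 90-110, method SD 0.5.
pt = precision_to_tolerance(0.5, lsl=90.0, usl=110.0)
print(f"P/T = {pt:.0%}")  # 6*0.5/20 = 15% of the tolerance
```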

These criteria establish the reliability of methods for use in routine operations. This has implications for analysis time, acceptable solvents and available equipment. To develop an analytic method with QbD principles, the method’s performance criteria must be understood, as well as the desired operational intent of the eventual end-user. Limited understanding of a method can lead to poor technology transfer from the laboratory into use in commercial manufacturing facilities or from an existing facility to a new one. Failed transfers often require significant additional resources to remedy the causes of the failure, usually at a time when there is considerable pressure to move ahead with the launch of a new product. Applying Quality by Design (QbD) to analytic methods aims to prevent such problems.

QbD implementation in the development of analytic methods is typically a four-stage process, addressing both design and control of the methods[1]. The stages are:

  1. Method Design Intent: Identify and specify the analytical method performance.
  2. Method Design Selection: Select the method work conditions to achieve the design intent.
  3. Method Control Definition: Establish and define appropriate controls for the components with the largest contributions to performance variability.
  4. Method Control Validation: Demonstrate acceptable method performance with robust and effective controls.

Testing robustness of analytical methods involves evaluating the influence of small changes in the operating conditions[2]. Ruggedness testing identifies the degree of reproducibility of test results obtained by the analysis of the same sample under various normal test conditions such as different laboratories, analysts, and instruments. In the following case study, we focus on the use of experiments to assess and improve robustness.

A case study in HPLC development

The case study presented here is from the development of a High Performance Liquid Chromatography (HPLC) method[3]. It is a typical example of testing the robustness of analytical methods. The specific system consists of an Agilent 1050, with a variable-wavelength UV detector and a model 3396-A integrator.

The goal of the robustness study is to find out whether deviations from the nominal operating conditions affect the results. Table 1 lists the factors and their levels used in this case study. The experimental array is a 2^(7-4) Fractional Factorial experiment with three center points (see Table 2). The levels "-1" and "1" correspond to the lower and upper levels listed in Table 1, and "0" corresponds to the nominal level. The lower and upper levels are chosen to reflect variation that might naturally occur about the nominal setting during regular operation. For examples of QbD applications of Fractional Factorial experiments to formulation and drug product development, see the second and third blog posts in this series.

Table 1. Factors and levels in the HPLC experiment.


Table 2. Experimental array of the HPLC experiment.

The experimental array consists of 11 experimental runs that combine the design factor levels in a balanced set of combinations, plus three center points.
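For readers who want to see the structure of such an array, here is a Python sketch of how a 2^(7-4) design with center points can be built. The generators used (D = AB, E = AC, F = BC, G = ABC) are a standard choice for a resolution III 2^(7-4); the actual generators of the HPLC study are not stated here, so treat them as an assumption:

```python
# Sketch of a 2^(7-4) fractional factorial with three center points.
# Generators D=AB, E=AC, F=BC, G=ABC are assumed (standard choice);
# the published HPLC design may use different ones.
from itertools import product

runs = []
for a, b, c in product((-1, 1), repeat=3):        # full 2^3 in A, B, C
    runs.append((a, b, c, a*b, a*c, b*c, a*b*c))  # generated D..G
runs += [(0,) * 7] * 3                            # three center points

print(len(runs))  # 11 runs: 8 factorial corners + 3 center points
# Each factor column is balanced: it sums to zero over all runs.
assert all(sum(r[i] for r in runs) == 0 for i in range(7))
```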

What we can learn from the HPLC experiments

In analyzing the HPLC experiment, we have the following goals:

  1. Find the expected method measurement prediction variance for recommended setups of the method (the measurement uncertainty).
  2. Identify the best settings of the experimental factors to achieve acceptable performance.
  3. Determine the factors that impact the performance of the method on one or more responses.
  4. Assess the impact of variability in the experimental factors on the measured responses.
  5. Make the HPLC process robust by exploiting nonlinearity in the factor effects to achieve performance that is not sensitive to changes about nominal levels.

If we consider only the eight experimental runs of the 2^(7-4) fractional factorial, without the center points, we get an average prediction variance of 0.417 and 100% efficiency for fitting a first-order model. This is due to the balanced property of the design (see Figure 1, left). The design in Table 2, with three center points, reduces prediction uncertainty near the center of the region and has a lower average prediction variance of 0.38. However, the center points don't contribute to estimating slopes, as seen in the lower efficiency for fitting the first-order model (see Figure 1, right).

Figure 1. Design diagnostics for the 2^(7-4) fractional factorial without (left) and with (right) three center points.


Figure 2. Prediction variance profile without (top) and with (bottom) center points.

The JMP Prediction Variance Profile in Figure 2 shows the ratio of the prediction variance to the error variance, also called the relative variance of prediction, at various factor level combinations. Relative variance is minimized at the center of the design. Adding three center points reduces prediction variance by 25%, from 0.12 to 0.09. This is an advantage derived by adding experimental runs at the center points. Another advantage that we will see later is that the center points permit us to assess nonlinear effects, or lack-of-fit for the linear regression model. A third advantage is that the center points give us a model-free estimate of the extent of natural variation in the system.
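The relative prediction variances quoted above can be checked by hand. For an orthogonal two-level design, the model matrix for the intercept plus seven main effects gives X'X = diag(n, 8, ..., 8), so the relative prediction variance at a point x is 1/n + (sum of x_i^2)/8. The short calculation below reproduces, to rounding, the 0.417 and 0.38 averages and the 0.12 and 0.09 center-point values:

```python
# Reproducing the relative prediction variances quoted in the text.
# For this orthogonal design, X'X = diag(n, 8, ..., 8), so the
# relative prediction variance at coded point x is x'(X'X)^(-1)x.

def rel_pred_var(x, n_runs, n_factorial=8):
    # intercept contribution + one slope contribution per factor
    return 1.0 / n_runs + sum(xi**2 for xi in x) / n_factorial

center = [0.0] * 7
# Averaging x_i^2 over the cube [-1, 1]^7 gives E[x_i^2] = 1/3.
avg_without = 1/8 + 7 * (1/3) / 8   # 8 runs, no center points
avg_with    = 1/11 + 7 * (1/3) / 8  # 11 runs with 3 center points

print(round(avg_without, 3), round(avg_with, 3))  # 0.417 0.383
print(rel_pred_var(center, 8),
      round(rel_pred_var(center, 11), 3))         # 0.125 0.091
```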

At each factor level combination, the experiments produced five responses: 1) Area of chromatogram at peak (peakArea), 2) Height of chromatogram at peak (peakHeight), 3) Minimum retention time adjusted to standard (tRmin),  4) Unadjusted minimum retention time (unad tRmin) and 5) Chromatogram resolution (Res).

Our first concern in analyzing the data is to identify proper models linking factors and responses.

What do we learn from analyzing the data from the fractional factorial experiment?

Linear regression models are the simplest models to consider. They represent changes in responses between two levels of the factors; in our case, these are the levels labeled “-1” and “+1”. Since we also have three center points, at the level labeled “0”, we can also assess nonlinear effects. We do so, as in our second blog post, by adding a synthetic indicator variable designed to assess lack-of-fit (LOF), equal to “1” at the center points and “0” elsewhere. The JMP Effect Summary report, for all five responses with linear effects of all seven factors plus the LOF indicator, is presented in Figure 3.

Figure 3. Effect Summary report of seven factors and LOF indicator on five responses.

The Effect Summary table lists the model effects across the full set of five responses, sorted by ascending p-values. The LogWorth for each effect is defined as -log10(p-value), which adjusts p-values to provide an appropriate scale for graphics. A LogWorth that exceeds 2 is significant at the 0.01 level because -log10(0.01)=2. The report includes a bar graph of the LogWorth with dashed vertical lines at integer values and a blue reference line at 2. The displayed p-values correspond to the significance test displayed in the Effect Tests table of the model report. The report in Figure 3 shows that, overall, four factors and LOF are significant at the 0.01 level (Col Temp, Gradient, Buf PH and Dim Perc) and Buf Conc, Det Wave and Trie Perc are non-significant. From the experimental plan in Table 2, one can estimate the main effects of the seven factors and the LOF indicator on the five responses with a linear model.
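The LogWorth transformation described above is easy to verify numerically:

```python
# LogWorth as defined in the Effect Summary report: -log10(p-value),
# so p = 0.01 maps to exactly 2, the blue reference line.
import math

def logworth(p_value):
    return -math.log10(p_value)

print(logworth(0.01))      # 2.0 -> significant at the 0.01 level
print(logworth(0.05) > 2)  # False: p = 0.05 falls short of the line
```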

Figure 4 presents parameter estimates for peakHeight with an adjusted R2 of 93%, a very good fit.

The peakHeight response is most sensitive to variations in Col Temp, Det Wave and Gradient.

Figure 4. Parameter estimates of peakHeight of seven factors and LOF indicator with linear model. For improved readability, peakHeights have been divided by 1000.

 We observe a statistically significant difference between the predicted value at the center point of the experimental design and the three measurements actually performed there (via the LOF variable).

Figure 5 displays a profiler plot showing the linear effects of each factor on all five responses. The plot is very helpful in highlighting which conditions might affect the HPLC method. We see that Col Temp and Gradient, the two most important factors, affect several different responses. Buf pH, Buf Conc and Dim Perc have especially strong effects on the retention responses, but have weak effects on the other CQAs. The factors give good fits to the retention responses and to peakHeight, but not to peakArea or Res, which is reflected in the wide confidence bands for those CQAs and in high p-values for the overall model F-tests in the Analysis of Variance line of the model output.

Figure 5. Profiler of HPLC experiments with linear model.

What should we do about the nonlinearity? Our analysis found a significant effect of the LOF indicator, which points to a nonlinear effect that is not accounted for in the profiler of Figure 5.  The center points we added to the two-level fractional factorial design let us detect the nonlinearity, but they don’t provide enough information to determine what causes it – any one of the seven factors (and possibly several of them) could be responsible for the nonlinear effect on peak Height.  In our next blog, we will discuss some design options to address the problem. For now, we show what we achieved with the current experiment.

After much brainstorming, the HPLC team decided that it was very likely that the Gradient was the factor causing the nonlinearity. This important assumption, based only on process knowledge, is crucial to all our subsequent conclusions. We proceeded to fit a model to the original experimental data that includes a quadratic effect for Gradient. The team also decided to retain only the factors with the strongest main effects for each response; for peakHeight, the factors were Gradient, Column Temperature and Detection Wavelength. In Figure 6, we show parameter estimates from fitting this reduced model to the peakHeight responses. With this model, all terms are significant with an adjusted R2 of 89%. The root mean squared error, which estimates run-to-run variation at the same settings of the factors, is 1.754, slightly less than 1% of the magnitude of peakHeight itself (after dividing peakHeight by 1000).

Figure 6. Parameter estimates of peakHeight with quadratic model. For improved readability, peakHeights have been divided by 1000.

We show a Profiler for the reduced quadratic model in Figure 7.

Figure 7. Profiler of HPLC experiment with reduced quadratic model for peakHeight. For improved readability, peakHeights have been divided by 1000.

Finding a robust solution

One of the main goals of the experiment was to assess the robustness of the system to variation in the input factors. We explore this question by introducing normal noise to the three factors in the reduced quadratic model. For each factor, we assumed a standard deviation of 0.4 (in the coded units), which is the default option in JMP. This reflects a view that the experimental settings are about 2.5 SD from the nominal level, so they represent rather extreme deviations that might be encountered in practice.

Figure 8 presents the effect of noise on peakHeight for a setup at the center point, which was initially identified as the nominal setting. We can compute the SD of the simulated outcomes by saving them to a table and using the Distribution platform in JMP. The SD turns out to be 2.397, slightly larger than the run-to-run SD of 1.754 that we computed earlier. The overall SD associated with the analytic system involves both of these components. To combine them, we first square them, then add them (because variances are additive, SDs are not) and then take a square root to return to the original measurement scale. The resulting combined SD is 2.970, so the anticipated variation in factor settings leads to an SD about 70% larger than the one from run-to-run variation alone. The overall SD is less than 1.5% of the typical values of peakHeight, which was considered acceptable for this process.
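The root-sum-of-squares combination of the two components can be checked directly:

```python
# Combining the two variance components: variances add, SDs do not.
# The input values are those quoted in the text (peakHeight / 1000).
import math

sd_run_to_run = 1.754  # residual (run-to-run) SD from the model fit
sd_factors    = 2.397  # SD of simulated outcomes from factor noise

sd_overall = math.sqrt(sd_run_to_run**2 + sd_factors**2)
print(f"{sd_overall:.3f}")                   # 2.970
print(round(sd_overall / sd_run_to_run, 2))  # 1.69: about 70% larger
```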

Is our nominal solution a good one for robustness?

Figure 8 is very helpful in answering this question. The important factor here is Gradient, through its non-linear relationship to peakHeight. The “valley” of that relationship is near the nominal choice of 0. Our simulation of factor variation generates values of Gradient that cover the range from -1 to 1. When those values are in the valley, they transmit very little variation to peakHeight. By contrast, when they are near the extremes, there is substantial variation in peakHeight. So the fact that the bottom of the valley is close to the nominal setting assures us that the transmitted variation will be about as small as possible. We can test this feature by shifting the nominal value of Gradient. When the nominal is -0.5, the simulator shows that the SD from factor variation increases to 4.282, almost 80% more than for the nominal setting at 0.

The dependence of peakHeight on Col Temp and on Det Wave is linear. So regardless of how we choose the nominal settings of these factors, they will transmit the same degree of variation to the peakHeight output. The experiment lets us assess how they affect robustness, but does not provide any opportunity to exploit the results to improve robustness.
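The contrast between the quadratic and linear cases can be illustrated with a small Monte Carlo sketch. The coefficients below are hypothetical, chosen only to show the shape of the argument, not the fitted HPLC values:

```python
# Monte Carlo sketch of why centering a factor in the "valley" of a
# quadratic effect reduces transmitted variation, while a linear
# effect transmits the same variation at any nominal setting.
# Coefficients are hypothetical, not the fitted HPLC values.
import random
import statistics

def simulate_sd(nominal, b_lin=0.0, b_quad=5.0, sd_noise=0.4, n=20000):
    rng = random.Random(1)  # fixed seed for reproducibility
    xs = [rng.gauss(nominal, sd_noise) for _ in range(n)]
    ys = [b_lin * x + b_quad * x**2 for x in xs]
    return statistics.stdev(ys)

# Quadratic effect: the valley (nominal 0) transmits less variation
# than an off-center nominal such as -0.5.
print(simulate_sd(0.0) < simulate_sd(-0.5))  # True

# Linear effect: transmitted SD is close to b_lin * sd_noise = 1.2,
# regardless of the nominal setting.
print(simulate_sd(0.0, b_lin=3.0, b_quad=0.0))
```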

Figure 8. Prediction Profiler of peakHeight when the factors are at their nominal settings and the natural SD is 0.4. For improved readability, peakHeights have been divided by 1000.

Going back to the original questions

In reviewing the questions originally posed, we can now provide the following answers:

1. What is the expected method measurement prediction variance for recommended setups of the method (the measurement uncertainty)?

Answer: We looked at this question most closely for peakHeight, where we found that the overall SD is 2970 on the original scale (2.970 after dividing peakHeight by 1000), with roughly equal contributions from run-to-run variation and from variation in the measurement process factors.

2. What setup of the experimental factors will achieve acceptable performance?

Answer: With all factors at their nominal settings (coded value 0), the SD of 2970 is less than 1.5% of the size of the values measured, which is an acceptable level of variation in this application.

3. What are the factors that impact the performance of the method in one or more responses?

Answer: The three factors with the highest impact on the method’s performance are gradient profile, column temperature and detection wavelength.

4. Can we make the setup of the experimental factors robust in order to achieve performance that is not sensitive to changes in factor levels?

Answer: We saw that we can improve robustness for peakHeight by setting the gradient to its coded level of 0 (the nominal level in the experiment).  That setting helps us to take advantage of the non-linear effect of gradient and reduce transmitted variation.

5. Can we assess the impact of variability in the experimental factors on the analytical method?

Answer: As we noted earlier, the natural variability of the input factors is responsible for slightly more than half the variation in peakHeight.


In reviewing the questions originally posed, we first fit a linear regression model. After realizing that there was an unaccounted-for nonlinear effect, we used a reduced quadratic model and found that it fits the data well. By inducing variability in the factors of the reduced quadratic model (gradient profile, column temperature and detection wavelength), we could estimate the variability due to the method and assess the robustness of the recommended setup.

The team’s assumption that gradient is responsible for the non-linearity is clearly important here.  If other factors also have non-linear effects, there could be consequences for how to best improve the robustness of the method. We will explore this issue further in our next blog post.


[1] Borman, P., Nethercote, P., Chatfield, M., Thompson, D., Truman, K. (2007), Pharmaceutical Technology.

[2] Kenett, R.S., and Kenett, D.A. (2008), Quality by Design Applications in Biosimilar Technological Products, Accreditation and Quality Assurance, Springer Verlag, Vol. 13, No. 12, pp. 681-690.

[3] Romero, R., Gasquez, D., Sanshez, M., Rodriguez, L. and Bagur, M. (2002), A geometric approach to robustness testing in analytical HPLC, LCGC North America, 20, pp. 72-80.


About the Authors

This blog post is brought to you by members of the KPA Group: Ron Kenett and David Steinberg.

Ron Kenett

David Steinberg

13 reasons data access is better than ever in JMP 13

For most of us, the data we analyze in JMP starts out somewhere else: in a relational database, Excel, a CSV file or perhaps SAS. The need to seamlessly move such data into JMP and prepare it for analysis led us to introduce the Query Builder feature in JMP 12. Query Builder helps you select multiple tables from an external data source and join them. Then, you can interactively filter (creating a prompting filter if desired), sample and set column names, formats and modeling types for the imported data.

The feedback we’ve gotten from users about Query Builder suggests that you are finding it useful. We have also gotten suggestions for fixes and enhancements, both for Query Builder and other aspects of data access. With JMP 13, we are delivering a boatload of such fixes and enhancements. The 13 most important such fixes and enhancements are detailed below.


The first four enhancements all relate to filtering data.

Careful, my data could be huge – When you create a filter for a categorical column, Query Builder retrieves the values to display in a list. With large tables, this can take a long time. In JMP 12, value retrieval was unconditional, and there was no way to cancel it. In JMP 13, we have made several changes to prevent long waits:

  • Cancelable value retrieval – JMP 13 puts up a progress bar with a Cancel button when retrieving categorical column values. This is supported for all ODBC drivers we have tested when JMP is running on Windows. It is not supported when connecting to SAS or for most ODBC drivers available for the Macintosh.
  • Too big to attempt – If there are more than 1,000,000 rows in a table, JMP will not even attempt to retrieve unique column values. The 1,000,000 value can be changed via a preference.
  • Simpler list – In JMP 12, the Check Box List was the only type of filter available for selecting from a list of values. In JMP 13, we have added a plain List Box filter type. The List Box filter is less resource-intensive than the Check Box List filter. This makes it better-suited for larger lists. The default filter type for categorical columns is the List Box in JMP 13.

New filter types – In addition to the new, simpler List Box filter type, two more filter types have been added for categorical columns in JMP 13:

  • Contains filter – Enter some text, and JMP will match all rows that contain that text. You can also ask to match rows that do not contain the text.
  • Manual List filter – Allows you to create a list of selections yourself to avoid the need for values to be looked up.

List filters are now invertible – All of the list-type filters (List Box, Manual List, Check Box List and Match Column Values) now have a Not in List check box. This allows you to select a couple of items and retrieve all rows that do not match the selected values. For example, such a filter could return all movies rated something other than “G”.

List filters can now be conditional – This one is sort of a big deal. Using the red-triangle menu on a list-type filter, you can set the filter to be conditional. Conditional filters only display values that match other filters that precede them in the list. Below is an example using movie Rating and movie Genre. In this example, I have asked for the Genre filter to be conditional. When I select G in the list for Rating, the Genre filter changes to list only genres that contain at least one G-rated film.


A symbol next to the filter indicates that it is conditional. Only filters for columns from the same table affect the values displayed in a conditional filter.

JMP Query Builder

After using Query Builder, some users would ask us, “What if I just have a folder full of JMP data tables? Can I use Query Builder on them?” In JMP 13, the answer is a resounding “Yes!” Or perhaps you use ODBC Query Builder, Text Import or the Excel Import Wizard to import several tables. It would be nice to be able to use Query Builder to join the results. With JMP 13, you can!

To use Query Builder on JMP data tables, first open the tables, and then select JMP Query Builder from the Tables menu.


For example, JMP has two sample data tables that both have State and Year columns, and another sample table, US, that has a State column. I can easily join these three tables with JMP Query Builder.


JMP Query Builder allows up to 64 tables to be joined together. If you ever get that many tables into one query, please send me a screenshot.

All of the other features of Query Builder, such as filters and prompting, are also available with JMP Query Builder.

Query() JSL Function

We built a SQL engine into JMP to allow Query Builder to work on JMP data tables. A new JSL function, Query(), gives you direct access to that SQL engine. You can use the Query() function to manipulate JMP data tables using SQL statements, for example with the SATByYear and CrimeData sample data tables.
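Since the JSL example itself is not reproduced here, the following Python sketch with sqlite3 illustrates the same idea of issuing SQL against in-memory tables; the table contents below are hypothetical stand-ins, not the actual sample data:

```python
# Python/sqlite3 analogy for what Query() does in JSL: run a SQL
# statement directly against in-memory tables. The rows inserted
# here are made-up stand-ins for the JMP sample data.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE SATByYear (state TEXT, year INT, sat INT)")
con.execute("CREATE TABLE CrimeData (state TEXT, rate REAL)")
con.executemany("INSERT INTO SATByYear VALUES (?, ?, ?)",
                [("NC", 2004, 1006), ("VA", 2004, 1024)])
con.executemany("INSERT INTO CrimeData VALUES (?, ?)",
                [("NC", 4.6), ("VA", 3.9)])

# Join the two tables with a single SQL statement.
rows = con.execute("""SELECT s.state, s.sat, c.rate
                      FROM SATByYear s JOIN CrimeData c
                        ON s.state = c.state
                      ORDER BY s.state""").fetchall()
print(rows)  # [('NC', 1006, 4.6), ('VA', 1024, 3.9)]
```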

Run on Open

In JMP 13, you can configure a query to run immediately when you open it instead of opening the Query Builder window. Simply check the Run on Open option on the red-triangle menu at the top of the Query Builder window.

This is especially useful for queries that have prompted filters. You can send these queries to others (or incorporate them into a JMP add-in), and when the other user opens them, they will just see the filter prompt. This allows them to make their filter selections without having to wade through the complexities of Query Builder.

When a query has been set to Run on Open, but you need to open it into Query Builder to make changes, you have a few options. If you hold down the Ctrl key while opening the query, it will open into the Query Builder window. Alternatively, you can right-click on the query file in the JMP Home Window and select Edit Query.

Creating queries that will work in JMP 12

One caveat to all these neat new JMP 13 Query Builder features – if you create queries that use these features, you will not be able to open them in JMP 12. At the same time, you may get JMP 13 earlier than your co-workers do, while still needing to share queries with them.

To help with this scenario, we have added a preference in JMP 13 that hides all of the new JMP 13 features of Query Builder so that the queries you build will still be compatible with JMP 12. The preference is on the Query Builder Preferences page.

Any ODBC or SAS queries you build after setting that preference will only allow features that are compatible with JMP 12. If you want to relax that rule for a particular query, there is an option on Query Builder’s red-triangle menu that you can uncheck to allow JMP 13 features for that query.

New features on the Tables panel

The Tables panel on the Query Builder window in JMP 12 did not have much functionality other than showing you the list of tables in your query.  In JMP 13, that panel gains a number of features:

  • Selecting one or more tables in the Tables panel restricts the columns listed in the Available Columns panel to just columns from the selected tables, making columns easier to find.
  • The Tables panel now displays the Venn diagram icon corresponding to the join type for each table, and you can edit the join, change the table alias, or remove the table from the query from the context menu.
  • When querying JMP data tables, double-clicking a table in the Tables panel makes the table visible and brings it to the front (or select the View item on the context menu).

“First N Rows” Sampling

When querying large tables from databases, it is sometimes helpful to retrieve just the first thousand or so rows to experiment with before spending the time and resources to retrieve all the data.

In JMP 12, First N Rows sampling was supported only for Oracle and SQL Server databases. In JMP 13, support has been added for most other databases, including PostgreSQL, MySQL, Microsoft Access, SQLite, Apache Hive, and Cloudera Impala.
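Under the hood, limiting a query to its first N rows requires dialect-specific SQL, which is why per-database support matters. The sketch below uses Python's built-in sqlite3 module and a made-up `measurements` table purely for illustration; the comments note the equivalent syntax in other dialects.

```python
import sqlite3

# Build a small in-memory table standing in for a large database table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [(i, i * 0.5) for i in range(10_000)],
)

# "First N Rows" sampling: ask the database for only the first 1,000 rows
# instead of transferring the whole table. SQLite, PostgreSQL, and MySQL
# use LIMIT n; SQL Server uses SELECT TOP n; Oracle supports
# FETCH FIRST n ROWS ONLY.
preview = conn.execute(
    "SELECT id, value FROM measurements LIMIT 1000"
).fetchall()

print(len(preview))  # 1000 rows retrieved, not 10,000
```

Because the limit is applied by the database itself, only the sampled rows cross the connection, which is what makes this cheap even against very large tables.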

Improved Hadoop and Text File Support

More and more data is being stored in “big data” databases these days. JMP 13 improves support for sources like Apache Hive, Cloudera Impala and Hortonworks. Also, saving tables with File > Database > Save Table did not work well with some of these data sources. That has been improved in JMP 13, with the caveat that using ODBC to save data to Hadoop-based data sources is not an efficient way to load data into them.

If you do a lot with CSV files, support for the Microsoft Text Files ODBC driver has been improved in JMP 13.

Saving JMP data to a database is much faster

Keeping data in a database makes it convenient to provide access to whoever needs it. For many releases, JMP has supported saving JMP data tables to databases via the File > Database > Save Table feature. However, with data sizes getting larger and larger, we have had reports that saving JMP tables to a database was taking much longer than people felt it should. We listened and investigated, and we are happy to report that, in JMP 13, the performance of saving JMP tables to databases has improved significantly, in some cases dramatically. Please try this feature again and let us know what you experience.

Virtual Join

With JMP, all of the data you are analyzing has to fit in memory. When you join JMP data tables with either Tables > Join or the new JMP Query Builder, data tends to get duplicated from smaller “look-up” tables into the larger join result. To help prevent this duplication, the Virtual Join feature has been added in JMP 13. For example, a DVD store might have an inventory table that tracks where all the DVDs are and a film table with details about each title. In the film table, I can set the film_id column to be the Link ID for the table:

Then, in the inventory table, I can set film_id to be a link reference to the film table. This action effectively joins the two tables based on the film_id column.

Once I’ve set that up, columns from the film table now appear in the column list for inventory. They are designated "referenced columns" and are initially hidden. I can unhide whichever columns I want to appear in the inventory table, in this case title[film_id]:

Virtual Join allows me to see the values from the film table in the inventory table. However, they have not been physically copied. They are looked up as needed, which saves memory.
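The memory savings come from storing the look-up table once and resolving values on demand. The following sketch illustrates that idea in plain Python (it is a conceptual analogy, not JMP's implementation; the film and inventory data are made up):

```python
# Conceptual sketch of Virtual Join: the small "film" look-up table is
# stored once, and each inventory row resolves title[film_id] on demand
# instead of carrying its own copy of the title.

films = {  # hypothetical film table, keyed by its Link ID column, film_id
    1: {"title": "Metropolis"},
    2: {"title": "Casablanca"},
}

inventory = [  # many inventory rows reference the same few film_ids
    {"inventory_id": 101, "film_id": 1},
    {"inventory_id": 102, "film_id": 1},
    {"inventory_id": 103, "film_id": 2},
]

def title_for(row):
    """Look up the title for an inventory row when it is needed."""
    return films[row["film_id"]]["title"]

print([title_for(r) for r in inventory])
# ['Metropolis', 'Metropolis', 'Casablanca']
```

A physical join would copy "Metropolis" into both rows 101 and 102; the look-up keeps a single copy, which is exactly the duplication Virtual Join avoids.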

This just scratches the surface of Virtual Join, which is worthy of a blog post all on its own.

So, there you have it – a look at the many enhancements for accessing and manipulating data in JMP 13. Which feature is your favorite? What feature were you hoping to see that was not mentioned? Let me know in the comments.

For more information on using Query Builder for JMP data tables, check out my Discovery Summit poster presentation in the JMP User Community. While you're there, you can also see the slides from my Discovery Summit tutorial titled, "Wrangling All Your Data With Query Builder in JMP 13."

Formulation success: Getting the right data in the right amount at the right time

We want to help scientists and engineers be successful at developing formulations quickly and efficiently. Success requires good strategies to get the right data in the right amount at the right time. That's why we published the book Strategies for Formulation Development: A Step-by-Step Approach Using JMP.

We have worked with formulation scientists and engineers for decades and have seen many different types of formulation development programs. This has shown us what formulation scientists really need to know rather than what is nice to know. Because JMP data analysis software is used in the examples in the book, readers get valuable guidance on the software for the proposed methodology. That means JMP users can immediately apply what they learn in the book.

Key takeaways from the book include:

  • Approach the development process from a strategic viewpoint, with the overall end in mind. Don’t necessarily run the largest design possible. An experimentation plan that implements the strategy provides the right road map for developing a successful formulation.
  • Focus on developing an understanding of how the components blend together. Use designs and models that help find the dominant components, components with large effects, and components with small effects.
  • Use screening experiments early on to identify those components that are most important to the performance of the formulation. This strategy creates a broad view and helps ensure that no important components are overlooked. It also saves significant experimental effort.
  • Analyze both screening and optimization experiments using graphical and numerical methods, which is easily done with JMP. The right graphics can extract additional information from the data.
  • Consider integration of both formulation components and process variables in designs and models, using recently published methods that reduce the required experimentation by up to 50 percent.

This is how you speed up the formulation development process and produce high-quality formulations in a timely manner. Upcoming blog posts will show how to address each of these important issues.

Want more information? You can read a free chapter from the book and learn about authors Ronald D. Snee and Roger W. Hoerl.