A process for statistical discovery with JMP

When working with users new to JMP, I find it helpful to have a simple process to guide statistical discovery. We statisticians could debate the process of statistical discovery for a long time, but I find the process presented in Figure 1 works for most situations.

Figure 1: Process of Statistical Discovery

Assuming we have already defined our goal, statistical discovery starts with accessing the data relevant to that goal, which may involve design of experiments (DOE) if existing data cannot answer our questions. Managing data requires us to check the data for errors, arrange it in the row-column structure required for statistical analysis and, possibly, derive new variables.

The analysis step involves a mixture of exploration with dynamically linked graphs, followed by fitting and evaluating a model of some kind. I find it best not to be too prescriptive about the path through the analysis step. Trying many different approaches supports a train-of-thought style of analysis, in which prior visualisations and analyses suggest new ideas to investigate.

Having said that, I do have favourite JMP methods. For example, in the case where I want to model one or more Y’s (responses) as a function of several X’s (predictors or factors), I often use Distribution, Graph Builder, Data Filter, Variability Chart, Partitioning for standard decision trees, Partitioning for random forests (when I have many correlated X’s to investigate), Fit Model, Custom Design, and Profiler/Simulator. I will try several different approaches to exploring and modelling data before selecting a model on which to base decisions. In the reporting stage, I find the Profiler an invaluable decision and communication aid with key stakeholders.

Figures 2 to 5 map this process to JMP as the enabling technology for statistical discovery. Figure 2 helps new users understand how to get their data into JMP quickly using the capabilities of the File menu. This may result in one or more tables that require further manipulation using the capabilities illustrated in Figure 3.

Figure 2: Data Access Functions

Figure 3 indicates functionality available from the Tables menu to get a single combined view of data in the right shape for analysis. The process of data management includes utilising the Distribution platform from the Analyze menu to visually identify data errors and outliers. Validation and Recode functions from the Tables menu enable rapid data corrections, and the Missing Data Pattern feature helps you understand the extent of missing data and any recurring patterns in it. The Formula Editor allows derivation of new variables as a function of other variables, and column properties define descriptions of variables (or metadata) to speed later analysis.
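To make the idea behind a missing data pattern concrete, here is a minimal sketch in plain Python. It is not JSL and not how JMP implements the feature. It simply counts how many rows share each combination of missing columns, which is the kind of summary that helps reveal recurring gaps. The function name `missing_data_pattern`, the column names and the sample rows are all hypothetical, and a value is assumed to be missing when it is `None` or an empty string.

```python
from collections import Counter


def missing_data_pattern(rows, columns):
    """Count rows by which columns are missing.

    Each pattern is a tuple with 1 where the column is missing
    (None or empty string, an assumption of this sketch) and 0 otherwise.
    """
    patterns = Counter()
    for row in rows:
        pattern = tuple(1 if row.get(col) in (None, "") else 0 for col in columns)
        patterns[pattern] += 1
    return patterns


# Hypothetical example data: one dict per table row.
rows = [
    {"batch": "A", "temp": 21.5, "yield": 88.0},
    {"batch": "B", "temp": None, "yield": 91.2},
    {"batch": "C", "temp": None, "yield": None},
    {"batch": "D", "temp": 22.1, "yield": 90.4},
    {"batch": "E", "temp": None, "yield": 89.9},
]
columns = ["batch", "temp", "yield"]

patterns = missing_data_pattern(rows, columns)

# List patterns from most to least frequent, labelled by column name.
for pattern, count in patterns.most_common():
    print(count, dict(zip(columns, pattern)))
```

In this made-up table, the summary shows that `temp` is missing more often than `yield`, and that `yield` is only missing when `temp` is missing too; that is the sort of recurring pattern worth investigating before modelling.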

Figure 3: Data Management Functions

Figure 4 indicates the analytical functional groups (or application areas) of JMP. These cover a wide variety of dynamic visualisation and analysis capabilities that support the goals of research, design, discovery, development, engineering, production, sales, marketing and support users across a variety of industries. The Data Filter allows us to zoom in and out to investigate insights for train-of-thought analysis, and the Profiler and Simulator help us understand and interpret the knowledge captured by our analyses.

Figure 4: Visualisation and Analysis Functions

Figure 5 indicates various reporting aids that help communicate the knowledge gained and the best decisions to other stakeholders. JMP provides interactive reports that facilitate what-if (train-of-thought) visualisation and querying of models to aid discussion in real time during meetings, which is important for gaining buy-in on the best decisions to be made. Additionally, the information in JMP reports can be output to Office applications to facilitate presentations and printed reports. Other reporting options include Flash, HTML, EPS and PDF.

Figure 5: Reporting and Communication Aids

This framework helps new users quickly become productive in analysing their data, extracting knowledge from it and making better-informed decisions using JMP. Depending upon their goals, new users can get started with JMP by gaining familiarity with the File and Tables menus in addition to one or more of the analytical functional areas indicated in Figure 4.

As an alternative to interactive use of JMP, analytical applications can be developed and then deployed to end users with the goal of providing access to specific data and analysis methods to simplify and speed their analysis.

6 Comments

  1. Ibrahim Abubakar
    Posted October 4, 2011 at 10:37 am | Permalink

    This is a nice approach.

  2. Kun Wang
    Posted October 6, 2011 at 12:58 pm | Permalink

    Malcolm, thanks for your blog.
    Can you send me more information on how JMP reports can be output to Office applications to facilitate reports? If I want to use graphs from a JMP report in my MS Word template, is there a quick and simple way to do that?
    Thanks in advance.

    • Malcolm Moore Malcolm
      Posted October 13, 2011 at 4:27 am | Permalink

      Save your JMP reports as Word files using File > Save As. This will output all graphs (and tables) within your JMP report to MS Word.

  3. Martin Owen
    Posted November 1, 2011 at 1:01 pm | Permalink

    Malcolm, I liked this a lot. It made me think it would be useful to see something more detailed around the data modelling: selection of design, selection of analysis method, augmentation, etc.
