JMP 13 Preview: The power of the new Virtual Join

Have you ever wanted to include data in an analysis without having to subset it from different tables and put it all together in a new table? Have you wanted to “see” how your data will come together before committing to joining many tables to make sure you get it right the first time? Soon you can, with Virtual Join!

The new Virtual Join feature in JMP 13 enables you to link a main table with multiple auxiliary tables through common keys. Linking is done through two column properties, Link ID and Link Reference, and the process is largely automated. Once you have the properties set up, you have access to all the columns from the auxiliary tables.

The linked tables interact together as one all-inclusive data table, with all the columns from all the tables linked in virtually, without actually copying the data into the main table.
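
The idea of looking up linked columns on demand, rather than copying them, can be sketched outside JMP. The following is a minimal Python illustration, not JMP's implementation: the tables, the column names and the `VirtualJoin` class are all hypothetical, and the roles of Link ID and Link Reference are only mimicked by a shared key column.

```python
# Hypothetical data; none of these tables or column names come from JMP.
rentals = [  # main table: "movie_id" plays the role of a Link Reference column
    {"customer": "Ana", "movie_id": 2},
    {"customer": "Raj", "movie_id": 1},
    {"customer": "Lee", "movie_id": 2},
]
movies = [  # auxiliary table: "movie_id" plays the role of a Link ID column
    {"movie_id": 1, "title": "Alpha", "genre": "Drama"},
    {"movie_id": 2, "title": "Beta", "genre": "Comedy"},
]


class VirtualJoin:
    """Read-only view that resolves auxiliary columns through a key."""

    def __init__(self, main, aux, key):
        self.main = main
        self.key = key
        # Index the auxiliary table once by its unique key.
        self.index = {row[key]: row for row in aux}

    def value(self, row_number, column):
        row = self.main[row_number]
        if column in row:  # column lives in the main table
            return row[column]
        linked = self.index.get(row[self.key])  # follow the link
        return None if linked is None else linked.get(column)


view = VirtualJoin(rentals, movies, "movie_id")
titles = [view.value(i, "title") for i in range(len(rentals))]
print(titles)  # ['Beta', 'Alpha', 'Beta']
```

The main table is never modified, so the movie details are stored once no matter how many rental rows refer to them.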

Here’s a schema showing the relationships among virtual join tables from JMP tester Mandy Chambers’ Discovery Summit poster titled “Fast, Powerful, Efficient: Joining Without Joining to Explore Summer Games Data with JMP”:

Principal developer Chung-Wei Ng explains how the idea for Virtual Join originally came about. Many JMP versions ago, she was thinking about how to link multiple tables together and looking for a simple way to show related data from different tables. JMP already had functions like summary, linked subsets and by-groups, but she thought it would be more useful to generalize them.

She started exploring a different paradigm and showed some early thoughts to John Sall, the chief architect of JMP. At that time, he was working on the Choice platform, which can use multiple tables. Their conversation led to a suggestion by John: It would be nice if JMP could automatically link those tables together, so the columns are accessible through the same interface.

The idea languished until last year, when Chung-Wei got a set of related data tables on movie rentals from fellow developer Eric Hill: “When I was playing with the data tables, the idea suddenly struck me ‘Wouldn’t it be cool if I could virtually join those tables together, so all those related data from the different tables are accessible as if they were all in the same data table?’” Seeing how the data are related from those tables gave her an idea of how it should all work.

Join is one of the most-used data manipulation tools in JMP. It can be very memory-intensive, both during the join itself and in the resulting table. Most of the time, you join data tables to bring related data into one table through a set of common keys, so the resulting table can be used in analysis platforms. Many of those joins can now be replaced with a virtual join.

This saves disk space, memory and time. With Virtual Join, you can keep your data in a simpler form. Related data don’t have to be duplicated in all the tables that may reference them.

Colleagues and early adopters like the functionality and ease of use. Longtime JMP user Cy Wegman, President of SY64, LLC, says: “Table management and manipulation has always been a challenge. The Virtual Join will significantly improve my productivity by decreasing errors and making table management so much cleaner.”

Chung-Wei is very happy to work on something that will be so useful and says, “I know the users will find use for it in ways that I never even dream of now. I just love working on the data table. The Data Filter lets you zoom in on the table; now Virtual Join lets you zoom out.”

To learn more about what's coming next month in JMP 13, visit the preview site. There, you can sign up to watch the live stream of John Sall's speech launching JMP 13 as well as view videos and see a list of new features.


JMP 13 Preview: Association analysis for analyzing sparse categorical data

SVD Plot in JMP 13

Association analysis is a powerful way to look at categorical data that has a lot of sparsity, and it’s used not only by market researchers but also reliability engineers.

Do you find it fascinating to look at the items in other people’s grocery carts at the store? I do. In mine, you will often see orange juice, whole milk, Fuji apples and blackberries. And if I’m buying grape tomatoes, it’s likely I’m also buying parmesan cheese and basil (to make pasta with pesto for dinner that night).

Wouldn’t it be fun to analyze this kind of data – what people tend to buy together? For the first time, you can do this in JMP.

Sometimes called market basket analysis, it was originally used for analyzing scanner data from cashiers. In JMP 13, it’s called association analysis, and it was a highly requested feature.

How would you use the results of association analysis? It can help you to assist customers in finding what they may need and to sell more through effective placement of items.

“If you see that someone buys A, B and C, then the results can tell you that you should make it easier for that person to find X, Y and Z,” explains Melinda Thielbar, the research statistician developer who worked on this new capability in JMP.
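
Rules of the form "buys A, so make X easy to find" are usually scored with support, confidence and lift. The sketch below computes those three standard measures in Python on a handful of made-up baskets. The data and function names are hypothetical, and this is not a description of how the JMP platform computes its results.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"tomatoes", "parmesan", "basil"},
    {"tomatoes", "parmesan"},
    {"milk", "apples"},
    {"tomatoes", "basil", "milk"},
    {"milk", "parmesan"},
]


def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Support, confidence and lift for the rule antecedent -> consequent.

    Assumes both sides occur in at least one basket; otherwise the
    divisions below would fail.
    """
    both = support(set(antecedent) | set(consequent), baskets)
    confidence = both / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return both, confidence, lift


s, c, l = rule_metrics({"tomatoes"}, {"parmesan"}, baskets)
print(round(s, 3), round(c, 3), round(l, 3))  # 0.4 0.667 1.111
```

A lift above 1 means the two items show up together more often than they would if they were bought independently, which is the signal a store would use for item placement.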

Association analysis is a standard technique used in market research. But it is increasingly being used by reliability engineers to find out such things as what parts in a machine tend to break down together.

“Association analysis is a really good way to look at categorical data, especially if it has a lot of sparsity,” she says. “You may have a machine with thousands of parts, and there may be a few clusters of parts that tend to break down at the same time. That’s a difficult pattern to discover without a statistical model.”

The analysis produces an SVD plot, which Melinda says she has not seen elsewhere. “It allows you to sort through all of the items that appear together as if you were doing a principal components analysis on a very large data set that describes which items appear together,” she notes.

There’s been a lot of excitement about the entire consumer research platform from early adopters, with particular interest in MDS (multidimensional scaling), association analysis and choice analysis.

“I’d like to see our customers using consumer research more, and in more areas,” Melinda says.

One of the best parts of working on this area of JMP is that she gets to collaborate with a lot of other developers. Melinda will be at Discovery Summit next month presenting a poster on new features in choice modeling in JMP and JMP Pro, as well as talking with customers informally about how they use the software to solve their consumer research problems.

For more information on what's coming in JMP 13 and JMP Pro 13, stop by our preview site.

Association analysis in JMP 13

Can you guess which of the Topics above closely resembles Melinda Thielbar’s own grocery cart when her husband is out of town? I’m not telling!


JMP 13 Preview: Query Builder wrangles data better than ever

Query Builder joins data tables

Join up to 64 data tables with Query Builder in the latest version of JMP.

You may have data in a relational database that you need to bring into JMP. Your data may be spread across multiple tables. And you might want only part of it rather than all. Query Builder simplifies these types of data access tasks, and in the latest version of JMP, it is easier and more powerful than ever.

Now you can use Query Builder with more than external databases; you can use it with JMP data tables. That means you can perform multi-table queries and joins of data tables. You can easily join up to 64 tables!

Feedback from JMP users about Query Builder drove many of the other changes to this part of JMP. For instance:

  • You can bring in data from external sources and join them in-memory.
  • Windows users now get a progress bar when retrieving data.
  • New filters for categorical variables reduce wait time.
  • You can sample the first N rows of data.
  • Conditional filters let you select just the data you need.

Query Builder in JMP 13 lets you filter

When you configure your query in Query Builder, you can add filters to customize your data access.

“There are a whole set of performance enhancements that make Query Builder more frugal with resources,” says JMP developer Eric Hill, who focuses on Query Builder.

The response from customers in the JMP 13 Early Adopter program has been very positive. “They find Query Builder easier to use. And they particularly liked that they can consolidate data preparation steps and deliver a query to someone else,” Eric says.

Eric enjoys talking with users about their data access needs at conferences and customer visits. He’ll be doing that next month at Discovery Summit, where he is presenting both a tutorial and poster. There are still a few spots open for his special tutorial on Query Builder, titled “Wrangling All Your Data with Query Builder in JMP 13.”

“Being able to bring SQL into JMP is exciting. Query Builder gives our users an easier way to get their data so they can analyze it in Graph Builder or whatever platform they choose,” Eric says.

Eric will be blogging about how to use Query Builder, so look for an upcoming post on 13 tips about Query Builder in JMP 13. Until then, stop by our preview site, which has links to videos about JMP 13 and JMP Pro 13.


JMP 13 Preview: Interactive HTML comes to Graph Builder

With interactive HTML reports, you can easily share the results of your analysis with a broader community while retaining the ability to interact with graphs and data. All you need is a web browser.

Interactive HTML dashboard with Graph Builder elements

Have you been asking for interactive HTML for Graph Builder? Your wait is almost over. And guess what? You can also make interactive HTML dashboards that look like this one!

Soon after interactive HTML became available for many JMP reports in JMP 11, customers began asking for interactive HTML Bubble Plots and Profilers. Once those two were added in the previous version of JMP, the development team started hearing one request repeatedly: “When are you going to do Graph Builder?”

And now they have!

“Graph Builder is the easiest way to create a graph, and it’s the front door to JMP for many customers. That’s why it was important to bring interactive HTML to Graph Builder,” says JMP developer Heman Robinson, who leads the team that works on interactive HTML in JMP.

It was the top-voted feature request in the JMP User Community, and it’s here in JMP 13.

Let’s say you make a geographic map in Graph Builder and hit the “Done” button. When you save that map as interactive HTML, most of the Graph Builder features remain interactive in the report.

“We made the most of the work on Graph Builder and concentrated on the features that would benefit the most customers,” Heman says.

Interactive HTML is available in JMP 13 for the most popular features of these Graph Builder elements:

  • Points
  • Smoothers
  • Ellipses
  • Lines
  • Bar charts
  • Area charts
  • Box plots
  • Histograms
  • Heatmaps
  • Mosaic plots
  • Caption Boxes
  • Maps

You’ll hear about how to use these new capabilities from Heman and fellow developers John Powell and Josh Markwordt here at the JMP Blog.

You’ll also hear about interactive HTML for dashboards, another popular customer request. In JMP 13, you can create custom layouts for dashboards with many more kinds of graphs, as in the image at the top of this post.

It’s fun and collaborative

Customers have been excited about interactive HTML output since it became available in JMP, and the development team is equally enthusiastic about working on it. “It’s cutting-edge technology and lets us make things interactive and accessible everywhere,” says Heman.

Josh enjoys “the unique challenge of taking a desktop application like JMP and adapting it to the web.”

And John likes that they get to work with other developers, helping them and learning from them. “We get to be involved in the whole product and pick up knowledge along the way,” he says.

All three developers will be at Discovery Summit US next month and invite customers to come talk to them about interactive HTML in JMP. And everyone can explore their interactive HTML examples at our website right now.

For more information on what's coming in JMP 13 and JMP Pro 13, stop by our preview site.


JMP 13 Preview: Customer-driven Graph Builder improvements

Treemap in Graph Builder

Treemap in Graph Builder is more flexible in JMP 13. That's one of many improvements to Graph Builder that were driven by customer requests.

Graph Builder is one of the most-used parts of JMP. As a result, the drag-and-drop graph creation platform receives a large number of customer requests for features and changes. The next version of JMP includes many Graph Builder improvements that directly resulted from these requests.

One big example is the set of changes to Treemap in Graph Builder. Treemaps are a way of visualizing multi-level categorical data in a rectangular layout. They’re useful for seeing patterns among groups that have many levels.

Treemap in Graph Builder now has better labeling and font options so that complex treemaps are much easier to read. For the smallest boxes in a treemap, you can now skip labels altogether when it’s important to just see the boxes themselves.

“This means that when you want to communicate results with Treemap, it’s easy to make it look great for a presentation,” says JMP R&D Director Xan Gregg, data visualization expert and the creator of Graph Builder.

Treemap nesting, or the ability to put boxes within boxes, also has changed.

“While two nesting levels are enough for most purposes, sometimes you need more. Now, there’s no longer a limit to the number of levels you can use,” Xan explains.

Apart from Treemap, Xan highlighted a handful of other Graph Builder improvements that users will notice in JMP 13:

  • You can create a parallel plot within Graph Builder.
  • You can specify alpha level for confidence intervals.
  • You can put a legend inside a graph. (“This makes looking at a graph more direct,” Xan says.)
  • You can use a custom number format, like degrees or multipliers.
  • You can use new transform functions, including moving average, quantile, lag and random sampling.

And that list is just a small subset of all the Graph Builder improvements!

You can look forward to posts by Xan about Graph Builder in JMP 13 in the coming months. Meanwhile, you can sign up to attend his tutorial at Discovery Summit in September, titled “Creating Effective Visualizations Using Graph Builder.”

For more information on what's coming in JMP 13 and JMP Pro 13, visit the preview site.


What’s coming in JMP 13? More customer-driven features

JMP 13 reports on a monitor

You'll find customer-driven features and enhancements in JMP 13, which will be launched in September at Discovery Summit.

It’s that time again! A new version of JMP will be released soon – next month, in fact, on the day that JMP creator John Sall gives his keynote speech at Discovery Summit.

What’s in this new version? There’s a lot to get excited about. That’s why the JMP Blog is featuring a series of posts based on interviews with the folks who developed the specific features. The series begins today with the big picture of this new release, courtesy of Shannon Conners, JMP R&D Director (who blogs about analyzing fitness and food data), and Dan Valente, JMP Product Manager (who has written about print-ready graphics, among other topics).

“We always try to let customers guide the features,” says Shannon, who manages the software testing and release process.

So, you can expect customer-driven features – brand-new functionality as well as some nice enhancements to existing features, she says.

“JMP 13 is all about making JMP an end-to-end solution for your data access, preparation, visualization, modeling and communication needs,” says Dan.

And as for JMP Pro, the new version helps you organize your models and selectively deploy them – converting JMP models to score code in a variety of standard programming languages.

“Throughout the entire data analysis workflow, JMP 13 reduces the bottlenecks and keeps you in the flow, so you can discover more in your data,” Dan says.

Their favorites

Among Shannon’s favorite new features is Virtual Join, which is helpful for joining up multiple data types. “It’s great because you don’t always want to permanently join data sets,” she says.

Another favorite of Shannon’s is Text Explorer, which lets you analyze unstructured text data. She also loves the enhancements to Treemap in Graph Builder and the addition of Parallel Plot to Graph Builder because they are useful in her own work.

One of Dan’s favorite features is the Dashboard Builder: “With a new set of templates, the ability to easily drag and drop JMP graphs onto a prebuilt grid, add your company logo or headline, and easily access the selection filter functionality in a single click it makes the process of organizing your results into a single window for communication with others much easier.”

Dan also likes the Query Builder for JMP Tables. Built with the same interactive query, join and filter interface as the builder for database and SAS tables, the Query Builder for JMP Tables lets you join multiple data sets, and is also an excellent tool for prototyping and generating SQL that can be used outside of JMP.

Learn more about JMP 13 and JMP Pro 13 on our website.

Your feedback needed

Shannon notes that Parallel Plot has been its own platform in JMP for years but will be a lot more apparent to users now as an option in Graph Builder.

“Graph Builder is the platform that gets the most usage by far, according to technical support staff and our early adopters,” she says.

That’s why Parallel Plot and Treemap in Graph Builder have more options than they do in their original platforms. And that’s the direction the software is going: The most flexible version of a graph is in Graph Builder, she says.

If you find that you continue to use the older Treemap and Parallel Plot platforms, Shannon wants to know.

“We’re always interested to hear when and why users turn to these older graphing platforms rather than utilizing Graph Builder,” she says.

She’s also focused on continually improving the quality of JMP. So after you have a chance to try JMP 13, let us know how existing features could be improved. Share your feedback with JMP technical support and in discussions in the JMP User Community. Both Shannon and Dan will be on the lookout for your comments.

What’s next

In the coming weeks, you’ll hear about Virtual Join, Text Explorer, Parallel Plot, Query Builder and lots more from the very developers who worked on them. Up next in our previews: a post about Graph Builder improvements with developer and data visualization expert Xan Gregg.


My top 3 in JMP

A rising sophomore in college, I am nearing the end of my summer internship with the JMP marketing team. While I’ve spent previous summers doing more technical work, I was interested in learning the ways that technical knowledge could help to solve business problems. I got the chance to complete a wide range of technical and non-technical projects, but some of my favorite tasks involved using JMP to analyze marketing data. As I reflected on my summer and what I've learned about JMP, I put together a list of my top three favorite things about JMP.

1. You can JMP right into JSL

When I started my first project with JMP, I had experience only in more general programming languages such as Java and JavaScript. I expected the learning curve for JSL (JMP Scripting Language) to be similar to that of other languages I had learned. However, I found that within an hour or so, I was already manipulating and combining bits of JSL code. Instead of having to constantly search online for JSL syntax and methods, I could let JMP show me how to write the JSL I was looking for. Being able to perform an action or create a graph, and then ask JMP to save my script to a new window or file, allowed me to understand the intricacies of JSL and create my own code in no time. The JMP website has a good guide that outlines more specifically how JMP can “write your scripts.”


JMP allows you to easily access the script for Distributions and graphs.

2. Cleaning data doesn’t need to be tedious

I watched Dick De Veaux, Professor of Statistics at Williams College, in an Analytically Speaking webcast and later read a quote by him that stuck with me: “Most of the time on a data project is spent cleaning and preparing the data.” Before my internship at JMP, I had never really worked with huge data sets, and the idea of cleaning and preparing data seemed less appealing than working with the graphs and analyses. After I was introduced to the Recode tool in JMP, I changed my mind about data cleanup. I had imagined having to rewrite each row of data, but JMP did a ton of that work for me, turning all the data to uppercase, lowercase, or titlecase, and grouping similar values. Of course, I still had to make sure the data matched up correctly, but JMP even made this part easier. Instead of finding myself overwhelmed by all the data, I actually enjoyed recoding my data, and my graphs turned out a lot better and simpler to interpret.

It's simple to quickly combine similar values using the Recode tool.


3. You can make a data table into your data table

I quickly learned that you don’t always get data tables in a form that’s ideal for analysis. Luckily, JMP has many built-in tools to manipulate your table or create a new one based on the original. For example, I wanted to compare how many people replied yes to a series of questions, but each question was in its own column, so I couldn’t create one clean Distribution. Because I had access to JMP 13, I used the new “Expanded Modeling Types” capability in JMP to create a “Multiple Response” column that I could analyze all at once. When I wanted to compare responses to different questions that had the same answer choices, I used the “Stack” table tool. Creating a summary table of different columns was also useful for seeing how basic statistics like mean and standard deviation matched up. Those tools only begin to touch on the different data manipulation tools that are available, and there are even more ways to restructure or reshape your data.

Stack tool in JMP

Turn multiple columns into one column using the Stack tool.
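
For readers who think in code, the reshaping that Stack performs can be sketched in a few lines of Python. This is only an illustration of the wide-to-long idea, with made-up survey data and a hypothetical `stack` helper; it is not how JMP implements the tool.

```python
# Hypothetical survey responses, one column per question (wide form).
wide = [
    {"respondent": 1, "Q1": "Yes", "Q2": "No"},
    {"respondent": 2, "Q1": "No", "Q2": "No"},
]


def stack(rows, id_column, value_columns, label="Question", data="Answer"):
    """Turn several value columns into one label column and one data column."""
    return [
        {id_column: row[id_column], label: column, data: row[column]}
        for row in rows
        for column in value_columns
    ]


long_rows = stack(wide, "respondent", ["Q1", "Q2"])
for row in long_rows:
    print(row)
```

Each original row becomes one row per stacked column, so two respondents and two questions give four rows that can be summarized together.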

I am still exploring the capabilities of JMP, but as you can see, even a beginner can do a lot with JMP! My summer with JMP has been an incredible experience, and I am looking forward to expanding my JMP skills in the future.


13 helpful (and lucky) JMP scripting tips

Syntax got your tongue? I've compiled 13 JMP Scripting Language tips that I've found handy while learning JSL syntax:

1. Concatenating strings (see Craige Hales' post on this)

places = {"world", "planet", "globe", "Earth"};
statement = "Hello " || places[2] || "!";

The above code creates the string "Hello planet!"

2. Referring to table column names that have special characters

Let's say your table, dt, has a column named "Prob(Pred)" (probability of prediction). If you reference it as

dt:Prob(Pred)

it won't work. You need to use quotes:

dt:"Prob(Pred)"

3. Referencing a particular file in an add-in (see PMroz's original solution)

To work with a file contained within your jmpaddin, you should refer to your file as:

"$ADDIN_HOME(com.mycompany.test_addin)\myfile.jmp"

4. Changing Text Box font color, size and type

myText = Text Box(statement, <<Font Color("Blue"), <<JustifyText("center"), <<Set Wrap(80), <<Set Font("Segoe UI", 15));


(To center the text in the window, you'll need to add Spacer Boxes on either side of the text in a H List Box -- see tip #12.)

5. Saving a graph/plot in a report as an image (sample data (iris) also seen on discriminant analysis report page)

First, use the point-and-click interface and generate the model you want. After this, click the little red triangle, go to Script > Save Script to Script Window. Don't close your model window yet.

In the new script window, assign the model a name; for example, I can name my discriminant plot discrimGraph:

discrimGraph = Discriminant([...generated code...]);

Make this generated model a report:

report = discrimGraph << report;

Go back to your model window, right-click on the graph or figure you want to save, and go to Edit > Show Tree Structure. This will tell you the name of the object you want to grab (often, the HelpKey will guide you to the right box). Here, I want to grab a Frame Box:

reportBox = report[Frame Box(1)];

Finally, if you want to convert reportBox into an image that you can save:

modelImage = reportBox << getPicture();

You now have a variable name attached to the specific graph or plot that you've generated. If you want to save this as an image to your computer:

modelImage << Save Image("C:\Documents\JMP\myImage.png", png);

Often, PNG images look better, but are larger files.



Exploring 30 years of car colors

Montgomery County, Maryland, publishes all traffic violations since 2013, now totaling more than 780,000 incidents. Besides the location and details of the violations, the table also contains information about the vehicles involved, such as make, model, year and color. It’s car colors that I want to explore here.

Even though the violations go back only to 2013, the model year of the cars goes back much further. It’s hard to say how far back it goes since the year data is a bit messy. Some years are obviously missing (0 and 9999) or miscoded (95 and 1013), and others are at least questionable (1930). I want to look at trends over time, so I’ll ignore low-data years anyway.

The color data, on the other hand, appears remarkably clean. No misspelled or strange colors. There could still be quality issues since all observers may not have the same interpretation of things like blue versus dark blue or tan versus cream. Here are the colors of all the citations.

[Figure: car colors across all citations]

To look at color trends over time, I made a summary table of counts by year and color. Then I removed very sparse years, keeping 1970 – 2016. Here’s an area chart of all that data.
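
The summary step is simple enough to sketch outside JMP. The Python below counts rows by year and color and then keeps the 1970 – 2016 range mentioned above. The records are invented for illustration and are not taken from the Montgomery County data.

```python
from collections import Counter

# Hypothetical (model year, color) pairs standing in for the violations table.
records = [
    (2010, "WHITE"),
    (2010, "BLACK"),
    (2010, "WHITE"),
    (2011, "RED"),
    (1930, "BLACK"),  # a questionable early year
    (9999, "GRAY"),   # a missing-value code
]

# Summary table: number of rows for each (year, color) combination.
counts = Counter(records)

# Keep only the years with enough data to show a trend.
kept = {key: n for key, n in counts.items() if 1970 <= key[0] <= 2016}

for (year, color), n in sorted(kept.items()):
    print(year, color, n)
```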

[Figure: area chart of counts by year and color, 1970 – 2016]

It’s starting to take shape but has some major flaws:

  • The early data is still too noisy to represent trends.
  • It’s confusing that the colors of the areas don’t match the color names, resulting in a Stroop Effect in the legend.
  • The colors should have a meaningful order, at least grouping like colors together.

To address those issues:

  • I filtered the data to start at 1985.
  • I used Recode to standardize the color names to match HTML color names. Then I used those RGB values to make a Value Colors column property.
  • I made a Value Order column property to customize the order.

[Figure: area chart from 1985 on, with areas colored and ordered by car color]

Much better, and we don’t even need the legend anymore.

Though the stacked area chart in general can obscure trends of the internal areas (those without a straight baseline), it does let us see the big trends well enough. We can see cars getting less "colorful" over time, with white, black and shades of gray generally increasing. And we can see that green has had the most dramatic changes. In the 1990s, it was the most popular color. But before and after, it’s one of the least common. Was that the forest green fad? Or have green cars from the 1990s held up better than other colors?

I put green in the middle so the other colors wouldn’t have the bulge between them and the nearest straight edge (top or bottom). In the spirit of less is more, we can combine some of the similar colors together.

[Figure: area chart with similar colors combined]

Now we have less detail but more accessible information on the general trends.

It’s interesting to compare the stacked areas to overlaid lines.

[Figure: the same data as overlaid lines]

The dashed line is for white. We can see individual trends better, but we’ve lost the part-to-whole connotation and the ability to mentally combine adjacent colors, such as for the shades of gray.

It’s worth reminding ourselves what we're looking at. We’re not looking at a random sample of all cars sold or even all cars on the road. We’re looking only at cars involved in traffic violations in one county over a few years. Some cars are surely represented multiple times.

I can’t decide if this is the great weakness or the great strength of data science: We analyze the data that we have instead of the data that we need. It’s a weakness when we blindly extrapolate, but it’s a strength when we can characterize the unknowns enough to extract some information from the knowns.

For this data, the two main unknowns are regional differences in car colors and correlations between car color and incurring a traffic violation (deservedly or not). If we can eliminate those factors for one year, we can have more confidence in making generalizations.

DuPont publishes a survey on car color popularity, though I've only been able to find a few glimpses of the data. From those, we can see that color preferences differ around the world, so I'll only compare the Maryland traffic violations data to North America color sales.

Here's a slope chart comparing percentages from our data with the survey results, again using a dashed line for white.

[Figure: slope chart comparing the violations data with the survey results]

It looks like white, red and brown cars are showing up less often than expected in the traffic violations data. We can confirm that the differences are significant by using the Test Probabilities command in the Distribution platform in JMP. Doing that gives a p-value of practically 0 for seeing percentages this different in our population of 32,000 cars from 2009. On the plus side, at least the other colors have the same rank order in each sample, so perhaps the differences are due to secondary factors.
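
One common way to test observed counts against hypothesized proportions is a Pearson chi-square goodness-of-fit test. The Python sketch below shows that calculation on invented counts for three color groups. The numbers are hypothetical, not the Maryland or DuPont figures, and it is an assumption that this matches what JMP reports. With three categories there are two degrees of freedom, and for that case the p-value has the closed form exp(-chi2 / 2).

```python
import math

# Hypothetical counts for three color groups (not the real data).
observed = [500, 300, 200]
# Hypothetical proportions from a reference survey; they must sum to 1.
hypothesized = [0.40, 0.35, 0.25]

n = sum(observed)
expected = [p * n for p in hypothesized]

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With 3 categories there are 2 degrees of freedom, and the upper-tail
# probability of a chi-square with 2 df is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(round(chi2, 2))  # 42.14
print(p_value < 1e-6)  # True
```

With tens of thousands of rows, even modest differences in percentages produce a tiny p-value, which is why rank order and effect size are worth looking at alongside it.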

For that reason and because the most likely explanations for the differences are not related to time, I'm still hopeful that the rough time trends shown in the above charts are meaningful.

Any theories on why the white, red and brown cars show up so much less in the traffic violations than in the DuPont survey?


Exploring text and other data with Heath Rushing

Heath Rushing is someone I count myself very fortunate to know, first as a colleague at SAS and now as co-founder of Adsurgo, a successful consultancy.

Over years of JMP use, Heath has enthusiastically taught classes using JMP, written papers and the book Design and Analysis of Experiments by Douglas Montgomery: A Supplement Using JMP, and given us valuable feedback to make JMP better.

JMP 13 will be released in September, and we are grateful to Heath for his significant input on the new Text Explorer platform. A few years ago, some of our customers wanted to do some basic text analysis, and Heath leveraged the JMP-R integration to very good effect. One of these applications is highlighted in his top-rated Discovery Summit presentation last year: "Harness the Power of JMP: Big Data and Social Media for Competitor Analytics." And he will be presenting “Mind the Gap: JMP on the Text Explorer Express” using new features in JMP 13 at JMP Discovery Summit next month.

We are pleased to feature Heath on August’s Analytically Speaking. We hope you will join us to hear about successful text analytics projects and an easier workflow for basic text analytics, and to see a preview of some of this new capability in JMP 13.
