Graphs: Comparing R, Excel, Tableau, SPSS, Matlab, JS, Python, and SAS

Are you a visualization & graphing expert? Can you identify which tool (R, Excel, Tableau, SPSS, Matlab, JS, Python, or SAS) was used to create each of these graphs? No cheating!

I recently read Tim Matteson's blog where he presented 18 graphs, and had his readers try to guess which software was used to create each of them. I thought it was an interesting exercise, but I was a little disappointed in the graphs. My buddy Paul Kent said I should create my own new/improved version of each graph, and I thought that sounded like a splendid idea! Be sure to click the link above to see the original versions, so you can better appreciate the improvements.

Can you determine which software I used to create each of my improved versions? (leave your guesses in the comments section)

Chart 1

The biggest problem in the original graph was that the colors and order of the bar segments didn't make sense - it seems like they should run bad-to-good, but the original graph had them in alphabetical order. Also, the Xnn labels along the left-side axis were cluttered and difficult to read. In my version I spaced the labels out more, and left-aligned them so the 'X's line up, which makes them easier to read.

likert
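To make the ordering idea concrete, here is a small tool-agnostic sketch in Python (purely an illustration, and no hint about which software made the chart). The category names and counts are made up for the example; the real survey's labels may differ.

```python
# Hypothetical response categories, listed bad-to-good.
# These names are an assumption, not taken from the original chart.
BAD_TO_GOOD = [
    "Strongly disagree",
    "Disagree",
    "Neutral",
    "Agree",
    "Strongly agree",
]


def order_segments(counts):
    """Return (category, count) pairs in bad-to-good order, not alphabetical."""
    rank = {name: i for i, name in enumerate(BAD_TO_GOOD)}
    return sorted(counts.items(), key=lambda item: rank[item[0]])


# Made-up counts for one survey question.
x01 = {
    "Agree": 40,
    "Disagree": 15,
    "Neutral": 20,
    "Strongly agree": 18,
    "Strongly disagree": 7,
}

for name, count in order_segments(x01):
    print(f"{name:<18} {count}")
```

Sorting by an explicit rank, rather than by the label text, is what keeps 'Agree' from jumping ahead of 'Disagree' in the stacked bar.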

Chart 2

In the original chart, having a colored area behind the questions made it look (at first glance) like those were bars, so I didn't color that area in my graph. I was a bit confused by the numbers to the left and right of the bars in the original, so in my version I color-coded these numbers so the user would know at-a-glance that the left number represents 'disagree' and the right number represents 'agree'. In survey data like this, I think it's important to be able to see whether over 50% of the respondents agree or disagree, so I added a reference line at 50%.

book_survey

Chart 3

In the original chart, they had the axis labels along both the left and bottom, showing each label twice. In my plot, I placed the labels in the diagonal boxes, allowing me to show each label only once (and also eliminating the sideways labels along the left axis). I used transparent plot markers, so you can see where markers are stacking. I also used a different color for the markers than for the axes and text, so the markers stand out more.

crime_rate
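Why do transparent markers reveal stacking? Under the usual "over" alpha compositing, n identical markers with opacity alpha on top of each other have a combined opacity of 1 - (1 - alpha)^n. Here is a quick Python sketch of that arithmetic (just an illustration, and no hint about which tool made the chart). The alpha of 0.3 is an arbitrary example value, not the setting I used.

```python
def stacked_opacity(alpha, n):
    """Combined opacity of n identical markers drawn on top of each other.

    Assumes standard 'over' compositing: each marker lets (1 - alpha)
    of what is underneath show through.
    """
    return 1.0 - (1.0 - alpha) ** n


# Example alpha value (an assumption for illustration).
for n in (1, 2, 5, 10):
    print(n, round(stacked_opacity(0.3, n), 3))
```

A single marker stays faint, while a pile of them approaches a solid color, so dense regions show up darker.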

Chart 4

The original chart used so many grid lines that I found it difficult to follow a line to the axis. I used years rather than months along the x-axis, because that seemed easier to understand for such a long time period (quick - how many years is 70 months!?! see what I mean!)

recession_job_losses
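For anyone who didn't want to do the 70-months arithmetic in their head, here is a tiny Python sketch of the conversion (tool-agnostic, and no hint about which software made the chart). The function names and the label format are just my assumptions for the example.

```python
def months_to_years_label(months):
    """Turn a month count into a 'years and months' label."""
    years, remainder = divmod(months, 12)
    if remainder == 0:
        return f"{years} yr"
    return f"{years} yr {remainder} mo"


def yearly_ticks(max_months):
    """Tick positions (in months) at whole-year intervals up to max_months."""
    return list(range(0, max_months + 1, 12))


print(months_to_years_label(70))  # 5 yr 10 mo
print(yearly_ticks(70))           # [0, 12, 24, 36, 48, 60]
```

The data can stay in months; only the tick positions and labels need to change.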

Chart 5

For this one, I left it pretty much as-is, except I placed the labels inside the longer bars (rather than outside), thereby making more room for the bars. I also explain what 'cola' is in the title, since it's an acronym most people probably aren't familiar with - wouldn't want people thinking this was a graph about soft drinks!

cola

Chart 6

For this chart, I didn't have the original data, so I decided to go with some data that was similar, but less dense. I'm not sure what the original chart was trying to show, but I can't imagine it was doing a very good job of it (looked like a cluttered mess of points & lines to me).

points_lines_3d

Chart 7

In the original chart, I don't think the circles showed up very well against the black background, so I didn't put any circles on my version (if you want to see a black map with circles, have a look at my map with animated circles). Be sure to click on this one to see the full-size map (to get the full effect)!

earth_at_night

Chart 8

The original chart was a simple scatter, with '+' markers and dark grid lines. In my version, I used transparent round markers - this way you can see when multiple markers are stacked in the same location. I also used light grid lines, so the grid doesn't compete with the markers for your attention, and I added some summary statistics in the top/left corner of the graph.

scatter
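If you're wondering what kind of summary statistics might go in a corner annotation, here is a minimal Python sketch (an illustration only, and no hint about which tool made the chart). Which statistics to show is my assumption for the example (count, means, and Pearson correlation), and the data values are made up.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def summary_text(xs, ys):
    """Build the multi-line text for a corner annotation."""
    return "\n".join([
        f"n = {len(xs)}",
        f"mean x = {mean(xs):.2f}",
        f"mean y = {mean(ys):.2f}",
        f"r = {pearson_r(xs, ys):.3f}",
    ])


# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
print(summary_text(xs, ys))
```

Keeping the text short matters here, since the annotation shares space with the markers.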

Chart 9

I'm not a big fan of using black backgrounds in a graph ... but if you're going to create any kind of graph, at least show the scales along the sides!

curves

Chart 10

This is another one I didn't have the exact data for, so I used some similar data. The biggest change I made was using transparent markers so you can see where multiple markers are stacked on top of each other. I also used a grid of reference lines from both axes, rather than just one axis.

random_scatter

Chart 11

Although the original chart didn't have any labeling, I suspect it was some of Fisher's classic iris data set, so I used some of that data in my chart. The first improvement I made was labeling the graph, so you quickly know what I'm plotting. I also added a picture of a labeled iris flower as an annotation, so you know what a petal and a sepal are.

iris_flower

Chart 12

I'm not a big fan of using 3d bars on a 3d map to show data, like they did in the original graph - the taller/front bars inevitably obscure some of the shorter/back bars, etc. Therefore in my graph I show how to plot data as markers on a 2d street map.

west_nile_chicago

Chart 13

In the original chart, I'm not sure exactly which year(s) of earthquake data they used, since there is no title or label. In my chart, I show all the major earthquakes for a 40+ year time period, and I also center my map on the Pacific Ocean (so it better shows the 'ring of fire'). I also use circles rather than filled dots, so it's easier to see almost-overlapping markers.

worldquakes_recent

Chart 14

In charts like this, I really don't like when people use a diverging color scheme (gradient shades of 2 colors, meeting in the middle) - those should be used when the scale goes from bad-to-good, etc. In this case, where the colors represent a simple "Percent of Trials", gradient shades of a single color should be used. They left-justified their Cancer Conditions, which placed the labels far from the chart and made it difficult to see which colored blocks went with which label - I right-justified them. Also, it was difficult to determine whether white boxes were light gradients or no-data. In my chart, I use a hatched pattern for no-data, to make the distinction more obvious.

And in the bottom (bar) chart portion, I was a bit confused by the numbers on top of the bars - after scrutinizing the graph for a bit, I found that the numbers represent the difference between the Actual and Expected time. Therefore I tried to make that more obvious in my bar chart.

cancer_intervention_cap
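The single-color gradient idea can be sketched in a few lines of Python (an illustration only, and no hint about which tool made the chart). The base color, the function names, and the "hatched" marker for missing values are all assumptions for this example, not the settings from my graph.

```python
WHITE = (255, 255, 255)
DARK_BLUE = (8, 48, 107)  # arbitrary base hue, chosen just for this sketch


def hex_color(rgb):
    """Format an (r, g, b) tuple as a '#rrggbb' string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def sequential_color(percent, base=DARK_BLUE):
    """Map a 0-100 percent value to a shade of one color.

    None means no data and gets a separate 'hatched' marker, so that an
    empty cell can't be mistaken for a very light (low-percent) cell.
    """
    if percent is None:
        return "hatched"
    t = max(0.0, min(100.0, percent)) / 100.0
    rgb = tuple(round(w + (b - w) * t) for w, b in zip(WHITE, base))
    return hex_color(rgb)


for value in (None, 0, 25, 50, 100):
    print(value, sequential_color(value))
```

Note that 0% still comes out white here, which is exactly why the no-data case needs its own pattern rather than a color.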

Chart 15

I don't really have access to any software to do solid-modeling, so instead of doing an animation of a solid-model of the earth (which looked pretty pitiful in the original blog), I am using a different animation. Click the image below to see it animated:

gapminder_cap

Chart 16

For this chart, my version is a little cleaner, and I've moved a few of the labels to new locations.

shoe_sales

Chart 17

The original chart had somewhat willy-nilly axis tick marks, and I wasn't real keen on using circles in the legend to coincide with the lines in the graph. I didn't have this exact data, so I chose some similar time-series data that let me show three overlaid lines. Notice that in addition to the color legend, I also added a label to the end of each line.

population_graph

Chart 18

For this one, I used slightly different colors, and slightly larger/bolder text, but aside from that it was already a great graph. :-)

catalyst_3d_surface

 

Ok - time to enter your guesses in the comments section! Which software(s) were used to create which graphs?

After making your guesses, you can scroll down to find the answer! ...

Note: I used SAS/GRAPH and SAS ODS Graphics to programmatically create my graphs. If you'd also like to see how similar graphs can be created using the SAS Visual Analytics point-and-click interface, check out Cindy Wang's blog post!

 

 

 

 

 

 

keep scrolling ...

 

 

 

 

 

 

 

cue dramatic music ...

 

 

 

 

 

 

Chart 1 - SAS

Chart 2 - SAS

Chart 3 - SAS

Chart 4 - SAS

Chart 5 - SAS

Chart 6 - SAS

Chart 7 - SAS

Chart 8 - SAS

Chart 9 - SAS

Chart 10 - SAS

Chart 11 - SAS

Chart 12 - SAS

Chart 13 - SAS

Chart 14 - SAS

Chart 15 - SAS

Chart 16 - SAS

Chart 17 - SAS

Chart 18 - SAS

Yep, I used SAS to create all 18 of these charts! And if you'd like to see the SAS code, I've set up an examples page.

 


About Author

Robert Allison

The Graph Guy!

Robert has worked at SAS for over 25 years, and is perhaps the foremost expert in creating custom graphs using SAS/GRAPH. His educational background is in Computer Science, and he holds a BS, MS, and PhD from NC State University. He is the author of several conference papers, has won a few graphic competitions, and has written a book (SAS/GRAPH: Beyond the Basics).

9 Comments

  1. Michelle Homes

    Impressive!

    I saw Tim's blog and thought... I wonder what Rob could do and wow... fabulous! Awesome work. Explains why there haven't been any blog posts from you this week.

    Super Christmas SAS graph gift!

    • Robert Allison

      I've looked at the statpedia website several times - there are some interesting graphs there, but I'm not a big fan of the way the data is graphed. In particular, the graphing software "adapts" the graph to fit the size and proportion of the screen, and many times that produces really bad graphs that don't communicate the data well.

      As an expert graph designer, I spend a lot of time getting the size/layout/proportions/labels/etc of my graphs "just so", and if a graphing software changed all that when the user resizes their screen, then the graph is no longer the way I intended it.

  2. Well done. I thought Chart 3 was R for sure, but as I scrolled I had a suspicion based on your previous work. While it's beneficial to be able to write in multiple languages, it's great how much you are able to do with SAS graphing. As always, your work is inspiring.
