Prius isn't the highest-mpg hybrid in 2017!

For many years, the Toyota Prius was the hybrid with the best mpg - but in 2017 that's changing! Let's examine the data ...

For analyses like this, I have found the fueleconomy.gov website to be a wonderful source of information. In recent years, they've even made all their data available in a csv file that's easy to download! I wrote some SAS code to import their csv, and then I was off to the races (figuratively) to create some graphs.
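For readers who want to try the import step themselves, here is a minimal sketch of how the csv could be read and ranked. The file name (vehicles.csv) and the column names (year, make, model, city08, highway08) are assumptions on my part and are not shown in the post, so check them against the file you actually download.

```sas
/* Sketch only: file name and column names are assumptions, not taken from the post */
proc import datafile="vehicles.csv"
            out=work.vehicles
            dbms=csv
            replace;
   getnames=yes;
   guessingrows=5000;   /* scan more rows before deciding column types */
run;

/* Keep one model year and sort by highway mpg, highest first */
proc sort data=work.vehicles(where=(year=2017)) out=work.mpg_2017;
   by descending highway08;
run;

/* List the top 10 vehicles */
proc print data=work.mpg_2017(obs=10) noobs;
   var make model city08 highway08;
run;
```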

In the 2016 data, the Prius Eco was the vehicle sold in the U.S. with the best mpg (53.3 hwy / 57.8 city).

In the 2017 data, the Prius Eco is still up there at the top of the midsize cars graph. And the Prius c is at the top of the compact cars graph.

gas_mpg_2017_midsize_cars

gas_mpg_2017_compact_cars

But there's a new vehicle coming onto the market in 2017 that gets better mpg than all the Prius models ... and surprisingly this vehicle is in the "large cars" category! It's the Hyundai Ioniq Blue, which should start selling sometime early this year (2017). Its EPA ratings are 59.4 mpg highway and 56.5 mpg city!

gas_mpg_2017_large_cars

Will the Ioniq hybrid really get that many mpg when people start driving it? - Time will tell! As you might recall, when the C-Max hybrid first came out in 2012, Ford claimed EPA mpg numbers that beat the Prius, but later had to revise them to lower-than-Prius levels.


And now for a bit of car-related fun ... Two of my co-workers, Mary & Thelma, have a dad who restores and customizes Corvettes. I've seen his work, and he puts out some truly spectacular cars. Here's a picture of an example they loaned me for my blog - it's a 1975 Corvette (customized to be a station wagon, using the Greenwood kit) with a 454 cubic inch 8-cylinder engine. 454 cubic inches is over 7.4 liters - by comparison the Prius gasoline engines are 1.5 & 1.8 liters. They estimate this Corvette gets about 15 mpg ... but who really counts mpg on a hotrod Corvette, eh?!?

corvette_wagon

 

And here's a car my buddy Richard has owned for 30 years. It's a 1973 Plymouth Valiant, with a slant-6 motor and automatic transmission. The engine was converted to the 2 bbl Super Six intake system. The transmission is from a Volare station wagon, and it has a 245 rear differential from a Highway Patrol Dodge Dart. He claims he averaged 37 mpg from NC to IL and back.

valiant_richard

 

So, what's the favorite vehicle you've owned, and what mpg did it get? :)

What to do in Orlando, during SAS Global Forum!

They say "a picture is worth 1000 words" - and I think it might be more like 2000 when it comes to planning out fun/interesting things to do in a new city! I'm going to the SAS Global Forum (#SASGF) conference in Orlando this year, and I was wondering where the conference hotel was in relation to the fun things to do in the area. So, of course I plotted it all out on a map, using SAS! You might enjoy seeing how I created the map, and you might find the map itself useful if you ever visit Orlando.

First, I compiled a list of things to do in Orlando (using web searches & asking my friends for recommendations). I estimated a latitude/longitude value for each thing, and read them into a SAS dataset using code like this ...

orlando_sign_code

I then created a scatter plot of the locations, and suppressed all the axes (note that the yellow marker is the main conference hotel) ...

orlando_attractions
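The two steps above (reading the estimated coordinates, then plotting them with the axes suppressed) might look roughly like the sketch below. The names and coordinates are placeholders for illustration, not the values used for the map, and PROC SGPLOT is just one way to draw the scatter; the post does not say which procedure was used.

```sas
/* Placeholder names and coordinates for illustration only */
data attractions;
   length name $40;
   infile datalines dlm=',' dsd;
   input name $ lat long;
   datalines;
Conference hotel (placeholder),28.40,-81.45
Attraction A (placeholder),28.42,-81.58
Attraction B (placeholder),28.47,-81.47
;
run;

/* Scatter plot of the locations with both axes hidden */
proc sgplot data=attractions noautolegend;
   scatter x=long y=lat / markerattrs=(symbol=circlefilled size=10);
   xaxis display=none;
   yaxis display=none;
run;
```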

Next, I wrote SAS code to download the 16 slippy map tiles (16 separate png images) at zoom level 12 for the area of interest, and I annotated the 16 tiles behind the scatter plot.

orlando_attractions
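To work out which tiles cover an area, you need the tile column and row for a given latitude/longitude and zoom level. Here is a small sketch using the standard slippy map (Web Mercator) tile formula; the coordinates are placeholders, and this is not necessarily how the download code in the post computed its 16 tiles.

```sas
/* Convert a latitude/longitude to slippy map tile numbers at a given zoom */
data tile;
   zoom = 12;
   lat  = 28.40;    /* placeholder latitude  */
   long = -81.45;   /* placeholder longitude */
   pi = constant('pi');
   lat_rad = lat * pi / 180;
   n = 2**zoom;     /* number of tiles along each edge at this zoom */
   tile_x = floor((long + 180) / 360 * n);
   tile_y = floor((1 - log(tan(lat_rad) + 1/cos(lat_rad)) / pi) / 2 * n);
   put tile_x= tile_y=;   /* write the result to the log */
run;
```

A 4x4 block of neighboring tile_x/tile_y values around the point of interest gives the 16 tiles mentioned above.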

And now for the cool part ... In my maps, I often like to add a little graphic element to let people quickly/easily know what the map represents. In my Las Vegas map, I created my own version of the famous 'Welcome to Las Vegas' sign. The 'Welcome to Orlando' sign isn't quite as fancy, but I thought it would be a nice touch to add to this map. I could have found an image of the sign, and annotated the image ... but where's the fun in that?!? So I decided to create my own version of the sign using simple annotated graphical shapes and text.

First, I created a blue box.

orlando_sign1

Then I added some blue bubbles with slightly smaller lighter-colored bubbles inside them.

orlando_sign2

Then, I overlaid a box using the light color, followed by a slightly smaller blue box.

orlando_sign3

Then a couple more blue bubbles, to fill in some space on the left and right sides.

orlando_sign4

Adding some text almost finished the sign! ... But it still needed a little something to set it off!

orlando_sign5

And that special something is the skyline. I got a skyline image, and modified it in Photoshop Elements, making the background transparent and the buildings the same color as my sign. I then used annotate to add the image seamlessly in the desired location. Wow - did you know SAS could do all that!?!

orlando_sign6

I think the welcome sign really helps spice up an otherwise boring street map! Click the map image below to see the interactive version, where the markers have mouse-over text (and they drill down to a Google search on each attraction).

orlando_attractions

What are your favorite things to do in Orlando? Did I leave out anything?!? Feel free to leave a comment ...

11 new SAS Press titles for 2017

Whether your resolution is to get SAS certified or to become a more advanced SAS programmer, we’ve got you covered with these new titles and upcoming SAS Press books, many of which will be making their bookshelf debut at SAS® Global Forum 2017 in Orlando, FL!

Want to be notified when a new book becomes available? Sign up to receive exclusive discount offers and information about new books delivered right to your inbox.

SAS Press Titles for 2017
1  -  SAS® ODS Graphics Designer by Example: A Visual Guide to Creating Graphs Interactively by Sanjay Matange and Jeanette Bottitta illustrates the features of the ODS Graphics Designer. The designer application lets you, the analyst, create graphs interactively so that you can focus on the analysis, and not on learning graph syntax. This book will take you step-by-step through the features of the designer, providing you with examples of graphs that are commonly used for the analysis of data in the health care, life sciences, and finance industries.

new-sas-press-titles-for-2017_02
2  -  Implementing CDISC Using SAS®: An End-to-End Guide, Second Edition by Chris Holland and Jack Shostak updates the first comprehensive book on applying clinical research data and metadata to the 2017 CDISC standards.

new-sas-press-titles-for-2017_03
3  -  An Introduction to SAS® Visual Analytics: How to Explore Numbers, Design Reports, and Gain Insight into Your Data by Tricia Aanderud, Rob Collum, and Ryan Kumpfmiller shows you how you can use SAS® Visual Analytics to transform your complex data into knowledge with meaningful, customized visualizations. This book gives you the ability to access, prepare, and present your data from anywhere and will help anyone learn to make sense of complex data, leading you to smarter, data-driven decisions without writing a single line of code – unless you want to! (1st Quarter 2017)

4  -  SAS® Viya™: The Python Perspective by Kevin D. Smith and Xiangxiang Meng explains how to use Python to drive SAS Viya by directly connecting to the back-end analytics engine: CAS. CAS (Cloud Analytic Services) is a fault-tolerant, high-performance analytic platform that can be installed in many environments (desktop, computing grid, cloud). It is used by various SAS applications, but also has an API accessible from languages such as Java and Python. (1st Quarter 2017)

new-sas-press-titles-for-2017_04
5  -  Business Survival Analysis Using SAS®: An Introduction to Lifetime Probabilities by Jorge Ribeiro shows professionals with little modeling experience how to apply survival analysis to the world of business. The examples in this book show how to apply the models, and present techniques for analysts in a way that avoids high-level theoretical considerations. Graduates of economics, business, and marketing programs, as well as analysts who work in areas such as credit risk, will benefit from this book! (1st Quarter 2017)


American English: Where to use 'yall' versus 'yinz'

If you do much traveling in the United States, you're bound to hear a few words and expressions that are unique to certain areas. Well y'all get ready, because I'm fixin' to analyze some of those words for ya!

I recently found a really neat web application called The Great American Word Mapper that lets you enter words, and see maps of where those words were used most frequently in Twitter posts. It's pretty cool, and almost addicting! Here's an example showing the two words I found most interesting - yall and yinz:

word_mapper

And as with any cool map, I felt compelled to try to create a similar one with SAS software!

Fortunately, they provided a link to their data on a Google drive, which made my endeavor a lot easier. They provided a separate csv file for each letter of the alphabet, and each level of smoothing (none, low, med, and high). Since yall and yinz both start with the same letter, I only needed the 'y' data files, and I decided to go with the 'medium' smoothed data, since those maps looked the best to me. I used the DMS SAS File->Import wizard, which wrote me a bit of Proc Import code that imported the data quickly & easily. I was then able to plot the data fairly easily using Proc Gmap. Here are my SAS maps showing the smoothed data for yall and yinz:

wordmap_yall1

wordmap_yinz1
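For anyone who wants to try something similar, here is a rough sketch of the import-and-map steps. The csv file name, the dataset name, and the variable names (state, county, yall) are hypothetical; the real files may be laid out differently, and the response data needs numeric state and county FIPS codes that match the ID variables in the map data set.

```sas
/* Sketch only: file, dataset, and variable names are hypothetical */
proc import datafile="y_medium.csv" out=work.words_y dbms=csv replace;
   getnames=yes;
run;

/* County-level choropleth map; requires SAS/GRAPH and the MAPS library */
proc gmap data=work.words_y map=maps.uscounty all;
   id state county;
   choro yall / levels=5 coutline=same;   /* coutline=same hides county borders */
run;
quit;
```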

Here are a few changes (hopefully improvements) in my version of the maps:

  • I added titles/text to explain more about what is represented in the map, and where the data came from.
  • I made the state outlines darker, and left out the county outlines, shifting the focus from counties (which most people aren't familiar with) to states (which most people are familiar with).
  • I left out the city labels, because they obscure parts of the map and I think the state outlines suffice.
  • I added html mouse-over text to show the state names (click the map image snapshots above, to see the interactive versions with the mouse-over text).

I liked the original maps, but I like my versions even better! The 'yall' map showed just about what I expected - common usage throughout the southeast, with the exception of Florida (where a lot of retirees from up north live). The 'yinz' map showed a high concentration in the western half of Pennsylvania, which is correct (according to Wikipedia and my friend who grew up in that area). But I was a bit curious about a second yinz concentration encompassing several counties located along the border of North Carolina & Virginia. I've never really heard the word yinz used in that area, so I was a bit skeptical. So I decided to dig a little deeper...

Any time smoothing is used, there is a possibility it will distort the true nature of the data. Therefore I decided to plot the unsmoothed yinz data, to see if it might shed some additional light on this odd NC/VA concentration. As I suspected, the raw data map showed that it was really only a couple of counties (Orange and Person) that had the high number of Twitter posts containing the word yinz. So in this particular case, the smoothing exaggerated the NC/VA yinz hotspot quite a bit, and it's probably better to use the unsmoothed data. (Which reinforces my suggestion to always plot your data in several different ways!)

wordmap_yinz_circle


And now for a fun example ... My friend Margie is a bit of a local legend. Her passion is to create clever signs to hold up during sporting events (especially ice hockey) - they are frequently shown on the jumbo screen, and sometimes even on television. She's earned the nickname Clever Sign Chick, which she wears proudly. She spent her early childhood in western Pennsylvania, and therefore she's familiar with their local words such as 'yinz'. So when the Pittsburgh team came down to play our NC team, she greeted them with the following sign (all in good fun, of course!) ... which is an example of a perfect combination of the slang words yinz (commonly used around Pittsburgh), and ain't (commonly used in NC).

yinz_sign

What special slang words are unique to your area? Feel free to share in the comments!

Character to Numeric Conversion in SAS

character-to-numeric-conversion-in-sas_book

How many of you have been given a SAS data set with variables such as Age, Height, and Weight and some or all of them were stored as character values instead of numeric? Probably EVERYONE! Yes, we all know how to do the old "swap and drop" (rename and convert), but wouldn't it be nice to perform the conversion in one macro call? You can download the Char_to_Num macro (for free) from my author site, from the book Cody's Collection of Popular Programming Tasks, or from the listing right here in the blog. You call the macro with the name of the original SAS data set that contains one or more variables you want to convert, the name of the SAS data set for the converted variables, and a list of character variables that need converting. Right after the macro listing, I'll show you an example.

Here is a listing of the macro:

*Macro to convert selected character variables to numeric variables;
%macro char_to_num(In_dsn=,   /*Name of the input data set*/
                   Out_dsn=,  /*Name of the output data set*/
                   Var_list=  /*List of character variables that you
                                want to convert from character to
                                numeric, separated by spaces*/);
   /*Check for null var list */
   %if &var_list ne %then %do;
   /*Count the number of variables in the list */
   %let n=%sysfunc(countw(&var_list));
   data &Out_dsn;
      set &In_dsn(rename=(
      %do i = 1 %to &n;
      /* break up list into variable names */
         %let Var = %scan(&Var_list,&i);
      /*Rename each variable name to C_ variable name */
         &Var = C_&Var
      %end;
      ));

   %do i = 1 %to &n;
      %let Var = %scan(&Var_list,&i);
      &Var = input(C_&Var,best12.);
   %end;
   drop C_:;
   run;
   %end;
%mend char_to_num;

As an example, the code below creates a SAS data set (Contains_Chars) followed by a call to the macro:

data Contains_Chars;
   input Name $ Age $ Height $ Weight $;
datalines;
Ron 55 72 180
Jane 57 63 101
;
 
%Char_to_Num(In_Dsn=Contains_Chars, 
             Out_Dsn=Corrected,
             Var_List=Age Height Weight)

The new data set Corrected has the same variable names as the character variables in the Contains_Chars data set, except they are now all numeric variables. Here is a section from PROC CONTENTS:

Character to Numeric Conversion in SAS
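For comparison, here is a sketch of the manual "swap and drop" for a single variable, which is essentially what the macro generates for each name in the list, followed by the PROC CONTENTS call you could run to check the variable types. The data set name Corrected_manual is just an illustrative name.

```sas
/* Manual "swap and drop" for one variable, for comparison with the macro */
data Corrected_manual;
   set Contains_Chars(rename=(Age=C_Age));   /* swap: move the character value aside */
   Age = input(C_Age, best12.);              /* convert it to a numeric variable     */
   drop C_Age;                               /* drop: remove the character copy      */
run;

/* Check that Age, Height, and Weight are numeric in the macro's output */
proc contents data=Corrected;
run;
```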

I hope this macro will save you some time on your next project.


Beer: Finding your favorite that you didn't know about!

Data analysis can be used for many things ... how about finding other beers you might like, so you don't keep drinking the same old brand every time? Hang on tight - I think we're about to make a beer run!

I recently read an interesting article on the Flowingdata website, where they graphically charted 100 beer styles. For each style, they drew a rectangle with the width representing the amount of alcohol by volume, and the height representing the bitterness (hoppiness). They colored the rectangle to try to represent the average color of the beer, and grouped the graphs by family. As you mouse over each of their graphs, it gives you a description of that style, and lists several different brands of beer from that style. Here's a screen-capture of the graphs for the family of beers in the Pilsner style, for example:

pilsner_original

I found their graphs very interesting, but I also noticed a few things I would have done differently, using SAS graphs. Let me walk you through my changes and enhancements, and see if you like them!

One thing that I found baffling was that they showed an overlay of all the beer style rectangles at the top of the Flowingdata article, but it was purely for artistic purposes. It had no axes or grid lines, and there was no way to tell which rectangle represented which style.

beer_styles_overlay

In my version, I used overlay graphs for their analytic power (rather than artistic power). I created an overlay graph for each style family, so you could see how consistent (or inconsistent) the beers from that family are. For example, here's my overlay graph, followed by the individual style graphs, for the Pilsner style family:

pilsner_sas

In the Flowingdata article, they omit the text & numeric labels along the axes of the individual style graphs, and just show the labels on a single graph at the top of the article. I found that I had to keep scrolling back up to the top to see what the axis values were, and then scrolling back down to the style graph I had been looking at (and hoping that I had correctly remembered the values). By comparison, in my version I fully labeled every graph - this makes them a little more cluttered, but a lot more usable.

The graphs were very small in the Flowingdata article, and therefore the data rectangles were sometimes just a visual 'speck' with more of the black border color than internal yellow/amber beer color. I made my graphs about twice as big, to allow you to see the data better. And on the topic of color - I decided to make all my polygons the same color, to make them easier to compare (I'm not sure that an average color for a particular beer style is very valuable to graph, and I wonder if the colors in the original graphs are actually representative of the beer colors?) Also, the lighter and darker rectangles in the Flowingdata graphs could distort the visual perception of their sizes.

In their article, there was no way to navigate through the style families. You had to scroll up/down, and read all the family names, to find the family you were interested in (and the difficulty was compounded, because the names were not in alphabetical order). In my version, I create a list of all the style families, and let you click the style name to jump directly to those graphs.

style_list

When you hover your mouse over my graphs, you see the description of that style and list of several different brands of beer that are that style (similar to the Flowingdata graphs) ... but you can also click my graphs to launch a Google search for that beer style. The Google search returns some really nice information, and also pictures of the beer (I think the pictures provide much more accurate colors than the colors used in the Flowingdata polygons, if you want to really know what the beer looks like). And for a finishing touch, I add a footnote at the bottom of my graph, giving credit to the data source, and a link to the actual spreadsheet containing the data.

And now, with all this data, how might you use it to find new/different beers, similar to the ones you like? I invite you to tell me - in the comments section!


And what would my blog posts be without some randomly-related pictures from my friends?!? This time, pictures of beer! ... or should that be 'pitchers' of beer!?! LOL  (Thanks Beth, Paul, and Jason!)...

beer6 beer5 beer4

beer3

beer2 beer1

SAS Jedi Christmas - SAS 9.4 M4 DS2 Do Loop Upgrade

This SAS Jedi is very excited about the SAS 9.4 M4 release, which brought many wonderful gifts just in time for Christmas. So in the interest of extending the Christmas spirit, I'm going to blog about some of my favorites!

I've long loved the SAS DO statement variant which allows iterating over a discrete list of values:

data test;
   call streaminit(12345);
   do Month='Jan','Feb','Mar';
      Revenue = round(rand('NORMAL',1000,100));
      output;
   end;
   format Revenue dollar8.;
run;


Until now, this has only been available in the traditional SAS DATA step. Neither DS2 nor SAS Macro had this feature. And every time I teach the DS2 class, I've had to say it wasn't available in DS2, but was on my wish list. Well, with the M4 release of SAS, I don't have to wish anymore! :-)

proc ds2;
title 'DS2 Results';
data;
   dcl char Month;
   dcl int Revenue having format dollar8.;
   method init();
      streaminit(12345);
      do Month='Jan','Feb','Mar';
         Revenue = round(rand('NORMAL',1000,100));
         output;
      end;
   end;
enddata;
run;
quit;

jedi20161228_ds2_data_program

So Merry Christmas from SAS! As usual, you can download the code for this episode from HERE.

And may the SAS be with you in the New Year!
Mark


SAS Temporary Arrays, Not Just for Experts

sas-temporary-arrays-01

SAS temporary arrays are an underutilized jewel in the SAS toolbox. I find that many beginning to intermediate SAS programmers are not familiar with temporary arrays. The good news is that there is nothing complicated about them and they are very useful. First of all, what is a temporary array?

Let's start with a "regular" array like this:

array x[10] x1-x10;

This array, called x, is associated with the 10 variables x1-x10. (Remember, the array name can be any valid SAS name; it doesn't have to have any relationship to the variables, but it usually does.) Now, how about a temporary array?

array y[10] _temporary_;

You use the keyword _TEMPORARY_ where you usually enter your variable names. There are no variables associated with a temporary array. In this example, you could reference y[1] or y[2], etc. but there are no variables y1, y2, etc.

My favorite use of a temporary array is for table lookup. You can load the elements of a temporary array and then use those elements to hold values. Here is an example:

You have sales goals for years 1-10 and the sales figures for all of your salespeople for these 10 years. You want to compare each person's sales with the goals for the 10 years. By placing the 10 sales goals in a temporary array, you can retrieve the goal amount by knowing the year. The temporary array elements are automatically retained and stored in memory, so they are perfect for tasks such as this. Take a look at the following program to see how this works:

*Program to demonstrate temporary arrays;
data Revenue;
   array Goal[10] _temporary_;
   *Load the temporary array with values;
   if _n_ = 1 then do Year = 1 to 10;
      input Goal[Year] @;
   end;
   *Now input the sales data;
   input Name $ Year Sales;
   Difference = Sales - Goal[Year];
datalines;
10 11 14 15 18 20 23 28 30 33
Fred 3 16
Joan 1 11
Helen 8 45
;
title "Listing of Data Set Revenue";
proc print data=Revenue noobs;
run;

Here is the output:

sas-temporary-arrays
You might want to define an array such as Goal like this:

array Goal[2001:2010] _temporary_;

where the subscript can be the actual value for a year. Notice the colon between the two years. This causes the subscripts for this array to range from 2001 to 2010. Also, because arrays can be multi-dimensional, you can perform multi-way lookups. Give it a try.
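Here is a short sketch of the year-subscripted version. It reuses the same goals and sales figures as the program above, with the years shifted to 2001-2010 for illustration, and it loads the array with an initial value list in the ARRAY statement rather than reading the goals from the data lines. The data set name Revenue2 is just an illustrative name.

```sas
*Sketch: temporary array with actual years as subscripts;
data Revenue2;
   /* Initial values are assigned in order to Goal[2001] through Goal[2010] */
   array Goal[2001:2010] _temporary_ (10 11 14 15 18 20 23 28 30 33);
   input Name $ Year Sales;
   Difference = Sales - Goal[Year];
datalines;
Fred 2003 16
Joan 2001 11
Helen 2008 45
;
title "Listing of Data Set Revenue2";
proc print data=Revenue2 noobs;
run;
```

Because the subscript is the year itself, no conversion from calendar year to position 1-10 is needed.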

You can learn more about temporary arrays from my book, Learning SAS by Example: A Programmer's Guide, available from SAS Press.


5 Benefits of writing a book with SAS Press

Editor's note: This series of blogs addresses the questions we are most frequently asked at SAS Press!

Do you have a great book idea already in mind, or think you might want to write about an analytics or industry topic? Consider writing a SAS Press book or a SAS & Wiley Business Series book and join our growing community of SAS Authors!

There are a number of benefits to publishing with SAS Press.

1.      Access to free SAS software and technical expertise

Any SAS® or JMP® software that you need to work on the book will be provided free of cost to you! And, all of our manuscripts are vetted by SAS programming experts and developers, working with you, the author, to ensure that the best programming techniques and latest software are being used.

2.      Unparalleled access to SAS users for promotion and sales opportunities

Our expert marketing team connects with SAS and JMP users through multiple social media channels, a variety of analytic and statistical conferences, email subscriptions, trade shows, SAS training courses and so much more!

3.      Professional development

You will be assigned a professional developmental editor and a design team who will help guide you through the writing process, and they will be there to answer any questions that you have.

5-benefits-of-writing-a-book-with-sas-press

SAS Press Developmental Editor Brenna Leath and SAS Author Richard Zink celebrate his latest book

4.      Thorough copyediting by SAS technical editors

Our trained technical editors are experts in SAS and JMP. They will make sure that your finished manuscript is as polished and technically correct as possible.

5.      Get the SAS stamp of approval

Becoming a successful SAS Press author demonstrates a level of expertise and a contribution to the SAS user community that sets you apart. Via our global booksellers, your book can reach a worldwide audience in print and e-book formats!

Still not convinced?

Hear it from the authors themselves in this short video. Visit support.sas.com/publish.

 

Too Busy to Write? Review Instead!

If you have technical and teaching abilities but are too busy to write a book, consider reviewing instead. We are always looking for qualified technical reviewers to help with our book development process! Reviewers receive a copy of the book when it’s published, book credit to be used in the SAS Store, and much gratitude from SAS and SAS Press authors for your help!


Graphs: Comparing R, Excel, Tableau, SPSS, Matlab, JS, Python, and SAS

Are you a visualization & graphing expert? Can you identify which tool (R, Excel, Tableau, SPSS, Matlab, JS, Python, or SAS) was used to create each of these graphs? No cheating!

I recently read Tim Matteson's blog where he presented 18 graphs, and had his readers try to guess which software was used to create each of them. I thought it was an interesting exercise, but I was a little disappointed in the graphs. My buddy Paul Kent said I should create my own new/improved version of each graph, and I thought that sounded like a splendid idea! Be sure to click the link above to see the original versions, so you can better appreciate the improvements.

Can you determine which software I used to create each of my improved versions? (leave your guesses in the comments section)

Chart 1

The biggest problem in the original graph was that the colors and order of the bar segments didn't make sense - seems like they should be bad-to-good, but the original graph had them in alphabetical order. Also, the Xnn labels along the left-side axis were cluttered and difficult to read. In my version I spaced the labels out more, and also left-aligned them so the 'X's lined up and made them easier to read.

likert

Chart 2

In the original chart, having a colored area behind the questions made it look (at first glance) like those were bars, therefore I didn't color that area in my graph. I was a bit confused by the numbers to the left and right of the bars in the original, therefore in my version I color-coded these numbers so the user would know at-a-glance that the left number represented 'disagree' and the right number represented 'agree'. In survey data like this, I think it's important to be able to see whether over 50% of the respondents agree or disagree, so I added a reference line at 50%.

book_survey

Chart 3

In the original chart, they had the axis labels along both the left and bottom, showing each label twice. In my plot, I placed the label along the diagonal boxes, allowing me to only show each label once (and also eliminating the sideways labels along the left axis). I used transparent plot markers, so you can see where markers are stacking. I also use a different color marker from the axes and text, so the markers stand out more.

crime_rate

Chart 4

The original chart used so many grid lines that I found it difficult to follow a line to the axis. I used years rather than months along the x-axis, because that seemed easier to understand for such a long time period (quick - how many years is 70 months!?! see what I mean!)

recession_job_losses

Chart 5

For this one, I left it pretty much as-is, except I placed the labels inside the longer bars (rather than outside), thereby making more room for the bars. I also explain what 'cola' is in the title, since it's an acronym most people probably aren't familiar with - wouldn't want people thinking this was a graph about soft drinks!

cola

Chart 6

For this chart, I didn't have the original data, so I decided to go with some data that was similar, but less dense. I'm not sure what the original chart was trying to show, but I can't imagine it was doing a very good job of it (looked like a cluttered mess of points & lines to me).

points_lines_3d

Chart 7

In the original chart, I don't think the circles showed up very well against the black background - therefore I didn't put any circles on my version (if you want to see a black map with circles, have a look at my map with animated circles). Be sure to click on this one, to see the full size map (to get the full effect)!

earth_at_night

Chart 8

The original chart was a simple scatter, with '+' markers and dark grid lines. In my version, I used transparent round markers - this way you can see when multiple markers are stacked in the same location. I also used light grid lines, so the grid doesn't compete with the markers for your attention, and I added some summary statistics in the top-left corner of the graph.

scatter
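
As a rough idea of the kind of summary statistics you might put in the corner of a scatter plot like Chart 8, here's a minimal Python sketch. The choice of statistics (point count, means, and Pearson correlation) is my assumption for illustration, the data values are made up, and `scatter_summary` is a hypothetical helper, not the SAS code used for the graph.

```python
import math
from statistics import mean

def scatter_summary(xs, ys):
    """Return count, means, and Pearson correlation for paired points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    # Correlation is undefined when either variable never changes.
    r = sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else float("nan")
    return {"n": len(xs), "mean_x": mx, "mean_y": my, "r": r}

# Made-up points (assumption), just to exercise the function.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
summary = scatter_summary(xs, ys)
print(f"n={summary['n']}  mean x={summary['mean_x']}  "
      f"mean y={summary['mean_y']}  r={summary['r']:.2f}")
```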

Chart 9

I'm not a big fan of using black backgrounds in a graph ... but if you're going to create any kind of graph, at least show the scales along the sides!

curves

Chart 10

This is another one I didn't have the exact data for, so I used some similar data. The biggest change I made was using transparent markers so you can see where multiple markers are stacked on top of each other. I also used a grid of reference lines from both axes, rather than just one axis.

random_scatter

Chart 11

Although the original chart didn't have any labeling, I suspect it used some of Fisher's classic iris data set, so I used some of that data in my chart. The first improvement I made was labeling the graph, so you quickly know what I'm plotting. I also annotated a picture of a labeled iris flower, so you know what a petal and a sepal are.

iris_flower

Chart 12

I'm not a big fan of using 3d bars on a 3d map to show data, like they did in the original graph - the taller/front bars inevitably obscure some of the shorter/back bars, etc. Therefore in my graph I show how to plot data as markers on a 2d street map.

west_nile_chicago

Chart 13

In the original chart, I'm not sure exactly which year(s) of earthquake data they used, since there is no title or label. In my chart, I show all the major earthquakes for a 40+ year time period, and I also center my map on the Pacific Ocean (so it better shows the 'Ring of Fire'). I also use circles rather than filled dots, so it's easier to see almost-overlapping markers.

worldquakes_recent

Chart 14

In charts like this, I really don't like when people use a diverging color scheme (gradient shades of 2 colors, meeting in the middle) - those should be used when the scale goes from bad-to-good, etc. In this case, where the colors represent a simple "Percent of Trials", gradient shades of a single color should be used. They left-justified their Cancer Conditions labels, which placed them far from the chart and made it difficult to see which colored blocks went with which label - I right-justified them. Also, it was difficult to determine whether white boxes were light gradients or no-data. In my chart, I use a hatched pattern for no-data, to make the distinction more obvious.

And in the bottom (bar) chart portion, I was a bit confused by the numbers on top of the bars - after a bit of scrutinizing the graph, I found that the numbers represent the difference between the Actual and Expected time. Therefore I tried to make that more obvious in my bar chart.

cancer_intervention_cap
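
The numbers on top of the bars in Chart 14 are just a subtraction, and showing the sign makes them easier to read. Here's a minimal Python sketch, assuming the difference is Actual minus Expected (the original doesn't state the direction or the units). The condition names and values are made up, and `time_difference_label` is a hypothetical helper.

```python
def time_difference_label(actual, expected):
    """Format Actual minus Expected with an explicit sign, e.g. '+6.0'."""
    diff = actual - expected
    return f"{diff:+.1f}"

# Made-up rows (assumption): (condition, actual time, expected time)
rows = [
    ("Condition A", 30.0, 24.0),
    ("Condition B", 18.5, 20.0),
]

labels = {name: time_difference_label(a, e) for name, a, e in rows}
for name, label in labels.items():
    print(f"{name}: {label} (actual vs. expected)")
```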

Chart 15

I don't really have access to any software to do solid-modeling, so instead of doing an animation of a solid-model of the earth (which looked pretty pitiful in the original blog), I am using a different animation. Click the image below to see it animated:

gapminder_cap

Chart 16

For this chart, my version is a little cleaner, and I've moved a few of the labels to new locations.

shoe_sales

Chart 17

The original chart had somewhat willy-nilly axis tick marks, and I wasn't real keen on using circles in the legend to represent the lines in the graph. I didn't have this exact data, so I chose some similar time-series data that let me show three overlaid lines. Notice that in addition to the color legend, I also added a label to the end of each line.

population_graph

Chart 18

For this one, I used slightly different colors, and slightly larger/bolder text, but aside from that it was already a great graph. :-)

catalyst_3d_surface

 

Ok - time to enter your guesses in the comments section! Which software(s) were used to create which graphs?

After making your guesses, you can scroll down to find the answer! ...

 

 

 

 

 

 

 

keep scrolling ...

 

 

 

 

 

 

 

cue dramatic music ...

 

 

 

 

 

 

Chart 1 - SAS

Chart 2 - SAS

Chart 3 - SAS

Chart 4 - SAS

Chart 5 - SAS

Chart 6 - SAS

Chart 7 - SAS

Chart 8 - SAS

Chart 9 - SAS

Chart 10 - SAS

Chart 11 - SAS

Chart 12 - SAS

Chart 13 - SAS

Chart 14 - SAS

Chart 15 - SAS

Chart 16 - SAS

Chart 17 - SAS

Chart 18 - SAS

Yep, I used SAS to create all 18 of these charts!  And if you'd like to see the SAS code, I've set up an examples page.

 
