Are you a fan of Hans Rosling's famous bubble plots? ... Then why not learn how to create your own bubble plots in SAS University Edition?!? :)
Perhaps you saw my SAS/GRAPH imitation of Hans Rosling's animation in a previous blog (see a snapshot of my graph below)? Or perhaps the SGPLOT version in Sanjay Matange's blog? Or maybe you're just a fan of bubble plots in general? Whatever the case, this blog will show you the basics of creating bubble plots in your free copy of SAS University Edition that you recently downloaded!
First, you'll need to have some data that makes sense to visualize with a bubble plot. You'll typically be representing 3 or 4 values with each bubble. Your X and Y variables will be represented by the position of the marker (like a regular scatter plot), and the size of the marker will represent the value of a 3rd variable. And you'll sometimes want to use a 4th variable to control the color of the bubbles.
Perhaps you already have the 'perfect' data for a bubble chart, but you'll often need to summarize your data first. There are several ways to do that in SAS - I'll show you the SQL way, since many of you are probably already familiar with SQL. Enter the following into the CODE tab of the Program 1 window, to summarize the data from the SASHELP.CARS data set (which ships with SAS). You can type the code by hand, or copy-n-paste it. Then click the Run button (icon of a little man running). Look at the log messages to make sure it ran correctly.
/* Average horsepower and MPG for trucks, one row per origin/make */
proc sql;
  create table car_summary as
  select unique origin, make,
         avg(horsepower) as hp,
         avg(mpg_city) as city,
         avg(mpg_highway) as highway
  from sashelp.cars
  where type='Truck'
  group by origin, make;
quit;

/* Print the summarized table to check the result */
proc print data=car_summary;
run;
If you entered & ran all the code correctly, the Proc Print should produce the following summarized table:
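As an aside, SQL is only one of the several ways to summarize data in SAS. Here is a minimal sketch of the same summary using PROC MEANS instead, in case you prefer that route. The output data set name car_summary_means is just a name I made up for illustration (so it doesn't overwrite car_summary), and I'm assuming you want the same three averages as in the SQL version.

```sas
/* Sketch: same truck summary as the SQL step, using PROC MEANS.    */
/* car_summary_means is a hypothetical name chosen for this example. */
proc means data=sashelp.cars noprint nway;
  where type='Truck';
  class origin make;                      /* one output row per origin/make */
  var horsepower mpg_city mpg_highway;
  output out=car_summary_means (drop=_type_ _freq_)
         mean=hp city highway;            /* names map to the VAR list in order */
run;

proc print data=car_summary_means;
run;
```

The NWAY option keeps only the rows for the full origin/make combinations, which should line up with what the GROUP BY produces in the SQL version.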
Now enter & run the following code (in the CODE tab again) to create the bubble plot. The X/Y position of the bubbles will be determined by the Highway and City MPG, the size of the bubbles will represent the Horsepower, and the color will represent the country of Origin. If you're typing the code by hand, make sure to include all the quotes, slashes, and semicolons - they are important!
Title "Truck MPG and Horsepower Comparison";
proc sgplot data=car_summary;
  bubble x=highway y=city size=hp / group=origin datalabel=make;
  keylegend / location=inside position=bottomright;
run;
And if you did everything just right, you should get a bubble plot that looks a lot like this ... and you're well on your way to becoming a SAS visualization expert! :)
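If you'd like to tweak the look of the plot, here is a hedged sketch of a few options you might try. The axis label text, the bubble radius limits, and the transparency value are all just my picks for illustration, so adjust them to taste. Because the summarized columns were created in SQL, they carry no labels, which is why explicit axis labels can help.

```sas
/* Sketch: same bubble plot with a few optional cosmetic tweaks. */
/* All numeric values and label text below are arbitrary picks.   */
title "Truck MPG and Horsepower Comparison";
proc sgplot data=car_summary;
  /* bradiusmin/bradiusmax bound the smallest and largest bubble radius */
  bubble x=highway y=city size=hp / group=origin datalabel=make
         bradiusmin=6 bradiusmax=20 transparency=0.3;
  xaxis label="Average Highway MPG" grid;
  yaxis label="Average City MPG" grid;
  keylegend / location=inside position=bottomright;
run;
title;  /* clear the title so it doesn't carry over to later output */
```

A little transparency makes overlapping bubbles easier to tell apart, and the radius limits keep the largest bubbles from swamping the smaller ones.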
Now that you're a bubble plot expert, what data would you like to use in your own bubble plot? (feel free to add your reply/answer in a comment)