Many areas of the US are experiencing record low unemployment. This is great at the national level, and also great at a personal level (for example, I now have fewer unemployed friends asking to borrow money!). But just how low is the US unemployment rate, and how does it compare with the historical data? This seemed like a great challenge for a GraphGuy like me!
After much thought, I decided to use a band plot, so I could show the standard unemployment rate, and also three other unemployment rates that include various extra categories of unemployed. My plot is somewhat based on one I saw in a mercatus.org article, but with several changes and improvements. Here's my final graph (read on below, if you'd like to learn more details about how I created it).
The US Bureau of Labor Statistics has a web page where you can download unemployment data. I chose to download not only the data for the standard/official unemployment rate, but also three other variations that include a few additional categories of unemployed. Here is a screen capture of my data selection in the BLS interface:
I selected the desired years (2007 and onward), and then downloaded the data. Their interface downloads each of the four series into a separate spreadsheet. Here's the code I used to import and transpose one of the spreadsheets (the code for the others is very similar):
PROC IMPORT DATAFILE="SeriesReport-20190129125513_9c8d7c.xlsx" OUT=u3 DBMS=XLSX REPLACE;
RANGE="BLS Data Series$A12:M24";
RUN;
proc transpose data=u3 out=u3 (rename=(col1=u3_unemployment _name_=month) drop=_label_);
run;
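The transpose step above is abbreviated, so here is a minimal sketch of how the wide-to-long reshaping might look from start to finish. It assumes the BLS sheet has one row per year and one column per month, that the transpose runs once per year (the `by year;` statement is my assumption, not shown above), and that a plottable date is built from the month name and year. The dataset names `u3_wide` and `u3_long` and the sample values are made-up placeholders, not real BLS numbers.

```sas
/* Hypothetical stand-in for an imported BLS sheet:
   one row per year, one column per month (placeholder values) */
data u3_wide;
   input year jan feb mar;
   datalines;
2017 4.7 4.6 4.4
2018 4.1 4.1 4.0
;
run;

/* Transposing BY year turns each month column into its own row,
   so every output row holds one year/month value in COL1 */
proc transpose data=u3_wide
               out=u3_long (rename=(col1=u3_unemployment _name_=month));
   by year;
   var jan feb mar;
run;

/* Build a SAS date (first day of the month) from month name + year,
   e.g. '01' || 'jan' || '2017' is read with the DATE9. informat */
data u3_long;
   set u3_long;
   format date date9.;
   date = input(cats('01', month, year), date9.);
run;
```

With a real `date` variable in place, the four series can be lined up on the same time axis, and SGplot treats the x axis as a time axis rather than a list of month names.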
After importing all four spreadsheets, I merged them into a single dataset using the following data step:
data my_data; merge u3 u4 u5 u6;
run;
BLS Basic Graph
The BLS interface lets you graph the data for each series separately. Here's what one of their graphs looks like. They're decent simple graphs (aside from the crowded x-axis, which is a bit difficult to read), but I was more interested in a combined plot showing all four series together.
My Band Plot Graph
Since each of the four series of unemployment consists of the previous series, plus some extra unemployed, I decided to use a band plot, where the bottom band is the standard unemployment rate, and then each band stacked on top of it shows the additional unemployed workers that the next series adds.
In SAS' Proc SGplot, you create a band by specifying a lower and upper value for each band, at each point along the x-axis. This required a little manipulation of the data, which was easy to accomplish in a data step:
data my_data; set my_data;
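The body of that data step isn't shown, so here is a minimal sketch of what the band calculation could look like. Since each series includes the previous one plus some extra unemployed, each band can start where the one below it ends. The variable names `u4_unemployment`, `u5_unemployment`, and `u6_unemployment` are assumed by analogy with `u3_unemployment`, and the two rows of data are made-up placeholders just to make the sketch runnable.

```sas
/* Made-up placeholder rates, only to make the sketch runnable */
data my_data;
   input date :date9. u3_unemployment u4_unemployment
         u5_unemployment u6_unemployment;
   format date date9.;
   datalines;
01JAN2018 4.1 4.4 5.0 8.2
01FEB2018 4.1 4.4 5.1 8.2
;
run;

/* Each band starts at the top of the band below it */
data my_data;
   set my_data;
   band1_min = 0;               band1_max = u3_unemployment;
   band2_min = u3_unemployment; band2_max = u4_unemployment;
   band3_min = u4_unemployment; band3_max = u5_unemployment;
   band4_min = u5_unemployment; band4_max = u6_unemployment;
run;
```

Because the upper value of each band is the raw series value (not a running sum), the top edge of every band can be read directly against the y axis as that series' unemployment rate.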
I can now specify the bands in Proc SGplot using the following:
band x=date lower=band4_min upper=band4_max / fillattrs=(color=&lred);
band x=date lower=band3_min upper=band3_max / fillattrs=(color=&lorange);
band x=date lower=band2_min upper=band2_max / fillattrs=(color=&lgreen);
band x=date lower=band1_min upper=band1_max / fillattrs=(color=&lblue);
I liked the band plot, but some of the colors (such as blue and green) tended to blend together visually. Therefore, I added a line at the top of each band, using a series statement. These lines are a darker shade of the fill color:
series x=date y=band4_max / lineattrs=(color=&dred);
series x=date y=band3_max / lineattrs=(color=&dorange);
series x=date y=band2_max / lineattrs=(color=&dgreen);
series x=date y=band1_max / lineattrs=(color=&dblue);
I was now happy with the graphical part of the graph, but I also needed to finish the explaining part. What does each of the colors represent? I would normally use a color legend, but in this case the explanations are a bit long and wordy, so I wanted to try something a little different. I decided to use annotate to draw a line from the top edge of each color and attach it to a text box that explains what the color represents. I used SGplot's pad option to add some white space to the right of the graph, to make room for these annotated text boxes.
proc sgplot data=my_data noautolegend noborder pad=(right=21pct) sganno=my_anno;
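The annotate dataset itself isn't shown, so here is a minimal sketch of how one leader line and text box might be set up for a single band. It uses the standard SG annotation variables (`function`, `x1`/`y1`, `x1space`, `label`, and so on), but the coordinates, the color values behind `&lblue` and `&dblue`, the label wording, and the sample data are all placeholders I made up for illustration.

```sas
/* Hypothetical color values; the real macro variable definitions are not shown */
%let lblue=cxbdd7e7;
%let dblue=cx2171b5;

/* Made-up placeholder band data, only to make the sketch runnable */
data my_data;
   input date :date9. band1_min band1_max;
   format date date9.;
   datalines;
01JAN2018 0 4.1
01FEB2018 0 4.1
01MAR2018 0 4.0
;
run;

data my_anno;
   length function $8 x1space y1space x2space y2space $12
          anchor $8 widthunit $8 linecolor textcolor $10 label $100;
   /* Leader line: starts at the band's top edge on the last date,
      and ends in the padded margin to the right of the plot wall */
   function = 'line';
   x1space = 'datavalue';   y1space = 'datavalue';
   x2space = 'wallpercent'; y2space = 'datavalue';
   x1 = '01MAR2018'd; y1 = 4.0;
   x2 = 104;          y2 = 4.0;
   linecolor = "&dblue"; linethickness = 1;
   output;
   /* Text box: anchored at its left edge, just past the end of the line */
   function = 'text';
   x1space = 'wallpercent'; y1space = 'datavalue';
   x1 = 105; y1 = 4.0;
   x2 = .;   y2 = .;
   anchor = 'left'; width = 22; widthunit = 'percent';
   textcolor = 'gray33'; textsize = 9;
   label = 'Standard unemployment rate (placeholder wording)';
   output;
run;

proc sgplot data=my_data noautolegend noborder pad=(right=21pct) sganno=my_anno;
   band x=date lower=band1_min upper=band1_max / fillattrs=(color=&lblue);
   series x=date y=band1_max / lineattrs=(color=&dblue);
run;
```

Mixing coordinate systems is the key idea here: the line starts in data coordinates so it stays attached to the band, while the text is positioned as a percent of the plot wall so it lands in the padded area regardless of the date range.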
And that's how I got the final graph!
Now it's your turn - what other changes/improvements would you make to this graph? Are there other (completely different) ways you would recommend plotting this data? Feel free to discuss in the comments section!