The flu season has started here in the U.S., and according to Centers for Disease Control and Prevention (CDC) data, it caused 214 deaths in the first week of 2020. Is this number higher, or lower, than usual? When does the flu season start, and how long does it last? Sounds like a fine excuse to create some graphs - follow along, and I'll show you how!
The CDC has a flu page, and towards the bottom there's a "Pneumonia and Influenza (P&I) Mortality Surveillance" graph (the black & red line graph). And under that graph, there's a link to "View Chart Data." This link points to the latest weekly data in a CSV file. Here's what the CSV file looks like, when viewed in Excel (I've circled the 3 variables I'm interested in):
I used the following code to import the CSV into SAS:
filename csv_file "NCHSData02.csv";
data my_data;
infile csv_file lrecl=200 dlm=',' pad firstobs=2;
format flu_deaths comma8.0;
input year week pct_deaths_due_to_pneu_and_flu expected
   threshold all_deaths pneumonia_deaths flu_deaths;
/* keep only the rows for 2010 and later */
if year>=2010 then output;
run;
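Before graphing anything, a quick way to start on the "higher or lower than usual?" question is to list the first week of each year side by side. Here's a minimal sketch, assuming the my_data dataset and variable names from the import step above:

```sas
/* Sketch: show week-1 flu deaths for every year in my_data */
proc print data=my_data noobs;
  where week=1;
  var year week flu_deaths;
run;
```

This only compares a single week per year, so it's a rough sanity check on the import rather than a real answer.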
My first attempt at graphing the data actually produced a 'bad graph' - I'm sharing it with you, so you will be able to recognize the problem if it ever happens to you. And I'll also show you how to fix it! I vaguely knew I wanted to see the data as a time series, showing the number of deaths per week, so I naively started with the following code:
proc sgplot data=my_data;
spline x=week y=flu_deaths;
run;
In the graph above, you'll notice that the end (week 52) of one year 'loops' back around and is connected to the beginning of the next year (week 1). This makes the graph pretty much unusable and worthless. How do I get it to draw a separate line for each year? ... One way is to add a group=year to the code - that draws a separate line for each year, and also gives each line a different color.
proc sgplot data=my_data;
spline x=week y=flu_deaths / group=year curvelabel curvelabelpos=start;
yaxis labelpos=top values=(0 to 1750 by 250) offsetmin=0 offsetmax=0;
run;
An Even Better Graph
The simple line graph was ~OK for viewing the data ... and it did provide an easy way of comparing the number of deaths during the same time periods each year. And I could see when the flu seasons generally started and stopped. But the graph just didn't click with my brain. Rather than seeing all the years overlaid, I wanted to see more of a continuous plot over time. But my data only has year and week variables ... it doesn't have a continuous 'date' variable. I guess I could estimate a date value for each year/week combination, but there's an easier way to get the graph I wanted.
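For anyone curious about the estimated-date route, here's a rough sketch of what it might look like. It assumes week 1 starts on January 1, which is not how the CDC defines its weeks, so the dates can be off by a few days. The my_data_dated dataset and approx_date variable are hypothetical names I made up for this illustration:

```sas
/* Rough sketch: approximate a date for each year/week pair.       */
/* Assumption: week 1 begins on January 1 (not the CDC definition) */
data my_data_dated;   /* hypothetical dataset name */
  set my_data;
  approx_date = mdy(1, 1, year) + (week - 1)*7;
  format approx_date date9.;
run;

/* One continuous line over time, using the approximate dates */
proc sgplot data=my_data_dated;
  series x=approx_date y=flu_deaths;
  xaxis display=(nolabel);
run;
```

It works for a quick look, but it adds an extra data step and an approximation that the side-by-side approach doesn't need.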
I can create a separate plot for each year (showing the number of deaths per week), and then place all those graphs side-by-side using Proc SGpanel. Here's what I came up with.
proc sgpanel data=my_data noautolegend;
panelby year / onepanel columns=8 novarname
   headerattrs=(size=12pt color=gray33) noborder;
band x=week lower=0 upper=flu_deaths / fill fillattrs=(color=red);
rowaxis labelpos=top values=(0 to 1750 by 250);
colaxis values=(1 to 52 by 1) display=(nolabel noticks novalues);
refline 52 / axis=x lineattrs=(color=graycc thickness=1px);
refline 0 to 1750 by 250 / axis=y lineattrs=(color=graycc thickness=1px);
run;
Now I've got my graph code ready for the 2020 flu season. All I have to do is occasionally download the latest data, and re-run my code, and it will add those 2020 values to the graph. And it's a mighty-fine graph, if I do say so myself! (If you'd like to experiment with the SAS code, here's the complete SAS program.)
Here are a few questions I invite you to discuss in the comments section:
- Do you think the 2020 flu season will be better (fewer deaths), or worse (more deaths), than last year?
- What are some other ways we could visualize this data?
- And, just for fun, do you have any old family traditions for treating the flu?
A lot of my friends like treating the flu with chicken soup. While I'm not sure it can actually cure the flu, it probably doesn't make it any worse ... and it sure does taste good! Here's a picture of some chicken soup made by my friend Celia. She's the "godmother" of roller derby around Raleigh, and her chicken soup (or rather "chiggen soop") is rather prolific, according to her roller derby minions.