Can the selection of the axis range in a graph influence how you perceive the data? Let's find out with a "Labor Participation Rate" graph ...
Medical doctors have traditionally taken the Hippocratic Oath, swearing to practice medicine honestly. I have often thought that people creating graphs should swear a similar oath, or at least strive to "do no evil." Which leads to the topic of this blog - what I call (tongue-in-cheek) "the axis of evil."
After my recent blog about the unemployment rate, I decided to look into some additional statistics that might help provide a more complete picture of employment in the U.S. I found an interesting article that showed a graph of the Labor Participation Rate since 1960. The graph definitely showed a climb during the 1970s and 80s, and then a drop after the recent recessions.
Here's my SAS version of this graph (pretty good imitation, eh?!?):
I was thinking, "Wow, this is a great graph!" But then I started looking more closely, and the y-axis started bothering me... Why had they chosen to start the y-axis at 50%? That's not where the data starts, and I can't think of a good reason why "50%" is important to include in the axis. Therefore it seems like it was an arbitrary decision - at best, for aesthetics ... and at worst, to try to squish the line so that the changes don't look as big as they really are.
Therefore I created a second version of the graph, and let the y-axis auto-scale. This way the data is spread out over the full height of the plot, so you can best see any changes in the data.
And then I got to thinking ... the range of possible values for the labor participation rate is 0% to 100%, so why not use those values for the y-axis scale? This will show what the data has done "in the grand scheme of things." Hmm - when plotted like that, the changes are more of a smooth speed-bump, rather than a mountain & cliff.
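To put a rough number on the "mountain vs. speed-bump" effect, here is a minimal sketch (in Python, purely for the arithmetic) of how much of the y-axis height the same low-to-high swing fills under each of the three axis choices above. The low and high values are illustrative placeholders rather than the actual figures behind the graph, and the 70% upper bound for the first axis is my assumption, since only the 50% starting point is known. Real auto-scaling usually pads the axis out to "nice" tick values, so the 100% figure is an upper limit.

```python
def fraction_of_axis(data_min, data_max, axis_min, axis_max):
    """Share of the y-axis height taken up by the data's low-to-high swing."""
    return (data_max - data_min) / (axis_max - axis_min)


# Illustrative placeholder values (percent), NOT the real data from the graph.
data_min, data_max = 59.0, 67.0

# Three y-axis choices. Only the 50% lower bound comes from the article's graph;
# the 70% upper bound is an assumption made for this sketch.
axis_ranges = {
    "fixed, starting at 50%": (50.0, 70.0),
    "auto-scaled to the data": (data_min, data_max),
    "full 0% to 100% range": (0.0, 100.0),
}

shares = {
    name: fraction_of_axis(data_min, data_max, lo, hi)
    for name, (lo, hi) in axis_ranges.items()
}

for name, share in shares.items():
    print(f"{name}: swing fills {share:.0%} of the axis height")
```

Same data, same swing, but it takes up a very different share of the picture depending on the axis, which is exactly why the three graphs feel so different.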
I always recommend looking at data in several different ways, to get a more complete picture. And using SAS software makes that simple to do!
Which of the 3 versions of the graph do you like best, and which one do you think best shows what's important about the data?