Back in 2013, I wrote a paper for the SAS Global Forum, reviewing the attributes that go towards making a good graph. In this paper, I covered many recommendations from industry thought leaders that can help enhance the effectiveness of graphs to deliver the intended information.
The graph on the right shows the number of students in each category in different colleges of a university. This allows comparison of the number of students in each category within a college. However, if one wants to compare the number of students in each category across colleges, then this graph does not make the task easy. For example, it is harder to compare the number of "Transition" students in the College of Business with the number in Education.
For comparison of students in a category across colleges, it is better to switch the category and group roles and get the graph on the right. Now, it is easier to compare the number of "Transition" students across Colleges.
This example is taken from the paper linked above. The main idea here is that bringing items that are to be compared closer makes the intended task easier. The key is the reduction of the eye movement required to complete the task. The less the eye movement required, the better.
Recently, while browsing the web, I came across another graph example where the delivery of the information could be improved by using the principle of "Proximity", but in a different way.
The graph on the right shows the distribution of a measure by a category "Tracexx". The category values are shown along the x-axis. The boxes are also colored by category, with a scrollable legend on the right.
In my opinion, it is relatively hard to line up and determine the category name for any particular box in the graph. The tick values are quite far away and dense. Also, the color shades are quite close so it is hard to see the category from the legend, which is also far away, with only some values visible. Making these associations requires a lot of eye movement, and it is not easy. For me, it is hard to determine which box represents "Trace11" in the graph, from the axis values or from the legend.
The graph on the right makes a small modification. Click on the graph for a higher resolution image. Placing the category value right next to the box makes it easier to determine which category we are examining. We have effectively moved the x-axis values closer to each box. This reduces the eye movement needed to examine the graph, thus making it easier to interpret. Note, we do not need the legend at all (other than for interactive selection).
To get this result, I saved the lowest value in each category in a variable called "Min". I retained only one value per category, making all others missing. Then, I used a Text plot to display the category value near the minimum value for each category. Note, I also added alternate light vertical bands that allow me to move the category labels a bit away to reduce clutter.
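The data preparation described above could be sketched roughly as follows. This is only one possible way to do it, not necessarily the step used for the graph. It assumes an input data set named "box" (a hypothetical name) with the variables "cat" and "value", and that all rows for a category are stored together. The output data set "boxLabel" and the variable "min" match the names used in the SGPLOT step.

```sas
/* Sketch only: the input data set name "box" is an assumption.     */
/* Rows for each category are assumed to be contiguous; NOTSORTED   */
/* keeps the original category order for DISCRETEORDER=DATA.        */
data boxLabel;
  /* First pass over the category: find its lowest value */
  do until (last.cat);
    set box;
    by cat notsorted;
    lowest = min(lowest, value);
  end;

  /* Second pass: write every row, keeping Min on the first row only */
  do until (last.cat);
    set box;
    by cat notsorted;
    if first.cat then min = lowest;
    else min = .;
    output;
  end;

  drop lowest;
run;
```

Since "min" is non-missing on only one row per category, the Text plot draws each category label once.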
proc sgplot data=boxLabel noautolegend noborder;
  /* Box plot of the measure for each category */
  vbox value / category=cat nomean nooutliers whiskerpct=0.95;
  /* Category name drawn next to each box, at its lowest value */
  text x=cat y=min text=cat / rotate=90 position=left
       textattrs=(size=6 weight=bold) contributeoffsets=(ymin);
  xaxis display=(nolabel novalues noticks) discreteorder=data;
  yaxis display=(nolabel noline noticks) grid integer
        values=(0 to 12 by 2) valueshint;
run;
While this seems like a good solution to me, I would be happy to hear your opinion.
SAS 9.40M2 SGPLOT code: Box_Plot