Bringing order to holiday shopping chaos


The holidays are sometimes chaotic, especially for those tasked with analyzing consumer shopping data! I would like to share a few tips on adding order to your chaotic data.

SAS recently published an interesting article, sharing the results from a consumer survey. The infographics in the article showed high-level summary information, but there was also a link to the survey results in a more raw/numeric form. And of course I could not help but graph some of those numbers...

Several of the survey questions rated the answers on the same Agree<->Disagree scale, so I decided to plot them all together in a single graph. Here are the results, naively plotted using the default alphabetic ordering of all the text values, and using the default colors:

holiday_shopping_survey_blog
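
For readers who want to try this, a default stacked bar chart of this kind could be produced with something like the sketch below. The data set name tran_data and the variable response appear later in this post; the variable names question and pct, and the choice of horizontal bars, are assumptions for illustration and may not match the original code.

```sas
/* Sketch only: 'question' and 'pct' are assumed variable names. */
proc gchart data=tran_data;
   hbar question /
      type=sum sumvar=pct   /* bar length = summed percent */
      subgroup=response     /* one colored segment per response */
      nostats;              /* suppress the statistics table beside the bars */
run;
quit;
```

With a character variable such as response as the subgroup, the segments and legend entries come out in alphabetic order, which is what the graph above shows.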

I can quickly 'see' all the data together now, but the graph doesn't really help me make sense of the data. By default, SAS picks colors that look okay together and are easy to tell apart. But since the survey lets respondents choose answers on a scale from Agree<->Disagree, it would be more useful to assign colors such that Agree is green and Disagree is red. I can accomplish this by assigning colors manually in pattern statements (in the same order as the alphabetic items appear in the legend):

pattern1 v=s c=cxa6d96a; /* agree */
pattern2 v=s c=cxfdae61; /* disagree */
pattern3 v=s c=cxffffbf; /* neutral */
pattern4 v=s c=cx1a9641; /* strongly agree */
pattern5 v=s c=cxd7191c; /* strongly disagree */

holiday_shopping_survey_blog1

The colors are now meaningful, but they're not in a logical order in the legend, or stacked in a meaningful order in the bars. To fix that problem, I assigned numeric values (stack_order) in a data step, and then plotted by the numeric values instead of the text.

if response='Strongly Agree' then stack_order=1;
else if response='Agree' then stack_order=2;
else if response='Neutral' then stack_order=3;
else if response='Disagree' then stack_order=4;
else if response='Strongly Disagree' then stack_order=5;

holiday_shopping_survey_blog2
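
One detail worth noting: pattern statements are applied in the sorted order of the subgroup values, so once the chart is subgrouped by stack_order rather than by the response text, the colors have to be listed in stack_order sequence. A minimal sketch, reusing the colors from earlier and again assuming the variable names question and pct:

```sas
/* Patterns listed in stack_order sequence (1 through 5) */
pattern1 v=s c=cx1a9641; /* 1 = strongly agree */
pattern2 v=s c=cxa6d96a; /* 2 = agree */
pattern3 v=s c=cxffffbf; /* 3 = neutral */
pattern4 v=s c=cxfdae61; /* 4 = disagree */
pattern5 v=s c=cxd7191c; /* 5 = strongly disagree */

/* Sketch only: 'question' and 'pct' are assumed variable names. */
proc gchart data=tran_data;
   hbar question / type=sum sumvar=pct
      subgroup=stack_order  /* numeric values control segment order */
      nostats;
run;
quit;
```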

Now the colors in the legend and bars are in a logical order, but there's still a bit of 'non-order' in the graph. The questions/statements are still in their default alphabetic order, which doesn't really benefit us. Therefore I assigned a numeric value to each question, based on how much the users agree/disagree with it (specifically, the value was based on the middle value of the 'neutral' colored segment). I then plotted the graph by these numeric values, rather than the question text:

holiday_shopping_survey_blog3
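
The post does not show how the per-question value was computed. One plausible reading of "the middle value of the neutral segment" is the percent for Strongly Agree plus Agree, plus half of the Neutral percent. The sketch below follows that reading; the names question, pct, question_order, tran_data2, and neutral_mid are all hypothetical.

```sas
/* Sketch only: assumes one row per question/response with a percent in 'pct'. */
proc sql;
   /* Midpoint of the neutral segment for each question */
   create table question_order as
   select question,
          sum(case when stack_order in (1,2) then pct else 0 end)
        + sum(case when stack_order = 3 then pct else 0 end) / 2
          as neutral_mid
   from tran_data
   group by question;

   /* Attach the ordering value to every row of the plot data */
   create table tran_data2 as
   select t.*, q.neutral_mid
   from tran_data as t left join question_order as q
     on t.question = q.question;
quit;

proc gchart data=tran_data2;
   hbar neutral_mid / discrete  /* one bar per distinct numeric value */
      type=sum sumvar=pct subgroup=stack_order nostats;
run;
quit;
```

Two caveats with this approach: questions that happen to share the same neutral_mid value would be merged into one bar, and the bars appear in ascending order of the value, so negating it would reverse the direction.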

Now the graph looks very sharp and is logically ordered ... but the numeric values don't tell us much about the questions and the answers. So I used a little trick called user-defined formats to make those numeric values show up as the desired text. Here's the code that creates the user-defined format for the legend:

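/* Build one row per distinct stack_order/response pair */
proc sql;
create table foo as select distinct stack_order as start, response as label from tran_data;
quit;
/* Add the columns that proc format expects in a cntlin data set */
data control; set foo;
fmtname = 'stackfmt'; /* name of the format to create */
type = 'N'; /* numeric format */
end = start; /* each range covers a single value */
run;
/* Create the format from the control data set */
proc format lib=work cntlin=control;
run;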
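
Once the format exists, it still has to be attached to the numeric variable when plotting. A minimal sketch of that step, assuming the same hypothetical variable names question and pct as before:

```sas
/* Sketch only: 'question' and 'pct' are assumed variable names. */
proc gchart data=tran_data;
   format stack_order stackfmt.;  /* legend shows 'Strongly Agree', etc. */
   hbar question / type=sum sumvar=pct subgroup=stack_order nostats;
run;
quit;
```

The segments are still ordered by the underlying numeric values; only the displayed text changes. The same technique would presumably be used to show the question text in place of the numeric question values.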

Now we have a wonderful graph, where the bar segments (colors) are stacked in a logical order, and the questions themselves are even ordered in a logical way:

holiday_shopping_survey

So, were any of these survey results a 'surprise' to you? What other questions do you think would be interesting/useful to add to the survey?


About Author

Robert Allison

The Graph Guy!

Robert has worked at SAS for over 25 years, and is perhaps the foremost expert in creating custom graphs using SAS/GRAPH. His educational background is in Computer Science, and he holds a BS, MS, and PhD from NC State University. He is the author of several conference papers, has won a few graphic competitions, and has written a book (SAS/GRAPH: Beyond the Basics).
