Spark and Summary Plots

In data visualization, Edward Tufte is a thought leader who has put forth many innovative ideas that make the information in a graph easier to understand, with minimal distraction and little potential for misinterpretation.

One of his ideas is the "Spark" plot.  As I understand it, these are very lightweight graphs that depict the key information in a very small space.  Such graphs can often be included inline with other text in a paragraph, like this:  spark_3.  In this case, I generated the graph using the SGPLOT procedure with minimal decorations to show the trend of Intel's stock price from the sashelp.stocks data set.  I display only the series, the last value and a label.

SGPLOT code for Spark Plot:

proc sgplot data=spark noautolegend noborder nowall;
  series x=date y=adjclose;
  scatter x=date y=lastvalue / markerattrs=(color=blue symbol=circlefilled size=12);
  text x=date y=lastvalue text=lastvalue / position=topright textattrs=(size=20);
  text x=date y=firstvalue text=label / position=left textattrs=(size=20)
         splitpolicy=splitalways splitchar='.';
  xaxis display=none;
  yaxis display=none offsetmin=0 offsetmax=0;
run;
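
The step above expects a data set named spark with the columns date, adjclose, firstvalue, lastvalue and label.  The post does not show how that data set was built, so here is a minimal sketch of one way it might be prepared from sashelp.stocks.  The intermediate names (intel, lbl), the label text and the 6.2 format are my assumptions, not the author's code.

/* Sketch only: one possible way to build the SPARK data set.          */
/* Assumption: the label text and the format of LASTVALUE are made up. */
proc sort data=sashelp.stocks(where=(stock='Intel')) out=intel;
  by date;                       /* make sure dates are ascending */
run;

/* LBL is renamed to LABEL on output to match the column name used by the TEXT statement */
data spark(rename=(lbl=label));
  set intel end=last;
  format lastvalue 6.2;
  if _n_ = 1 then do;            /* first observation: anchor for the label */
    firstvalue = adjclose;
    lbl = 'Intel';
  end;
  if last then lastvalue = adjclose;   /* last observation: marker and value */
run;

Because firstvalue and lastvalue are missing on all other observations, the SCATTER and TEXT statements draw only one marker and one string each.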

Recently, I received a request from SAS user Benjamin Knisley to create a similar lightweight "Graphical Summary" for visualizing patient data over time.  The graph shown below displays the visits and hospitalizations over time.  Multiple visits are depicted as dots for easy viewing, and the x and y axes are removed.  Some significant information about the patient, the clinic, and the actual start and end dates is added.  See the link below for the full code.  I believe this depiction of the data is also motivated by Tufte's ideas.

visits_dot_4

One customization needed in the above graph is the VALUES option, since the user wanted a sparse display of the years on the x-axis.  This can be generalized by using GTL, which provides the INTERVAL and INTERVALMULTIPLIER options in the TIMEOPTS bundle.

SGPLOT code for Graphical Summary Graph:

title j=l 'Family name, Given name' j=r 'County Clinic';
proc sgplot data=dots noautolegend noborder nowall;
  scatter x=date y=y / markerattrs=(symbol=circlefilled size=5);
  xaxistable hospitalized / x=date nomissingchar labelattrs=(size=9 weight=bold)
                     valueattrs=(size=10 weight=bold);
  text x=date y=ylbl text=firstdate / position=right contributeoffsets=none;
  text x=date y=ylbl text=lastdate / position=left contributeoffsets=none;
  xaxis type=time values=('01jan1980'd '01jan1985'd '01jan1990'd '01jan1995'd)
           valueshint display=(nolabel) valuesformat=year. valueattrs=(size=9 weight=bold);
  yaxis display=(noline noticks novalues) labelattrs=(size=9 weight=bold);
run;
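
For the GTL alternative mentioned earlier, here is a rough sketch of how the time axis might be set up so that the tick values do not have to be listed by hand.  The template name summary_sketch and the five-year multiplier are my assumptions, and the template reproduces only the scatter and the x-axis, not the axis table or the text labels.  INTERVALMULTIPLIER may not be available in older SAS 9.4 maintenance releases.

/* Sketch only: hypothetical template name, x-axis portion of the graph */
proc template;
  define statgraph summary_sketch;
    begingraph;
      layout overlay /
          /* one tick every 5 years instead of a hard-coded VALUES list */
          xaxisopts=(type=time display=(ticks tickvalues line)
                     timeopts=(interval=year intervalmultiplier=5
                               tickvalueformat=year4.))
          yaxisopts=(display=none);
        scatterplot x=date y=y / markerattrs=(symbol=circlefilled size=5);
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=dots template=summary_sketch;
run;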

Full SAS 9.4 code: graphicalsummary  

About Author

Sanjay Matange

Director, R&D

Sanjay Matange is R&D Director in the Data Visualization Division responsible for the development and support of the ODS Graphics system, including the Graph Template Language (GTL), Statistical Graphics (SG) procedures, ODS Graphics Designer and related software. Sanjay has co-authored a book on SG Procedures with SAS/PRESS.

4 Comments

  1. Great example! I never would have thought of this as a spark graph, but now that you've pointed it out I think it's a new category...a very lightweight, small graph that can be easily repeated for different cases, very visually obvious, and high content to area.

    Definitely a concept I hope to steal in the future!

    Tom

    • Sanjay Matange

      Please feel free to take the code and ideas. That is the purpose of the blog. I hope you will post any innovations you make.

  2. Rick Wicklin

    I think the second graph falls into the "small multiples" category. Although there is too much information to inline the graph, it is perfect for placing at the top/bottom of a medical report or for using in a panel of similar plots to visually compare a cohort of patients in a study.