Format your data like a social media superstar


In my industry of data and computer science, precision is typically regarded as a virtue. The more exact that you can be, the better. Many of my colleagues are passionate about the idea, which isn't surprising for a statistical software company.

But in social media, precision is a stigma -- especially when applied to the number of followers or connections that you have. Do you have so few connections that it can be represented as an exact number? Well, I wouldn't go bragging about that!

For example, let's look at this new LinkedIn joiner (blurred to protect his identity). As of this writing he has only -- and exactly -- 15 connections! (For the record, I consider this guy to be a rising star; I'm sure that his connections will skyrocket very soon.)

LinkedIn newcomer
Compare this to a more seasoned networker, shown below. LinkedIn won't tell us how many connections he has; perhaps they are beyond measure. We can see only that he has 500 or more LinkedIn connections. It might be 501 or it might be 500 million. It doesn't matter, according to LinkedIn. All you need to know is that this guy is a highly sought-after connection.

Let's create a SAS format to count connections, LinkedIn-style. It's simple: for cases of 500 or more, the label should be "500+". For fewer than 500, the label will be the embarrassingly precise actual number.

libname library (work); /* "library" is automatically searched for formats */
proc format lib=library;
 value linkedin    
  500 - high ='500+'
  /* all other values fall through to actual number */
/* test data */
data followers;
  length name $ 40 followers 8 displayed 8;
  format displayed linkedin.;
  infile datalines dsd;
  input name followers;
  displayed = followers;
Chris Hemedinger, 776
Kevin Bacon, 100543
Norman Newbie, 3
Colleen Connector, 499

Here's the result. Notice how the "displayed" value acts as sort of an equalizer among the super-connected. Once you've achieved a certain magnitude of connections, you're placed on par with all other "superstars".

Social sites like Instagram and Twitter operate on a different scale. While they still sacrifice precision when displaying a large number of "likes" or "followers", they do at least present a ballpark number that shows the magnitude of the activity. For example, instead of showing the exact follower count for Ronda Rousey (public figure, film star, MMA champ, and daughter of an experienced SAS presenter), Instagram simply truncates the number to the thousands and adds a "K".

To achieve the same in SAS, we can rely on a PICTURE format. This format contains the templates for what a number should look like, plus the multipliers to apply to get to the figure we want to display. Here's an example program, which I repurposed from the BIGMONEY example in SAS documentation.

proc format lib=library;
picture social (round)
 1E03-<1000000='000K' (mult=.001  )
 1E06-<1000000000='000.9M' (mult=.00001)
 1E09-<1000000000000='000.9B' (mult=1E-08)
 1E12-<1000000000000000='000.9T' (mult=1E-11);
/* test data */ 
data likes (keep=actual displayed);
format actual comma20.
 displayed social.;
 do i=1 to 12;
  displayed = actual;

Here's the result:

As you can see, the "displayed" values show you the magnitude of big "likes" in a cool-but-not-so-eagerly-precise way.

Here's my complete test program if you want to try it yourself. Happy networking!


About Author

Chris Hemedinger

Senior Manager, SAS Online Communities

+Chris Hemedinger is the manager of SAS Online Communities. He's also co-author of the popular SAS for Dummies book, author of Custom Tasks for SAS Enterprise Guide using Microsoft .NET, and a frequent participant on the SAS Enterprise Guide discussion forum.


  1. This may be a silly question, but is there a reason you used written-out numbers for the social format instead of like this?

    1E03-<1E06='000K' (mult=.001 )
    1E06-<1E09='000.9M' (mult=.00001)
    1E09-<1E12='000.9B' (mult=1E-08)
    1E12-<1E15='000.9T' (mult=1E-11);

    • Chris Hemedinger
      Chris Hemedinger on


      As I mentioned, I repurposed an example from the SAS documentation for PROC FORMAT. The example used the two different styles. I kept it that way because it shows that you can use scientific notation in SAS syntax -- something that's easy to forget.

  2. This kind of format makes great style for for graph scale values where numbers can have a broad range of values

Leave A Reply

Back to Top