How to handle percent (%) values in SAS

Since 2013 is the International Year of Statistics, I wanted to make sure everyone knows how to handle my favorite statistic, percent (%), in SAS!

I often see data in spreadsheets, CSV files, etc. that purport to represent "percents"... but you have to be a bit careful when working with those values. Are they the raw numeric values (such as .125), or are they the formatted values (such as 12.5%)?

In SAS, to represent 12.5%, you'll want to store the numeric value .125, and then apply the 'percent.' format so that it will print out (in tables, graphs, etc) as 12.5%.

Something like this...

data foo;
format y percent7.1;
y=.125;
run;

proc print data=foo;
run;
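
If the values arrive as formatted text instead (for example, a CSV cell that literally contains 12.5%), one option is to convert them to the raw fraction first. The sketch below is only an illustration: the dataset and variable names (from_text, raw_text, y) and the sample values are made up, and it relies on the percent informat dividing by 100 when the text ends with a percent sign.

```sas
/* Sketch: convert text such as "12.5%" into the raw value .125      */
/* (dataset name, variable names, and sample values are hypothetical) */
data from_text;
  input raw_text $;
  /* the percent informat divides by 100 when a % sign follows the number */
  y = input(raw_text, percent8.);
  format y percent7.1;
  datalines;
12.5%
3%
;
run;

proc print data=from_text;
run;
```

If the column instead holds plain numbers that were already multiplied by 100 (12.5 rather than 12.5%), dividing by 100 in the data step gets you to the same raw value.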

That was simple enough! But what about a graph where you want the axis to show values rounded to a whole percent, while the pointlabels for the markers in the plot show more precision (let's say 2 decimal places)?

Yes - you can do that in SAS!

I recommend using a simple format statement to control the graph's axis, and then creating an extra/temporary text variable that contains the marker values, formatted as a percent with 2 decimal places, to use as the pointlabels.

data my_data;
set my_data;
length custom_text $50;
custom_text=put(percent_value,percent8.2);
run;

symbol1 color=red value=circle height=6 pointlabel=(height=11pt color=blue "#custom_text");

proc gplot data=my_data;
format percent_value percent7.0;
plot percent_value*letter;
run;

Now the axis shows nice/rounded values (with no decimal places), and the pointlabels show 2 decimal places:)
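
The snippets above assume a dataset named my_data that already has the variables letter and percent_value. To experiment without the original data, a small stand-in could be created first; the values below are made up purely for illustration.

```sas
/* Hypothetical stand-in for my_data:                      */
/* variable names match the example, values are invented   */
data my_data;
  input letter $ percent_value;
  datalines;
A 0.1234
B 0.2567
C 0.0891
D 0.3308
;
run;
```

Run this before the data step that adds custom_text, then run the symbol statement and proc gplot as shown.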

Hopefully this simple example will help teach you the basics, so you can customize your graphs in endless ways!

Here is the complete SAS code for the above example.

About Author

Robert Allison

The Graph Guy!

Robert has worked at SAS for over a quarter century, and his specialty is customizing graphs and maps - adding those little extra touches that help them answer your questions at a glance. His educational background is in Computer Science, and he holds a BS, MS, and PhD from NC State University.

26 Comments

  1. Hi, I have the % format working OK; however, I'm only interested in the output when it is greater than 50%.

    When I write it as either >50 or >50%, it doesn't give me the expected results.

    • Dan Heath

      What you'll want to do in that case is modify the data step to copy the percent value into custom_text only when the percent value is greater than 0.50. (The stored value for 50% is 0.50, so a comparison against 50 will not behave as intended.)
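
      A minimal sketch of that change, reusing the my_data, percent_value, and custom_text names from the post (the 0.50 cutoff is the 50% from the question, written as the raw stored value):

      ```sas
      data my_data;
        set my_data;
        length custom_text $50;
        /* compare against the raw fraction (0.50), not 50 or 50% */
        if percent_value > 0.50 then custom_text = put(percent_value, percent8.2);
        /* otherwise custom_text stays blank, so that marker should get no label */
      run;
      ```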

  2. Hello, is there any way to show the output with commas instead of dots, like 8,2% instead of 8.2%?
    Thanks

    • Robert Allison

      You can set your 'locale' to a region that uses a comma as the decimal separator (such as German_Germany), and then use the nlpct format. Here is an example:

      options LOCALE=German_Germany;

      data foo;
      format y nlpct7.1;
      y=.125;
      run;

      proc print data=foo;
      run;

  3. BANDJA TCHOUNKE Etienne Claude

    Hi all !

    I'm a brand-new beginner in SAS Enterprise Guide and have never used SAS; I have always used MS Excel. I'm in Central Africa, a French speaker, and I'm looking for help with SAS EG.
    How can I obtain the same results with SAS EG?

    Thanks a lot

  4. Hi Robert, very useful tips!
    Actually, I'm working with a table of very low percent values, so I formatted them with lots of decimals, like this:
    1.6768966%
    .43991909%
    The thing is that, with this bunch of numbers, I need to force a 0 (zero) in the integer part for numbers lower than 1 so they aren't visually confusing, like this:
    1.6768966%
    0.43991909%
    How can I do that?
    Thanks in advance!
    Ed.

    • Robert Allison

      When you specify your format, just make sure to specify a length long enough to hold all the digits (when you decide on your total length, be sure to count the decimal point, the % character, and the possibility of a minus sign -- I usually make it a little longer than I think it might need to be).

      data foo;
      x=.016768966; output;
      x=.0043991909; output;
      run;

      proc print data=foo;
      format x percent15.10;
      run;

      • If I wanted to show 0.2125 as 21.25%, "percent6.2" should work but it doesn't. It only works when I use percent8.2 instead. Do you know why that could be so? "21.25%" only has six characters.

        • Robert Allison

          I believe this is because the percent format allocates characters for the possibility of a negative value. For example, -0.2125, expressed as a % using the percent8.2 format would be (21.25%).

          data foo;
          format a percent8.2;
          a=-0.2125;
          run;
          proc print data=foo; run;
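
          A quick way to see the effect is to write the same value with a few widths to the log. This is only a sketch with arbitrary variable names; it also tries the percentn format, which (as far as I know) writes negative values with a minus sign instead of parentheses.

          ```sas
          data _null_;
            do a = 0.2125, -0.2125;
              w6 = put(a, percent6.2);   /* the width the question tried */
              w8 = put(a, percent8.2);   /* room for the digits, the %, and the parentheses */
              n8 = put(a, percentn8.2);  /* minus-sign variant */
              put a= w6= w8= n8=;        /* compare the three results in the log */
            end;
          run;
          ```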

  5. Pingback: How to handle time values in SAS | SAS Training

  6. Hi, I have already created two data sets, and I need to solve the second part of my school project.
    This is the project:

    The year-end stock values for 20 companies during 2000-2009 are stored in a text file called 'annual_stock_price.txt'.

    stock1 ='Microsoft'
    stock2 ='Apple Inc'
    stock3 ='Walmart'
    stock4 ='BJ'
    stock5 ='Costco'
    stock6 ='3M'
    stock7 ='Eli Lilly'
    stock8 ='Pfizer'
    stock9 ='Exelon'
    stock10='Toyota'

    stock11='Ford'
    stock12='GM'
    stock13='American Airline'
    stock14='Johnson & Johnson'
    stock15='Exxon'
    stock16='Georgia Pacific'
    stock17='American Standard'
    stock18='Northrop Grumman'
    stock19='Sears'
    stock20='Peco'

    *task1 - calculate the percentage return of each stock,
    using equation return=(stock value of this year-stock value of last year)/stock value of last year*100
    and average return for that stock from year 2001-2009.

    *task2 - calculate the grand average percentage return for all the stocks from year 2001-2009

    *task3 - Read in another file called 'annual_dividends.txt' containing dividends during year 2001-2009. Find out highest_dividend, lowest_dividend for each company.

    *task4 - generate a report listing stock name, percentage return from year 2001-2009, average return for that stock, comment, highest_dividend, lowest_dividend. The logic for defining the variable 'comment' is as follows:
    if average return of a stock is smaller than the grand average percentage return then, comment='poor performance'
    else comment='good performance'

    The second set is the annual dividend information for 20 companies during 2001-2009, stored in a text file called 'annual_dividend.txt'.

    stock1 ='Microsoft'
    stock2 ='Apple Inc'
    stock3 ='Walmart'
    stock4 ='BJ'
    stock5 ='Costco'
    stock6 ='3M'
    stock7 ='Eli Lilly'
    stock8 ='Pfizer'
    stock9 ='Exelon'
    stock10='Toyota'

    stock11='Ford'
    stock12='GM'
    stock13='American Airline'
    stock14='Johnson & Johnson'
    stock15='Exxon'
    stock16='Georgia Pacific'
    stock17='American Standard'
    stock18='Northrop Grumman'
    stock19='Sears'
    stock20='Peco'

    *task1 - get the highest dividend value and lowest dividend value for each company from year 2001-2009
    *task2 - and combine it to the report with percentage return, so that the report has the following items :
    company name
    percentage return from year 2001-2009
    average return for that company
    comment
    highest dividend value
    lowest dividend value
    Generate the report in RTF format and save to your PC.

    There are two text files. The first file is annual_dividend.txt:

    2001 1.60 2.28 2.12 1.68 1.53 1.12 0.67 1.52 1.18 1.18 1.10 1.14 1.33 1.76 1.74 2.30 0.93 0.84 0.92 2.00
    2002 1.65 2.50 2.16 1.68 1.63 1.16 0.69 1.60 1.24 1.26 1.14 1.20 1.43 1.84 1.86 2.38 0.97 0.89 1.00 2.16
    2003 1.70 2.63 2.16 1.68 1.73 1.21 0.75 1.68 1.30 1.36 1.18 1.28 1.55 1.92 1.98 2.46 1.07 0.97 1.08 2.32
    2004 1.75 2.70 2.16 1.68 1.83 1.27 0.80 1.74 1.37 1.46 1.22 1.38 1.69 2.00 2.09 2.56 1.16 1.08 1.16 2.48
    2005 1.80 2.86 2.16 1.68 1.91 1.32 0.90 1.78 1.45 1.56 1.26 1.52 1.83 2.08 2.16 2.57 1.19 1.18 1.24 2.64
    2006 1.86 2.94 2.18 1.68 1.99 1.37 1.00 1.83 1.52 1.63 1.29 1.66 1.96 2.18 2.20 2.66 1.29 1.30 1.32 2.77
    2007 2.12 3.02 2.23 1.68 2.07 1.44 1.15 1.89 1.59 1.70 1.34 1.72 2.07 2.28 2.24 2.74 1.44 1.38 1.40 2.86
    2008 2.60 3.10 2.30 1.68 2.15 1.52 1.35 1.95 1.63 1.75 1.40 1.78 2.17 2.38 2.28 2.84 1.57 1.46 1.50 2.91
    2009 2.82 3.16 2.40 1.76 2.23 1.60 1.64 2.00 1.67 1.79 1.46 1.86 2.27 2.48 2.32 2.95 1.66 1.52 1.60 2.96

    The second file is annual_stock_price.txt:

    2000 30.50 62.01 62.07 89.62 12.43 41.04 20.58 51.44 81.10 31.08 10.06 71.06 41.25 21.68 51.64 32.21 30.87 40.79 60.84 111.85
    2001 35.16 63.32 72.12 83.68 14.53 51.12 24.67 51.52 81.18 31.18 21.10 71.14 41.33 21.76 55.74 52.30 30.93 50.84 55.92 122.00
    2002 41.45 67.50 90.22 78.66 19.63 61.16 26.69 61.60 81.24 41.26 21.14 70.20 51.43 31.84 56.86 42.38 30.97 40.89 51.00 132.16
    2003 60.75 64.48 102.16 66.68 21.73 61.21 30.75 65.68 91.30 51.36 31.18 70.28 51.55 41.92 41.98 62.46 31.07 55.97 51.08 142.32
    2004 42.17 69.23 92.16 70.68 23.83 61.27 40.80 76.74 91.37 61.46 41.22 72.38 61.69 42.00 32.09 72.56 41.16 61.08 51.16 152.48
    2005 56.11 72.34 82.16 72.78 25.91 71.32 30.90 66.78 93.45 71.56 51.26 73.52 61.83 52.08 42.16 82.57 51.19 71.18 41.24 122.64
    2006 63.13 79.45 79.18 74.36 31.99 91.37 31.00 61.83 99.52 71.63 65.29 77.66 71.96 62.18 52.20 92.66 61.29 41.30 51.32 122.77
    2007 65.23 83.23 92.23 70.16 32.07 81.44 36.15 51.89 93.59 81.70 77.34 71.72 82.07 72.28 62.24 102.74 51.44 44.38 54.40 112.86
    2008 66.25 93.10 122.13 74.22 42.15 91.52 31.35 51.95 94.63 91.75 76.40 71.78 82.17 92.38 72.28 82.84 51.57 51.46 52.15 102.91
    2009 67.28 99.15 152.40 75.23 52.23 99.60 32.64 57.20 91.67 98.79 81.46 70.86 92.27 102.48 62.32 72.95 51.66 61.52 54.16 112.96

    If you can help me out today, I would appreciate it.

    My solution for the first part is below; I need help with the rest of the project.

    data DIVIDEND;
    infile
    '/folders/myfolders/Exdata/homework/annual_dividends.txt';
    input year $ stock1 stock2 stock3 stock4 stock5 stock6 stock7 stock8 stock9 stock10 stock11 stock12 stock13 stock14 stock15 stock16 stock17 stock18 stock19 stock20;
    run;
    PROC PRINT DATA=DIVIDEND;
    RUN;

    data STOCK;
    infile
    '/folders/myfolders/Exdata/homework/annual_stock_price.txt';
    input year $ stock1 stock2 stock3 stock4 stock5 stock6 stock7 stock8 stock9 stock10 stock11 stock12 stock13 stock14 stock15 stock16 stock17 stock18 stock19 stock20;
    run;
    PROC PRINT DATA = STOCK;
    RUN;

    Pratima

    • Robert Allison

      Perhaps this would be a better topic for the discussion forums in communities.sas.com (assuming your teacher allows you to get outside help on this school project?)

  7. calculate the percentage return of each stock,
    using equation return=(stock value of this year-stock value of last year)/stock value of last year*100
    and average return for that stock from year 2001-2009.

    • Robert Allison

      Wouldn't you want to use the SAS percent format, rather than multiplying by 100?!? :)
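
      As a rough sketch of that idea: compute the return as a plain fraction and let the format supply the scaling and the % sign. The three prices are the first three stock1 values from the comment above; the dataset and variable names (returns, price, prev, ret) are just for illustration.

      ```sas
      data returns;
        input year price;
        prev = lag(price);   /* previous year's value; missing on the first row */
        if prev ne . then ret = (price - prev) / prev;   /* no "*100" here */
        format ret percent8.2;   /* the format does the x100 and adds the % */
        datalines;
      2000 30.50
      2001 35.16
      2002 41.45
      ;
      run;

      proc print data=returns;
      run;
      ```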

  8. Excellent!!!

    You made my day!!!!!!!!!

    As a new, new beginner SAS user, this and other tips are encouraging. Thank you so much, Robert.

  9. Pingback: Top 10 SAS Training Post blogs of 2013 | The SAS Training Post

  10. Why do you need length $50 for the labels that clearly have length 8? Wouldn't length $8 save space when a large dataset is processed? Since this is a percent relative to the total, its value cannot be larger than 100%.

    Also, I got the following:
    WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.

    • P.S. I know how to fix the warning, I just don't like teaching examples that create warnings.

    • Robert Allison

      When I create a text variable, I like to err on the side of making them a bit longer than needed, rather than taking a chance on making them too short. Memory is cheap these days, and that extra space can save you time & frustration down the road.

      Many times in my 20 years of programming, I have created text variables with the minimal length, and then later gone back and edited the program to add some additional text to the variable, and part of the text got truncated (because the variable wasn't long enough). It often took a bit of troubleshooting to figure out what the problem was. By making the text variables a bit longer than I need them, I typically avoid this problem.

      • This is true, and I agree some room is desirable, but length $50 is overkill here. Also, long text values do not print well in standard proc print, because SAS allocates the full length to display these short values. In order to have compact tables, I often had to assign much shorter formats for output.

        Also, I was surprised that the length $50 did not matter for the graph labels.

        • Robert Allison

          Certainly feel free to adapt the code to suit your needs! That's one of the great things about SAS - it gives each user total control over their own code :)

        • Robert Allison

          Let me give one example of why I generally declare this kind of text variable longer than the immediate need...

          I often use pointlabel text and/or html mouse-over text to do ad-hoc troubleshooting and verification while I'm creating a graph. I will often temporarily add various other variables to the text (for example, there might be a person's name, city, or other identifying information in the data). If I were using the minimum length for the text variable, then I'd have to remember to temporarily increase that length (and then shorten it again). And in the case of html hover-text, if the variable is too short, part of the text and the closing quote might get chopped off ... which would make the hover text for that plot marker not work. This is an example of why I like to declare such variables longer than might be needed for the immediate use.
