How to handle percent (%) values in SAS


Since 2013 is the International Year of Statistics, I want to make sure everyone knows how to handle my favorite statistic - percent (%) - in SAS!

I often see data in spreadsheets, CSV files, etc. that purports to represent "percents"... but you have to be a bit careful when working with those values. Are they the raw numeric values (such as .125), or are they the formatted values (such as 12.5%)?

In SAS, to represent 12.5%, you'll want to store the numeric value .125, and then apply the 'percent.' format so that it will print out (in tables, graphs, etc) as 12.5%.

Something like this...

data foo;
format y percent7.1;
y=.125;
run;

proc print data=foo;
run;
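
If the incoming data holds formatted values rather than raw ones, they can be converted before the format is applied. The following is a minimal sketch, not part of the original example: the data set name from_csv and the variables pct_text, pct_whole, y_from_text, and y_from_whole are made up for illustration. It assumes one column arrived as text with a % sign and another as whole-number percents.

```sas
data from_csv;
  /* Hypothetical input: one percent stored as text, one as a whole number */
  length pct_text $10;
  input pct_text $ pct_whole;

  /* The PERCENTw.d informat reads text such as 12.5% and divides by 100 */
  y_from_text = input(pct_text, percent10.);

  /* A number stored as 12.5 (meaning 12.5%) has to be divided by 100 by hand */
  y_from_whole = pct_whole / 100;

  /* Both variables now hold the raw value, so the percent format applies */
  format y_from_text y_from_whole percent7.1;
  datalines;
12.5% 12.5
3% 3
;
run;

proc print data=from_csv;
run;
```

Both converted variables should hold .125 and .03, which is the form the 'percent.' format expects.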

That was simple enough!... but what about a graph where you want to show the value rounded to a whole percent on the axis, but show more precision (let's say 2 decimal places) in the pointlabels for the markers in the plot???

Yes - you can do that in SAS!

I recommend using a simple format statement to control the graph's axis, and then creating an extra/temporary text variable that contains the marker values, formatted to show a percent with 2 decimal places, to use as the pointlabels.

data my_data;
set my_data;
length custom_text $50;
custom_text=put(percent_value,percent8.2);
run;

symbol1 color=red value=circle height=6 pointlabel=(height=11pt color=blue "#custom_text");

proc gplot data=my_data;
format percent_value percent7.0;
plot percent_value*letter;
run;

Now the axis shows nice/rounded values (with no decimal places), and the pointlabels show 2 decimal places:)
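
To try the steps above without your own data, here is a small sketch that builds a made-up my_data and prints it, so that the numeric variable and the text label can be compared side by side before plotting. The three letter/percent_value rows are invented sample values, not the data behind the original graph.

```sas
/* Made-up sample data; the values are for illustration only */
data my_data;
  input letter $ percent_value;

  /* Text copy of the value with 2 decimal places, for the pointlabels */
  length custom_text $50;
  custom_text = put(percent_value, percent8.2);

  /* Whole-percent format on the numeric variable, as used for the axis */
  format percent_value percent7.0;
  datalines;
A 0.1234
B 0.4567
C 0.2891
;
run;

/* Check both representations of each value before graphing */
proc print data=my_data;
run;
```

For the first row, percent_value should print as 12% while custom_text holds 12.34%. The symbol and gplot statements shown earlier can then be run against this data set as they are.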

 

 

Hopefully this simple example will help teach you the basics, so you can customize your graphs in endless ways!

Here is the complete SAS code for the above example.

 


About Author

Robert Allison

The Graph Guy!

Robert has worked at SAS for over 20 years, and is perhaps the foremost expert in creating custom graphs using SAS/GRAPH. His educational background is in Computer Science, and he holds a BS, MS, and PhD from NC State University. He is the author of several conference papers, has won a few graphic competitions, and has written a book (SAS/GRAPH: Beyond the Basics).

13 Comments

  1. Why do you need length $50 for the labels that clearly have length 8? Wouldn't length $8 save space when a large dataset is processed? Since this is a percent relative to the total, its value cannot be larger than 100%.

    Also, I got the following:
    WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.

    • Robert Allison

      When I create a text variable, I like to err on the side of making them a bit longer than needed, rather than taking a chance on making them too short. Memory is cheap these days, and that extra space can save you time & frustration down the road.

      Many times in my 20 years of programming, I have created text variables with the shortest possible length, and then later gone back and edited the program to add some additional text to the variable, and part of the text got truncated (because the variable wasn't long enough). It often took a bit of troubleshooting to figure out what the problem was. By making the text variables a bit longer than I need them, I typically avoid this problem.

      • This is true, and I agree some room is desirable, but length $50 is overkill here. Also, long text variables do not print well in a standard proc print, because SAS allocates the full length to display these short values. In order to have compact tables, I often had to assign much shorter formats for output.

        Also, I was surprised that the length $50 did not matter for the graph labels.

        • Robert Allison

          Certainly feel free to adapt the code to suit your needs! That's one of the great things about SAS - it gives each user total control over their own code :)

        • Robert Allison

          Let me give one example of why I generally declare this kind of text variable longer than the immediate need...

          I often use pointlabel text and/or html mouse-over text to do ad-hoc troubleshooting and verification while I'm creating a graph. I will often temporarily add various other variables to the text (for example, there might be a person's name, city, or other identifying information in the data). If I were using the minimum length for the text variable, then I'd have to remember to temporarily increase that length (and then shorten it again). And in the case of html hover-text, if the variable is too short, part of the text and the closing quote might get chopped off ... which would make the hover text for that plot marker not work. This is an example of why I like to declare such variables longer than might be needed for the immediate use.

  2. Pingback: Top 10 SAS Training Post blogs of 2013 | The SAS Training Post

  3. Excellent!!!!!!!!!!!!!!!!!11

    You made my day!!!!!!!!!

    As a new, new beginner SAS user, this and other tips are encouraging. Thank you so much, Robert.

  4. calculate the percentage return of each stock,
    using equation return=(stock value of this year-stock value of last year)/stock value of last year*100
    and average return for that stock from year 2001-2009.

  5. Hi, I have created two data sets initially, and I need to solve the second part of my school project.
    This is the project I have:

    Information on the year-end stock value for 20 companies during 2000-2009 is stored in a text file called 'annual_stock_price.txt'

    stock1 ='Microsoft'
    stock2 ='Apple Inc'
    stock3 ='Walmart'
    stock4 ='BJ'
    stock5 ='Costco'
    stock6 ='3M'
    stock7 ='Eli Lilly'
    stock8 ='Pfizer'
    stock9 ='Exelon'
    stock10='Toyota'

    stock11='Ford'
    stock12='GM'
    stock13='American Airline'
    stock14='Johnson & Johnson'
    stock15='Exxon'
    stock16='Georgia Pacific'
    stock17='American Standard'
    stock18='Northrop Gruman'
    stock19='Sears'
    stock20='Peco'

    *task1 - calculate the percentage return of each stock,
    using equation return=(stock value of this year-stock value of last year)/stock value of last year*100
    and average return for that stock from year 2001-2009.

    *task2 - calculate the grand average percentage return for all the stocks from year 2001-2009

    *task3 - Read in another file called 'annual_dividends.txt' containing dividends during year 2001-2009. Find out highest_dividend, lowest_dividend for each company.

    *task4 - generate a report listing stock name, percentage return from year 2001-2009, average return for that stock, comment, highest_dividend, lowest_dividend. The logic for defining the variable 'comment' is as follows:
    if average return of a stock is smaller than the grand average percentage return then, comment='poor performance'
    else comment='good performance'

    The second set is information on the annual dividend for 20 companies during 2001-2009, stored in a text file called 'annual_dividend.txt'

    stock1 ='Microsoft'
    stock2 ='Apple Inc'
    stock3 ='Walmart'
    stock4 ='BJ'
    stock5 ='Costco'
    stock6 ='3M'
    stock7 ='Eli Lilly'
    stock8 ='Pfizer'
    stock9 ='Exelon'
    stock10='Toyota'

    stock11='Ford'
    stock12='GM'
    stock13='American Airline'
    stock14='Johnson & Johnson'
    stock15='Exxon'
    stock16='Georgia Pacific'
    stock17='American Standard'
    stock18='Northrop Gruman'
    stock19='Sears'
    stock20='Peco'

    *task1 - get the highest dividend value and lowest dividend value for each company from year 2001-2009
    *task2 - and combine it to the report with percentage return, so that the report has the following items :
    company name
    percentage return from year 2001-2009
    average return for that company
    comment
    highest dividend value
    lowest dividend value
    Generate the report in RTF format and save to your PC.

    There are two text files. The first file is annual_dividend.txt

    2001 1.60 2.28 2.12 1.68 1.53 1.12 0.67 1.52 1.18 1.18 1.10 1.14 1.33 1.76 1.74 2.30 0.93 0.84 0.92 2.00
    2002 1.65 2.50 2.16 1.68 1.63 1.16 0.69 1.60 1.24 1.26 1.14 1.20 1.43 1.84 1.86 2.38 0.97 0.89 1.00 2.16
    2003 1.70 2.63 2.16 1.68 1.73 1.21 0.75 1.68 1.30 1.36 1.18 1.28 1.55 1.92 1.98 2.46 1.07 0.97 1.08 2.32
    2004 1.75 2.70 2.16 1.68 1.83 1.27 0.80 1.74 1.37 1.46 1.22 1.38 1.69 2.00 2.09 2.56 1.16 1.08 1.16 2.48
    2005 1.80 2.86 2.16 1.68 1.91 1.32 0.90 1.78 1.45 1.56 1.26 1.52 1.83 2.08 2.16 2.57 1.19 1.18 1.24 2.64
    2006 1.86 2.94 2.18 1.68 1.99 1.37 1.00 1.83 1.52 1.63 1.29 1.66 1.96 2.18 2.20 2.66 1.29 1.30 1.32 2.77
    2007 2.12 3.02 2.23 1.68 2.07 1.44 1.15 1.89 1.59 1.70 1.34 1.72 2.07 2.28 2.24 2.74 1.44 1.38 1.40 2.86
    2008 2.60 3.10 2.30 1.68 2.15 1.52 1.35 1.95 1.63 1.75 1.40 1.78 2.17 2.38 2.28 2.84 1.57 1.46 1.50 2.91
    2009 2.82 3.16 2.40 1.76 2.23 1.60 1.64 2.00 1.67 1.79 1.46 1.86 2.27 2.48 2.32 2.95 1.66 1.52 1.60 2.96

    The second file is annual_stock_price.txt

    2000 30.50 62.01 62.07 89.62 12.43 41.04 20.58 51.44 81.10 31.08 10.06 71.06 41.25 21.68 51.64 32.21 30.87 40.79 60.84 111.85
    2001 35.16 63.32 72.12 83.68 14.53 51.12 24.67 51.52 81.18 31.18 21.10 71.14 41.33 21.76 55.74 52.30 30.93 50.84 55.92 122.00
    2002 41.45 67.50 90.22 78.66 19.63 61.16 26.69 61.60 81.24 41.26 21.14 70.20 51.43 31.84 56.86 42.38 30.97 40.89 51.00 132.16
    2003 60.75 64.48 102.16 66.68 21.73 61.21 30.75 65.68 91.30 51.36 31.18 70.28 51.55 41.92 41.98 62.46 31.07 55.97 51.08 142.32
    2004 42.17 69.23 92.16 70.68 23.83 61.27 40.80 76.74 91.37 61.46 41.22 72.38 61.69 42.00 32.09 72.56 41.16 61.08 51.16 152.48
    2005 56.11 72.34 82.16 72.78 25.91 71.32 30.90 66.78 93.45 71.56 51.26 73.52 61.83 52.08 42.16 82.57 51.19 71.18 41.24 122.64
    2006 63.13 79.45 79.18 74.36 31.99 91.37 31.00 61.83 99.52 71.63 65.29 77.66 71.96 62.18 52.20 92.66 61.29 41.30 51.32 122.77
    2007 65.23 83.23 92.23 70.16 32.07 81.44 36.15 51.89 93.59 81.70 77.34 71.72 82.07 72.28 62.24 102.74 51.44 44.38 54.40 112.86
    2008 66.25 93.10 122.13 74.22 42.15 91.52 31.35 51.95 94.63 91.75 76.40 71.78 82.17 92.38 72.28 82.84 51.57 51.46 52.15 102.91
    2009 67.28 99.15 152.40 75.23 52.23 99.60 32.64 57.20 91.67 98.79 81.46 70.86 92.27 102.48 62.32 72.95 51.66 61.52 54.16 112.96

    If you can help me out today, I would appreciate it.

    My solution for the first part is below, and I need help with the other part of the project.

    data DIVIDEND;
    infile
    '/folders/myfolders/Exdata/homework/annual_dividends.txt';
    input year $ stock1 stock2 stock3 stock4 stock5 stock6 stock7 stock8 stock9 stock10 stock11 stock12 stock13 stock14 stock15 stock16 stock17 stock18 stock19 stock20;
    run;
    PROC PRINT DATA=DIVIDEND;
    RUN;

    data STOCK;
    infile
    '/folders/myfolders/Exdata/homework/annual_stock_price.txt';
    input year $ stock1 stock2 stock3 stock4 stock5 stock6 stock7 stock8 stock9 stock10 stock11 stock12 stock13 stock14 stock15 stock16 stock17 stock18 stock19 stock20;
    run;
    PROC PRINT DATA = STOCK;
    RUN;

    Pratima

    • Robert Allison

      Perhaps this would be a better topic for the discussion forums in communities.sas.com (assuming your teacher allows you to get outside help on this school project?)

  6. Pingback: How to handle time values in SAS | SAS Training
