Truncating decimal numbers in SAS without rounding

28

paper-money-stackImagine making $50K a day out of thin air. Did you know that NASDAQ routinely processes around 10,000,000 trades a day? What if instead of rounding cents for each transaction, market makers truncated fractions of cents in the amount they owe you? Under the assumption that each transaction, on average, has half a cent that is usually rounded away, this would produce 10,000,000 x $0.005 = $50,000 and nobody would even notice it. I am not saying it's legal, but this example is just an illustration of the power of ordinary truncation.

Handpicked Related Content: Truncating vs rounding SAS time values

However, sometimes it is necessary to truncate displayed numeric values to a specified number of decimal places without rounding. For example, if we need to truncate 3.1415926 to 4 decimal places without rounding, the displayed number would be 3.1415 (as compared to the rounded number, 3.1416).

If you think you can truncate numeric values by applying SAS w.d format, think again.

Try running this SAS code:

data _null_;
   x=3.1415926;
   put x= 6.4;
run;

If you expect to get x=3.1415, you will be disappointed. Surprisingly, you will get x=3.1416, which means that SAS format does not truncate the number, but rounds it. Same is true for the DOLLARw.d and COMMAw.d formats.

After running into this problem, I thought to instead use a SAS function to truncate numbers. The TRUNC function comes to mind. Indeed, if you look up the SAS TRUNC function, you will find that it does truncate numeric values, but (surprise!) not to a specified number of decimal places; rather it truncates to a specified number of bytes, which is not the same thing for numerics. This may be useful for evaluating the precision of numeric values, but has no direct bearing on our problem of truncating numeric values to a set number of decimal places.

What turned to be interesting is that the Excel TRUNC function does exactly what we need – it truncates values to a set number of decimal places (removes decimals) without rounding:

truncating-decimal-numbers-in-sas-without-rounding
Truncating numbers in Excel with the TRUNC Function

In general, the technique of number truncation should be limited to reporting purposes when displayed numbers are required to appear truncated. Be careful not to apply truncation before using the numbers in calculations, as you might get some seriously inaccurate results, even worse than when you round numbers before calculations. Unless, of course, your goal is to get inaccurate results, which is quite an honorable goal in fraud detection, simulation and investigation.

I can see two possible solutions to number truncations:

Solution 1: Numeric truncation

Let’s say we need to truncate the following number X.XXXXXXX , keeping only the red digits (that is get rid of all decimal digits after the 4th decimal place).

We can do it in 3 steps:

  1. Multiply our number by 104, effectively making the decimals part of a whole number (shifting the decimal point 4 positions to the right).
  2. Apply INT() function that truncates the decimal portion, keeping only the whole portion from the previous step.
  3. Divide the result of step 2 by 104, effectively restoring the order disturbed by step 1 (shifting the decimal point 4 positions to the left).

Here is SAS code implementing this algorithm:

%let d = 4; /* d must be a whole number: 0, 1, 2... */
 
data _null_;
   x = 3.1415926;
   p = 10**&d;
   y = int(x*p)/p;
   put x= / y=;
run;

If we run this code SAS log will show the following (expected and desired) results:

x=3.1415926
y=3.1415

WARNING: While in the SAS code example above the int() function might be substituted with the floor() function, for negative numbers the floor() function would produce incorrect results. For negative numbers, the ceil() function is the correct choice. However, the int() function does exactly what we need for both positive and negative numbers.

Solution 2: Character truncation

We can also solve our truncation problem by converting numeric value into character, and then use character functions to get rid of extra digits. The result can be converted back into numeric if necessary. Let’s solve the same problem, to truncate the following number X.XXXXXXX keeping only the red digits.

Using this character approach we can also do it in 3 steps:

  1. Convert the numeric value into a character value and assign it to a new character variable.
  2. Determine position of a decimal point in the character string.
  3. Sub-string our initial character string to keep only 4 characters after decimal point.

Here is SAS code implementing this algorithm:

%let d = 4; /* d must be a whole number: 0, 1, 2... */
 
data _null_;
   x = 3.1415926;
   y = put(x,best.);
   y = substr(y,1,find(y,'.')+&d);
   put x= / y=;
run;

If we run this code SAS log will show the following results:

x=3.1415926
y=3.1415

As you can see, these results are correct and identical to the results of numeric truncation.

Both numeric and character truncation methods work for positive and negative numbers.

User-defined functions

We can also implement the above two methods as user-defined functions, say truncn() and truncc(), using PROC FCMP:

proc fcmp outlib = sasuser.functions.truncations;
   function truncn (x,d);
      p = 10**d;
      y = int(x*p)/p;
      return(y);
   endsub;
 
   function truncc (x,d);
      y = put(x,best.);
      y = substr(y,1,find(y,'.')+d);
      return(y);
   endsub;
run;

Then we can use those user-defined functions truncn() and truncc() as illustrated in the following SAS code sample:

options cmplib=sasuser.functions;
 
data A;
   length x n 8 c $9;
   input x;
   n = truncn(x,4);
   c = truncc(x,4);
   datalines;
3.1415926
-3.1415926
;

This code will produce the following A dataset:

truncating-decimal-numbers-in-sas-without-rounding-02

Notice that variables x and n are of numeric type while variable c is of character type.

Questions

  1. Which of these two methods of decimal number truncation do you like more? Why?
  2. Does it make sense to use these methods as user-defined functions? Why?
Handpicked Related Content: Truncating vs rounding SAS time values
Share

About Author

Leonid Batkhan

Leonid Batkhan is a long-time SAS consultant and blogger. Currently, he is a Lead Applications Developer at F.N.B. Corporation. He holds a Ph.D. in Computer Science and Automatic Control Systems and has been a SAS user for more than 25 years. From 1995 to 2021 he worked as a Data Management and Business Intelligence consultant at SAS Institute. During his career, Leonid has successfully implemented dozens of SAS applications and projects in various industries. All posts by Leonid Batkhan >>>

28 Comments

  1. Gabriel Kavadas on

    Hello,
    I have an age value for ~ 3200 different individuals. I've converted the age from months to years which leads to having most individuals with an age in yrs with a decimal point.
    ex. 20.33, 32.234, 23.553, 44.224 etc...

    Is it possible to truncate all of the values for this variable?
    ex. 20.331 –> 20, 32.234 –> 32, 23.553 –> 23, 44.224 –> 44 etc...

    • Leonid Batkhan

      Yes, Gabriel, you can do it with the methods described in this post. Just define parameter d=0. However, since you need to truncate all decimals to arrive to a whole number and all your numbers are positive, you can simply use SAS' FLOOR( ) function which does exactly what you need.

  2. George Simler on

    I'll be honest. I seem to have a bug here.

    I have a variable value of a=629.44 (even if displayed with a format of 30.15.) When I do term1a=int(a*10**2) I get 62939. Its making no sense to me at all.

    • George Simler on

      Oops...let me correct. I get a value of term1a=(int(a*10**2) to be 62943. Had a typo in my original response. I cannot figure out why this would be happening.

      • Leonid Batkhan

        Hi George,
        I was not able to reproduce your problem. Here is the code that I ran:

        data  _null_;
           a=629.44;
           term1a=int(a*10**2);
           put a= term1a= 30.15;
        run;
        

        I am getting the following results in the SAS log:

        a=629.44 term1a=62944.000000000000000
        

        Could you please share your SAS code and information on the system you are running SAS on?

        • George Simler on

          I am running SAS 9.4 on a windows 10 platform. I can't really reproduce the code with a simple data _null_ unfortunately as its with an imported dataset.

          But here is the code I ran demonstrating this:

          data test;
             set input_data;
             _000_val_term1 = _000_value*10**2;    /*(this gives me a value of 62944 according to viewing the value in the dataset) */
             _000_val_term2 = int(_000_val_term1);  /* this gives m a value of 62943 according to viewing the value in the dataset) */
          run;
          

          I have zero idea why the heck this would be happening. If I look at the data in whatever format.

          I did fix it THIS way for my application to work (essentially I had to match calculated values to "hard-coded" variables from a different dataset that only has two decimals:

          _000_val_term1 = put(round(_000_value*10**2,.01), 30.3);   /* "best." did not work here */
          _000_val_term2 = int(_000_val_term1+0);
          

          So I had to combine your techniques for this to work for me. Weird right?

    • Leonid Batkhan

      This is exactly what this blog post talks about. Try using one of the described here methods (numeric truncation or character truncation), where your number of shown decimals d=1. For example, here is the code snippet using numeric truncation (I inserted your numbers after the datalines statement):

      %let d = 1; /* d must be a whole number: 0, 1, 2... */
       
      data _null_;
         input x;
         p = 10**&d;
         y = int(x*p)/p;
         put x= @12 y=;
         datalines;
      0.99939
      0.345
      1.893
      5.732
      9.19832
      ;
      

      If you run this code in SAS, you will get the following in the SAS log (y values are your desired truncated numbers):

      x=0.99939  y=0.9
      x=0.345    y=0.3
      x=1.893    y=1.8
      x=5.732    y=5.7
      x=9.19832  y=9.1
      

  3. The truncation and rounding function are just some toolings.
    When processing a lot of numbers the real practical issue is caused by not understanding the effects precison losing some digits by accident.
    For science where that difference doesn't matter, who cares. For accounting and bookkeeping (the common BI usage) really problematic as not being accepted by intended customers.

    Any way to overcome those?

    • Leonid Batkhan

      Ja, I don't quite understand your question. To overcome what? Truncation and rounding that I am talking about in this blog post are just are two different ways of representing output results; they have nothing to do with losing precision due to finite computer digit representation.

  4. Thank you for sharing. We have been living with rounding when we actually desired truncating.

    Now, how can we truncate time intervals? For instance, we have 07:30:35, but we want to show 07:30?

    Similarly, how can we round 07:50:20 downward to, say, 07:30 (that is, the half-hour interval)?

  5. Jonas V. Bilenas on

    I stand corrected. You need at least one or more spaces before the first digit selector for the negative value PREFIX to show in the output (truncated or rounded). Here is the modified code with 4 spaces before the fist digit selector for negative values. Results, by default are truncated.

    proc format;
    picture trunc low - < 0 = ' 9.9999' (prefix='-')
    0 - high = '9.9999'
    ;

    data _null_;
    x=3.1415926;
    y=-3.1415926;
    put y= trunc7.4 / x= trunc7.4;
    run;

    y=-3.1415
    x=3.1415

    Note that I modified the format call as trunc7.4 but that is optional (works with trunc.).

    Even after writing "The Power Of PROC FORMAT" book years ago I still am learning new stuff on PICTURE formats.

  6. Jonas V. Bilenas on

    You can try to use a picture format which, by default, truncates values.

    proc format;
    picture trunc low - < 0 = ' 9.9999' (prefix='-')
    0 - high = ' 9.9999'
    ;
    run;

    data _null_;
    x=3.1415926;
    y=-1*x;
    put x= 6.4
    /
    x=trunc.
    /
    y=7.4
    /
    y=trunc.
    ;
    run;

    x=3.1416
    x=3.1415
    y=-3.1416
    y=-3.1415

    Interesting that I needed a space after the first quote in the set up of digit selectorsto get the negative value displayed. To round results use the option (ROUND) after the name of the PICTURE FORMAT which did not require the leading space in SAS 9.2 on UNIX.

    • Leonid Batkhan

      Thank you, Jonas, for your alternative solution. The picture format works, but it's not quite flexible. One would have to define multiple picture formats for each number of decimal places to keep. In your example you used 9.9999 which only takes care of 4 decimal places. If you need 3 or 5, you would need to create other picture formats using 9.999 and 9.99999 respectively. What I am describing in this blog post has the number of decimal places as a parameter, so no re-programming needed for any number of decimal places to keep.

  7. Richard Baumann on

    Very good material! I would lean towards the first one (truncn) because it is more robust to handling numbers that have no decimal points. When truncc encounters an integer, it returns a blank. However, in truncn, when the value of d (number of decimal points) is moderately large (e.g. 5), then because truncn multiplies and then divides by a larger multiplicand and divisor (e.g. 10*5), the precision of the result returned by truncn is compromised (e.g., note the value 4.09999 returned by truncn in the example below). A third approach is to convert the incoming numeric value to a character string, as in truncc, but then parse it with a regular expression that allows for the original value to be returned if no decimal point is found, i.e.

       Function trunc_regex (num_value, decimals) $;
            Char_value = put (num_value, best.);
            Regex_ID = prxparse ("s/^(\s*-?\d*\.\d{" || strip (put (decimals, best.)) || "})\d*/$1/");
            Char_value = prxchange (regex_ID, 1, char_value);
            Return (char_value);
        Endsub;
    

    Here is an example comparing all three methods (with the parts of the regex elucidated):

    1          Proc fcmp outlib = work.functions.truncs;
    2          
    3              Function truncn (x,d);
    4          
    5                  p = 10**d;
    6                  y = int(x*p)/p;
    7          
    8                  return(y);
    9          
    10             Endsub;
    11
    

             
    12             Function truncc (x, d) $;
    13         
    14                 Y = put(x,best.);
    15         
    16         /*    uncomment these lines to make truncc able to handle integers    */
    17                 Decimal_point_location = find (y, ".");
    18         
    19                 If decimal_point_location
    20                 then return (substr (y, 1, decimal_point_position + d));
    21                 else return (y);
    22         */
    23                 Y = substr(y,1,find(y,'.')+d);
    24         
    25                 Return (y);
    26         
    
    27             Endsub;
    28  
    

           
    29             Function trunc_regex (num_value, decimals) $;
    30         
    31                 Char_value = put (num_value, best.);
    32         
    33                 Regex_ID = prxparse
    34         
    35                     (
    36                             "s"         /*  substitution operator       */
    37         
    38                         ||  "/"         /*  beginning of regex          */
    39         
    40                         ||          "^"     /*  anchor the matching pattern at the beginning of
    41                                                 the passed value    */
    42         
    43                         ||      "("         /*  beginning of capture buffer     */
    44         
    45                         ||          "\s*"   /*  allow for leading spaces in the value of
    46                                                 char_value     */
    47         
    48                         ||          "-?"    /*  optionally followed by a minus sign     */
    49         
    50                         ||          "\d*"   /*  followed by any number (including 0) of numeric digits     */
    51         
    52                         ||          "\."    /*  followed by a decimal point     */
    53         
    54                         ||          "\d{" || strip (put (decimals, best.)) || "}"   /*  followed
    55         
    56                                                 by the specified number of numeric digits after the
    57                                                 decimal point  */
    58         
    59                         ||      ")"         /*  end of capture buffer   */
    60         
    61                         ||      "\d*"       /*  followed by any additional numeric digits -- these
    62                                                 are not in the capture buffer    */
    63         
    64                         ||  "/"         /*  regex  delimiter   */
    65         
    66                         ||      "$1"        /*  substitute the contents of the capture buffer   */
    67         
    
    68                         ||  "/"         /*  end of regex    */
    69                     );
    70         
    71                 Char_value = prxchange (regex_ID, 1, char_value);
    72         
    73                 Return (char_value);
    74         
    75             Endsub;
    76         
    77             Run;
    

    NOTE: Function trunc_regex saved to work.functions.truncs.
    NOTE: Function truncc saved to work.functions.truncs.
    NOTE: Function truncn saved to work.functions.truncs.
    NOTE: PROCEDURE FCMP used (Total process time):
          real time           0.07 seconds
          cpu time            0.07 seconds
          
    
    78         
    79         Options cmplib = functions;
    80         
    81         Data _null_;
    82         
    83             Input value;
    84         
    85             Truncn_result = truncn (value, 5);
    86             Truncc_result = truncc (value, 5);
    87             Regex_result = trunc_regex (value, 5);
    88         
    89             Put / truncn_result =;
    90             Put truncc_result =;
    91             Put regex_result =;
    92         
    93         Datalines;
    
    Truncn_result=3.14159
    Truncc_result=3.14159
    Regex_result=3.14159
    
    Truncn_result=-3.14159
    Truncc_result=-3.14159
    Regex_result=-3.14159
    
    Truncn_result=4.09999
    Truncc_result=4.1
    Regex_result=4.1
    
    Truncn_result=-4.43
    Truncc_result=-4.43
    Regex_result=-4.43
    
    Truncn_result=7
    Truncc_result= 
    Regex_result=7
    
    Truncn_result=-7
    Truncc_result= 
    Regex_result=-7
    
    Truncn_result=0.2
    Truncc_result=0.2
    Regex_result=0.2
    
    

    • Leonid Batkhan

      Thank you, Richard, for your detailed comment. Great points and great addition!

      In this blog post I focused on a narrower problem: truncating excessive decimal digits. You broadened it and suggested a robust and comprehensive solution. I think with truncn() function we can do it simpler though. If one needs to keep exact number of decimal places, then applying SAS w.d format in combination with and after truncn() function would do just that. For example, the following code produces n=200.0000:

      data _null_;
        x = 200;
        n = truncn(x,4);
        put n= 10.4;
      run;
      

  8. Leonid,

    Another great technical article! You make truncation look easy as PI:-)!

    I like Method 1 for truncating decimal values and plan to use it. I also like the idea of creating a user function. That way, the code to perform the truncation is in a single spot and does not need to be plopped into my code everywhere I need it.

    Looking forward to your next article!

Leave A Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Back to Top