Imagine making $50K a day out of thin air. Did you know that NASDAQ routinely processes around 10,000,000 trades a day? What if instead of rounding cents for each transaction, market makers truncated fractions of cents in the amount they owe you? Under the assumption that each transaction, on average, has half a cent that is usually rounded away, this would produce 10,000,000 x $0.005 = $50,000 and nobody would even notice it. I am not saying it's legal, but this example is just an illustration of the power of ordinary truncation.
Handpicked Related Content: Truncating vs rounding SAS time valuesHowever, sometimes it is necessary to truncate displayed numeric values to a specified number of decimal places without rounding. For example, if we need to truncate 3.1415926 to 4 decimal places without rounding, the displayed number would be 3.1415 (as compared to the rounded number, 3.1416).
If you think you can truncate numeric values by applying SAS w.d format, think again.
Try running this SAS code:
data _null_; x=3.1415926; put x= 6.4; run; |
If you expect to get x=3.1415, you will be disappointed. Surprisingly, you will get x=3.1416, which means that SAS format does not truncate the number, but rounds it. Same is true for the DOLLARw.d and COMMAw.d formats.
After running into this problem, I thought to instead use a SAS function to truncate numbers. The TRUNC function comes to mind. Indeed, if you look up the SAS TRUNC function, you will find that it does truncate numeric values, but (surprise!) not to a specified number of decimal places; rather it truncates to a specified number of bytes, which is not the same thing for numerics. This may be useful for evaluating the precision of numeric values, but has no direct bearing on our problem of truncating numeric values to a set number of decimal places.
What turned to be interesting is that the Excel TRUNC function does exactly what we need – it truncates values to a set number of decimal places (removes decimals) without rounding:
In general, the technique of number truncation should be limited to reporting purposes when displayed numbers are required to appear truncated. Be careful not to apply truncation before using the numbers in calculations, as you might get some seriously inaccurate results, even worse than when you round numbers before calculations. Unless, of course, your goal is to get inaccurate results, which is quite an honorable goal in fraud detection, simulation and investigation.
I can see two possible solutions to number truncations:
Solution 1: Numeric truncation
Let’s say we need to truncate the following number X.XXXXXXX , keeping only the red digits (that is get rid of all decimal digits after the 4th decimal place).
We can do it in 3 steps:
- Multiply our number by 104, effectively making the decimals part of a whole number (shifting the decimal point 4 positions to the right).
- Apply INT() function that truncates the decimal portion, keeping only the whole portion from the previous step.
- Divide the result of step 2 by 104, effectively restoring the order disturbed by step 1 (shifting the decimal point 4 positions to the left).
Here is SAS code implementing this algorithm:
%let d = 4; /* d must be a whole number: 0, 1, 2... */ data _null_; x = 3.1415926; p = 10**&d; y = int(x*p)/p; put x= / y=; run; |
If we run this code SAS log will show the following (expected and desired) results:
x=3.1415926
y=3.1415
WARNING: While in the SAS code example above the int() function might be substituted with the floor() function, for negative numbers the floor() function would produce incorrect results. For negative numbers, the ceil() function is the correct choice. However, the int() function does exactly what we need for both positive and negative numbers.
Solution 2: Character truncation
We can also solve our truncation problem by converting numeric value into character, and then use character functions to get rid of extra digits. The result can be converted back into numeric if necessary. Let’s solve the same problem, to truncate the following number X.XXXXXXX keeping only the red digits.
Using this character approach we can also do it in 3 steps:
- Convert the numeric value into a character value and assign it to a new character variable.
- Determine position of a decimal point in the character string.
- Sub-string our initial character string to keep only 4 characters after decimal point.
Here is SAS code implementing this algorithm:
%let d = 4; /* d must be a whole number: 0, 1, 2... */ data _null_; x = 3.1415926; y = put(x,best.); y = substr(y,1,find(y,'.')+&d); put x= / y=; run; |
If we run this code SAS log will show the following results:
x=3.1415926
y=3.1415
As you can see, these results are correct and identical to the results of numeric truncation.
Both numeric and character truncation methods work for positive and negative numbers.
User-defined functions
We can also implement the above two methods as user-defined functions, say truncn() and truncc(), using PROC FCMP:
proc fcmp outlib = sasuser.functions.truncations; function truncn (x,d); p = 10**d; y = int(x*p)/p; return(y); endsub; function truncc (x,d); y = put(x,best.); y = substr(y,1,find(y,'.')+d); return(y); endsub; run; |
Then we can use those user-defined functions truncn() and truncc() as illustrated in the following SAS code sample:
options cmplib=sasuser.functions; data A; length x n 8 c $9; input x; n = truncn(x,4); c = truncc(x,4); datalines; 3.1415926 -3.1415926 ; |
This code will produce the following A dataset:
Notice that variables x and n are of numeric type while variable c is of character type.
Questions
- Which of these two methods of decimal number truncation do you like more? Why?
- Does it make sense to use these methods as user-defined functions? Why?
28 Comments
Hello,
I have an age value for ~ 3200 different individuals. I've converted the age from months to years which leads to having most individuals with an age in yrs with a decimal point.
ex. 20.33, 32.234, 23.553, 44.224 etc...
Is it possible to truncate all of the values for this variable?
ex. 20.331 –> 20, 32.234 –> 32, 23.553 –> 23, 44.224 –> 44 etc...
Yes, Gabriel, you can do it with the methods described in this post. Just define parameter d=0. However, since you need to truncate all decimals to arrive to a whole number and all your numbers are positive, you can simply use SAS' FLOOR( ) function which does exactly what you need.
I'll be honest. I seem to have a bug here.
I have a variable value of a=629.44 (even if displayed with a format of 30.15.) When I do term1a=int(a*10**2) I get 62939. Its making no sense to me at all.
Oops...let me correct. I get a value of term1a=(int(a*10**2) to be 62943. Had a typo in my original response. I cannot figure out why this would be happening.
Hi George,
I was not able to reproduce your problem. Here is the code that I ran:
I am getting the following results in the SAS log:
Could you please share your SAS code and information on the system you are running SAS on?
I am running SAS 9.4 on a windows 10 platform. I can't really reproduce the code with a simple data _null_ unfortunately as its with an imported dataset.
But here is the code I ran demonstrating this:
I have zero idea why the heck this would be happening. If I look at the data in whatever format.
I did fix it THIS way for my application to work (essentially I had to match calculated values to "hard-coded" variables from a different dataset that only has two decimals:
So I had to combine your techniques for this to work for me. Weird right?
Hi George, I guess the value has precision issue in the data source or during your importing process. You can try applying fuzz() function to your imported data to reduce some noise of the data precision variation.
Nice One Leonid..
0.99939
0.345
1.893
5.732
9.19832
i want to print 0.9, 0.3,1.8,5.7,9.1
This is exactly what this blog post talks about. Try using one of the described here methods (numeric truncation or character truncation), where your number of shown decimals d=1. For example, here is the code snippet using numeric truncation (I inserted your numbers after the datalines statement):
If you run this code in SAS, you will get the following in the SAS log (y values are your desired truncated numbers):
The truncation and rounding function are just some toolings.
When processing a lot of numbers the real practical issue is caused by not understanding the effects precison losing some digits by accident.
For science where that difference doesn't matter, who cares. For accounting and bookkeeping (the common BI usage) really problematic as not being accepted by intended customers.
Any way to overcome those?
Ja, I don't quite understand your question. To overcome what? Truncation and rounding that I am talking about in this blog post are just are two different ways of representing output results; they have nothing to do with losing precision due to finite computer digit representation.
Thank you for sharing. We have been living with rounding when we actually desired truncating.
Now, how can we truncate time intervals? For instance, we have 07:30:35, but we want to show 07:30?
Similarly, how can we round 07:50:20 downward to, say, 07:30 (that is, the half-hour interval)?
Perry, you can truncate time values by applying TIMEw.d format - see my blog post Truncating vs rounding SAS time values.
As for rounding time values to various rounding points, see this SAS code Sample 31109: Unusual Uses of the Round Function.
I stand corrected. You need at least one or more spaces before the first digit selector for the negative value PREFIX to show in the output (truncated or rounded). Here is the modified code with 4 spaces before the fist digit selector for negative values. Results, by default are truncated.
proc format;
picture trunc low - < 0 = ' 9.9999' (prefix='-')
0 - high = '9.9999'
;
data _null_;
x=3.1415926;
y=-3.1415926;
put y= trunc7.4 / x= trunc7.4;
run;
y=-3.1415
x=3.1415
Note that I modified the format call as trunc7.4 but that is optional (works with trunc.).
Even after writing "The Power Of PROC FORMAT" book years ago I still am learning new stuff on PICTURE formats.
Jonas, thank you for the correction. Agree, SAS formats are very powerful, and I also keep learning various applications of the SAS formats. You may take a look at my earlier post Automatic data suppression in SAS reports for some creative usages of the SAS formats.
You can try to use a picture format which, by default, truncates values.
proc format;
picture trunc low - < 0 = ' 9.9999' (prefix='-')
0 - high = ' 9.9999'
;
run;
data _null_;
x=3.1415926;
y=-1*x;
put x= 6.4
/
x=trunc.
/
y=7.4
/
y=trunc.
;
run;
x=3.1416
x=3.1415
y=-3.1416
y=-3.1415
Interesting that I needed a space after the first quote in the set up of digit selectorsto get the negative value displayed. To round results use the option (ROUND) after the name of the PICTURE FORMAT which did not require the leading space in SAS 9.2 on UNIX.
Thank you, Jonas, for your alternative solution. The picture format works, but it's not quite flexible. One would have to define multiple picture formats for each number of decimal places to keep. In your example you used 9.9999 which only takes care of 4 decimal places. If you need 3 or 5, you would need to create other picture formats using 9.999 and 9.99999 respectively. What I am describing in this blog post has the number of decimal places as a parameter, so no re-programming needed for any number of decimal places to keep.
You are correct on that.
Very good material! I would lean towards the first one (truncn) because it is more robust to handling numbers that have no decimal points. When truncc encounters an integer, it returns a blank. However, in truncn, when the value of d (number of decimal points) is moderately large (e.g. 5), then because truncn multiplies and then divides by a larger multiplicand and divisor (e.g. 10*5), the precision of the result returned by truncn is compromised (e.g., note the value 4.09999 returned by truncn in the example below). A third approach is to convert the incoming numeric value to a character string, as in truncc, but then parse it with a regular expression that allows for the original value to be returned if no decimal point is found, i.e.
Here is an example comparing all three methods (with the parts of the regex elucidated):
Thank you, Richard, for your detailed comment. Great points and great addition!
In this blog post I focused on a narrower problem: truncating excessive decimal digits. You broadened it and suggested a robust and comprehensive solution. I think with truncn() function we can do it simpler though. If one needs to keep exact number of decimal places, then applying SAS w.d format in combination with and after truncn() function would do just that. For example, the following code produces n=200.0000:
Leonid,
Another great technical article! You make truncation look easy as PI:-)!
I like Method 1 for truncating decimal values and plan to use it. I also like the idea of creating a user function. That way, the code to perform the truncation is in a single spot and does not need to be plopped into my code everywhere I need it.
Looking forward to your next article!
Thank you, Michael, your comments are always greatly appreciated. Looking forward to reading your new book It Only Hurts When I Hit <ENTER>.
For more information about SAS rounding functions, see the article "Rounding Up, rounding down," which includes a discussion of rounding towards/away from zero.
Thank you, Rick, for your complementary reference.
Hi, how to get the decimal values separately as an output? For example:12.3456
How to get only 3456 as output?
Hi Sowmya,
You can easily extract decimal digits, for example:
Thanku so much