No matter what statistical programming language you use, be careful of testing for an exact value of a floating-point number. This is known in the world of numerical analysis as "10.0 times 0.1 is hardly ever 1.0" (Kernighan and Plauger, 1974, *The Elements of Programming Style*).

There are many examples of arithmetic operations with finite precision numbers that can serve as cautionary tales, but the one that I saw recently was a DO loop in SAS that looked something like the following:

data A; x = 0; do i = 1 to 20 until(x=1); /* sometimes WRONG */ x = x + 0.1; output; end; run; |

Of course, the original code didn't have a comment that shows where the error is!
The programmer clearly intends for the loop to exit after 10 iterations when `x` is 1. However, because of finite precision arithmetic and the fact that `0.1` is not representable in binary, the value of `x` is never exactly equal to 1.

Practice "defensive programming": don't use exact comparisons in IF-THEN, DO-UNTIL, or DO-WHILE statements. A better way to code the previous loop is to use an inequality. In this example, the iteration could stop when `x` is greater than 1 – 0.1 / 2 = 0.95. If you don't know the exact amount that `x` will be incremented, use some quantity related to "machine epsilon," as follows:

data A; x = 0; eps = constant("SQRTMACEPS"); /* or 100*constant("MACEPS") ... */ do i = 1 to 20 until(x > 1-eps); x = x + 0.1; output; end; run; |

As I've said, these examples are ubiquitous in scientific programming, and this is a popular topic on "tip of the day" services such as @RLangTip on Twitter. So regardless of the programming language that you use, avoid the exact comparison of floating-point numbers.

## 9 Comments

eps = constant("MACEPS") works. Why would you use 100*constant("MACEPS")? I'm guessing it depends on the physical machine / OS?

The constant you should use depends on how close the approximate and exact answers should be before you declare "success" for the numerical approximation. For a simple addition, MACEPS will often work. For something like a numerical solution of a system of linear equations, A*x=b, you might want a factor of 1000*MACEPS or, even better, a factor of k*MACEPS where k depends on the condition number of the matrix A. There is no single constant that will work for all situations.

Hi Rick,

I am struggling with a floating point issue right now..I am trying to flag a set of records using their meidan value as the cutoff and one record which has exact value as the median is falling in Greater than median value bucket..to understand what is going on, I tried to do the difference betwen the two vars and am getting it as -2.22045E-16..

Thanks,

R

If the median values are random (like in a simulation), it won't matter where any one value falls, but you can prevent the situation you describe by using the ideas in this blog post. Instead of comparing

if x < value then...

you can compare

if x < value+epsilon then...

where epsilon is a suitable small number.

Thank you! it worked!!

Pingback: Define functions with default parameter values in SAS/IML - The DO Loop

Pingback: Vectors that have a fractional number of elements

Hi Rick ,

am struggling with my code i want to avoid compare floating point numbers ,but my code double am using here floating point number through am comparing ,In place of floating point number i will use integer number point any problem can u please reply

I suggest you post an example (data and sample code) to a SAS Support Community.