Programming tip: Avoid testing floating-point values for equality

11

No matter what statistical programming language you use, be careful of testing for an exact value of a floating-point number. This is known in the world of numerical analysis as "10.0 times 0.1 is hardly ever 1.0" (Kernighan and Plauger, 1974, The Elements of Programming Style).

There are many examples of arithmetic operations with finite precision numbers that can serve as cautionary tales, but the one that I saw recently was a DO loop in SAS that looked something like the following:

data A;
x = 0;
do i = 1 to 20 until(x=1); /* sometimes WRONG */
   x = x + 0.1;
   output;
end;
run;

Of course, the original code didn't have a comment that shows where the error is! The programmer clearly intends for the loop to exit after 10 iterations when x is 1. However, because of finite precision arithmetic and the fact that 0.1 is not representable in binary, the value of x is never exactly equal to 1.

Practice "defensive programming": don't use exact comparisons in IF-THEN, DO-UNTIL, or DO-WHILE statements. A better way to code the previous loop is to use an inequality. In this example, the iteration could stop when x is greater than 1 – 0.1 / 2 = 0.95. If you don't know the exact amount that x will be incremented, use some quantity related to "machine epsilon," as follows:

data A;
x = 0;
eps = constant("SQRTMACEPS"); /* or 100*constant("MACEPS") ... */
do i = 1 to 20 until(x > 1-eps);
   x = x + 0.1;
   output;
end;
run;

As I've said, these examples are ubiquitous in scientific programming, and this is a popular topic on "tip of the day" services such as @RLangTip on Twitter. So regardless of the programming language that you use, avoid the exact comparison of floating-point numbers.

Share

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

11 Comments

  1. eps = constant("MACEPS") works. Why would you use 100*constant("MACEPS")? I'm guessing it depends on the physical machine / OS?

    • Rick Wicklin

      The constant you should use depends on how close the approximate and exact answers should be before you declare "success" for the numerical approximation. For a simple addition, MACEPS will often work. For something like a numerical solution of a system of linear equations, A*x=b, you might want a factor of 1000*MACEPS or, even better, a factor of k*MACEPS where k depends on the condition number of the matrix A. There is no single constant that will work for all situations.

  2. Hi Rick,
    I am struggling with a floating point issue right now..I am trying to flag a set of records using their meidan value as the cutoff and one record which has exact value as the median is falling in Greater than median value bucket..to understand what is going on, I tried to do the difference betwen the two vars and am getting it as -2.22045E-16..
    Thanks,
    R

    • Rick Wicklin

      If the median values are random (like in a simulation), it won't matter where any one value falls, but you can prevent the situation you describe by using the ideas in this blog post. Instead of comparing
      if x < value then...
      you can compare
      if x < value+epsilon then...
      where epsilon is a suitable small number.

  3. Pingback: Define functions with default parameter values in SAS/IML - The DO Loop

  4. Pingback: Vectors that have a fractional number of elements

  5. Hi Rick ,
    am struggling with my code i want to avoid compare floating point numbers ,but my code double am using here floating point number through am comparing ,In place of floating point number i will use integer number point any problem can u please reply

  6. Peter Lancashire on

    Thanks for this post, which has just solved my problem, which is to set the SINGULAR value in PROC MIXED to something a bit smaller.. I tried to find this in the SAS documentation but failed. "Machine epsilon" is mentioned many times in the statistical procedures, but searching in the documentation for "machine epsilon" does not find the CONSTANT function because it is described differently there, as "Machine precision constant". The parameter is 'MACEPS', so whoever wrote the page must know. I find this deeply frustrating. Please improve the documentation.

Leave A Reply

Back to Top