Five constants every statistical programmer should know

Statistical programmers need access to numerical constants that help us to write robust and accurate programs. Specifically, it is necessary to know when it is safe to perform numerical operations, such as raising a number to a power, without exceeding the largest number that is representable in finite-precision arithmetic. This article discusses five constants that every statistical programmer must know: PI, MACEPS, EXACTINT, BIG, and SMALL. In the SAS language, you can find the values of these constants by using the CONSTANT function in Base SAS. The following program prints the values of the constants that are discussed in this article, along with two constants (LOGBIG and SQRTBIG) that are derived from BIG:
data Constants;
   length Const $10;
   format Value Best12.;
   input Const @@;
   Value = constant(Const);
datalines;
PI MACEPS EXACTINT 
BIG LOGBIG SQRTBIG SMALL
;
 
proc print noobs; run;

Pi and other mathematical constants

The CONSTANT function provides values for various mathematical constants. Of these, the most important constant is π ≈ 3.14159..., but you can also obtain other mathematical constants such as the natural base (e) and the golden ratio (φ). To get an approximation for π, use pi = constant('pi'). The number π is essential for working with angles and trigonometric functions. For an example, see the article, "Polygons, pi, and linear approximations," which uses π to create regular polygons.
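As a small illustration of how π enters this kind of computation, the following sketch prints the vertices of a regular polygon inscribed in the unit circle. The number of sides (n = 6) and the variable names are assumptions chosen for the example; they are not taken from the article that is cited above.

```sas
/* Sketch: vertices of a regular n-gon on the unit circle */
data _null_;
   pi = constant('pi');
   n = 6;                          /* hypothetical number of sides */
   do i = 0 to n-1;
      theta = 2*pi*i/n;            /* angle of the i_th vertex, in radians */
      x = cos(theta);
      y = sin(theta);
      put i= x= 8.4 y= 8.4;        /* write each vertex to the log */
   end;
run;
```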

Machine epsilon

Arguably the most important constant in computer science is machine epsilon. This number enables you to perform floating-point comparisons such as deciding whether two numbers are equal or determining the rank of a numerical matrix. To get machine epsilon, use eps = constant('maceps').
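A common use of machine epsilon is to build a tolerance for comparing two floating-point numbers. The following is a minimal sketch, not a universal recipe: the factor 10 and the choice of a relative tolerance are assumptions for this example, and the right tolerance depends on how the numbers were computed.

```sas
/* Sketch: compare two doubles by using a relative tolerance */
data _null_;
   eps = constant('maceps');
   x = 0.1 + 0.2;                  /* typically not stored as exactly 0.3 */
   y = 0.3;
   exactEqual = (x = y);           /* strict equality test */
   tol = 10 * eps * max(abs(x), abs(y));   /* factor 10 is arbitrary */
   approxEqual = (abs(x - y) <= tol);      /* equality within tolerance */
   put exactEqual= approxEqual=;
run;
```

The strict test is sensitive to rounding in the last bit, whereas the tolerance-based test treats values that differ by a few units of machine precision as equal.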

The largest representable integer

It is important to know the largest integer n such that every integer up to n can be represented exactly in finite precision on your machine. This number is available as bigInt = constant('exactint'). This number enables you to find the largest factorial that lies within the range of exactly representable integers (18!) and the largest row of Pascal's triangle (56) that you can faithfully represent in double precision.
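The claim about factorials can be checked with a short DATA step. This sketch assumes the Base SAS FACT function and simply compares 18! and 19! with EXACTINT; the variable names are hypothetical.

```sas
/* Sketch: which factorials lie within the range of exact integers? */
data _null_;
   format bigInt f18 f19 best20.;
   bigInt = constant('exactint');
   f18 = fact(18);
   f19 = fact(19);
   ok18 = (f18 <= bigInt);         /* 1 if 18! is within the exact range */
   ok19 = (f19 <= bigInt);         /* 1 if 19! is within the exact range */
   put bigInt= / f18= ok18= / f19= ok19=;
run;
```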

The largest representable double

Except for π, the constants I use the most are related to BIG ≈ 1.8E308. I primarily use LOGBIG and SQRTBIG, which you can compute as logbig = constant('logbig'); sqrtbig = constant('sqrtbig'); These constants are useful for preventing overflow when you perform arithmetic operations on large numbers:
  • The quantity exp(x) can only be computed when x is less than LOGBIG ≈ 709.78.
  • The LOGBIG option supports the BASE suboption (BASE > 1), which you can use to ensure that raising a number to a power does not overflow. For example, constant('logbig', 2) is approximately 1024: BIG is slightly less than 2**1024, so 2**1023 is the largest integer power of 2 that does not exceed BIG.
  • The SQRTBIG option tells you whether you can square a number without overflowing.
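The following sketch shows one way to use LOGBIG and SQRTBIG as guards before calling EXP or squaring a value. Assigning a missing value when the guard fails is an assumption made for this example; a real program might instead cap the value or write a message to the log.

```sas
/* Sketch: test arguments before exponentiating or squaring */
data SafeOps;
   logbig  = constant('logbig');
   sqrtbig = constant('sqrtbig');
   input x @@;
   if x < logbig then expx = exp(x);
   else expx = .;                  /* exp(x) would overflow */
   if abs(x) < sqrtbig then xsq = x**2;
   else xsq = .;                   /* x**2 would overflow */
   drop logbig sqrtbig;
datalines;
600 700 800 1e200
;

proc print data=SafeOps noobs; run;
```

Because the arguments are tested first, the DATA step never attempts the operations that would overflow.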

The smallest representable double

What is the smallest positive floating-point number on your computer? It is given by small = constant('small'). There are also LOGSMALL and SQRTSMALL versions, which you can use to prevent underflow. I don't use these constants as frequently as their 'BIG' counterparts. In my experience, underflow is usually not a problem in SAS.
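If you do want to guard against underflow, LOGSMALL can be used in the same way that LOGBIG guards against overflow. In this sketch, replacing the result with 0 when x is below LOGSMALL is an assumption made for illustration; whether that is appropriate depends on the application.

```sas
/* Sketch: test the argument before exp(x) can underflow */
data _null_;
   logsmall = constant('logsmall');
   do x = -700, -710, -800;
      if x > logsmall then y = exp(x);
      else y = 0;                  /* result is smaller than SMALL; treat as 0 */
      put x= y=;
   end;
run;
```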

Summary

This article discusses five constants that every statistical programmer must know: PI, MACEPS, EXACTINT, BIG, and SMALL. Whether you need mathematical constants (such as π) will depend on the programs that you write. The MACEPS constant is used to compare floating-point numbers. The other constants are used by computer scientists and numerical analysts to ensure that programs can correctly compute with very large (or very small) numbers without encountering floating-point overflow (or underflow).

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

7 Comments

    • Rick Wicklin

      You probably meant the question as a joke, but statistical programmers rarely need the value of e itself. Rather, the constant e shows its importance every time you use the EXP and LOG functions.

  1. Warren F Kuhfeld on

    Nice post as always, Rick! During my days as a SAS/STAT developer, I used many of the constants that the CONSTANT function provides. Of course, as a developer, we accessed them in a different way. Perhaps you could convince our friends in BASE/SAS documentation to provide a program that provides an easy way for users to see all of the constants that the function provides since they can't all be documented because values can radically change on different systems.

    options missing= ' ';
    data Constants;
       length Const $16;
       format Value Base Best16.;
       input Const @@;
       if index(Const, "EXACT") then do;
          do Bytes = 2 to 8;
             Value = constant(Const, bytes);
             output;
          end;
       end;
       else if index(Const, "LOG") then do;
          do base = 2, constant('E'), 10;
             Value = constant(Const, base);
             output;
          end;
       end;
       else do;
          Value = constant(Const);
          output;
       end;
    datalines;
    E EULER GOLDEN PI EXACTINT BIG BIGRECIP LOGBIG LOGBIGRECIP SQRTBIG SMALL
    SMALLRECIP LOGSMALL LOGSMALLRECIP SQRTSMALL MACEPS LOGMACEPS SQRTMACEPS
    ;

    proc print noobs; run;

    • Rick Wicklin

      Very nice! I think that most SAS programmers are now using SAS on Linux or 64-bit Windows, so the constants are now the same for most systems. But, as you know, in the old days we had some hosts where, for example, BIG was much less than 1e308.

  2. Peter Lancashire on

    Thanks for the information. Is it not the job of the SAS developers to make SAS behave gracefully when one of these limits interferes with calculations? In many cases setting a result to missing and a status code to explain what happened can save a lot of defensive programming. That is what modern modular software development should bring us. Sadly, the error trapping and recovery mechanisms in SAS are primitive. Having try - catch - finally blocks would be a very welcome improvement.

    • Rick Wicklin

      Thanks for your comments. Yes, in many cases SAS returns a missing value when it encounters an overflow (or underflow). For example, the following program returns a missing value for EXP(800) and writes a note to the log:
      NOTE: Invalid argument to function EXP(800) at line NNN column 7.

      data Compute;
         input x @@;
         exp = exp(x);
      datalines;
      600 700 800
      ;
      proc print data=Compute;
      run;

      However, sometimes programmers want to trap invalid input arguments BEFORE they operate on them.
