Computing 200 factorial: Using PROC IML as a really BIG calculator

10

Have you ever wanted to compute the exact value of a really big number such as 200! = 200*199*...*2*1? You can do it—if you're willing to put forth some programming effort. This blog post shows you how.

Jiangtang Hu's recent blog discusses his quest to compute large factorials in many programming languages. As he discovered, many languages that use standard double precision arithmetic do not compute n! for large n, whereas other software, such as Mathematica, succeed.

The reason is simple: 171! is greater than 1.79769e+308, which on many computers is the largest floating point number that can be represented in eight bytes. For example, the following statements print the maximum double precision value on my Windows PC:

proc iml;
MaxBig = constant("big");
print MaxBig;

Because of the limitation of double precision arithmetic, floating point computations can result in a run-time error. For example, the following factorial computation overflows in SAS software:

f = fact(200);  /** numerical overflow **/

Of course, there are ways around this limitation. The reason that software such as Maple and Mathematica succeed is that they implement special algorithms that enable them to compute results to arbitrary precision.

Computing something to "arbitrary precision" sounds like magic, but it's not. For example, I can use the SAS/IML language (or the DATA step, or C, or many other languages) to write an algorithm that computes the factorial of fairly large numbers. The following algorithm is naive and inefficient, but it shows how you can implement an arbitrary precision algorithm by storing the result in a numerical array instead of in a double precision variable.

An Algorithm for Computing the Factorial of Large Numbers

Again, I emphasize that this algorithm is for illustration purposes and is not intended for "real" work. Given an integer, n, here's how you can compute the exact value of f = n!. The key is to store each digit of the result in a numerical array.

  1. Compute the number of digits of f = n!. The number of digits is related to the logarithm (base 10) of f, so you can use the LFACT function. The LFACT function returns the natural logarithm of the f, but you can convert that number to the base-10 logarithm.
  2. Allocate an array with the correct number of digits. The ith element of this array will hold the ith digit of the answer. In other words, if the ith digit is d, the digit represents the value d x 10i.
  3. Initialize the result to 1.
  4. For each integer 2, 3, ..., n, multiply the current result by the next factor. This is done place-value-by-place-value, and results that are greater than 10 are carried over to the next place value.
  5. After all the multiplications, the array f has the value of n!. The numbers in this array need to be reversed, because, for example, the array {0 2 1} represents the number 120.
start BigFactorial(n);
 numDigits = ceil(lfact(n)/log(10)); /** 1 **/
 f = j(1,numDigits,0);               /** 2 **/
 f[1] = 1;                           /** 3 **/
 do i = 2 to n;                      /** 4 **/
   carry = 0;
   digits = ceil(lfact(i)/log(10)); /** how many digits in the answer so far? **/
   do j = 1 to digits ;             /** multiply the product by next factor **/
     g = i*f[j] + carry;            /** j_th place value **/
     digit = mod(g, 10); 
     f[j] = digit;                  /** put value in the j_th place **/
     carry = (g - digit)/10;        /** carry the rest to the (j+1)_th place **/
   end;
 end;
 return( f[,ncol(f):1] );           /** 5. reverse the digits **/
finish;

Can this module really compute the factorial of large numbers? Yep. You can compute 200! and see that the exact answer contains 375 digits. The following statements compute the factorial and print the result:

f = BigFactorial(200);  /** compute 200! **/
/** convert answer from a numerical array to a character string **/
Fact200 = rowcat(char(f,1)); 
 
numDigits200 = nleng(Fact200); /** how many digits in the answer? **/
print numDigits200;
print Fact200;
                                  numDigits200
 
                                       375
 
 
                                     Fact200
 
     788657867364790503552363213932185062295135977687173263294742533244359449963
     403342920304284011984623904177212138919638830257642790242637105061926624952
     829931113462857270763317237396988943922445621451664240254033291864131227428
     294853277524242407573903240321257405579568660226031904170324062351700858796
     178922222789623703897374720000000000000000000000000000000000000000000000000

Obviously, I have entirely too much time on my hands, but does this example demonstrate anything else? It demonstrates that programming with arrays enables you to implement a wide variety of algorithms. Perhaps the example also demystifies software that evaluates expressions to "arbitrary precision."

Mostly, though, I did it because I thought it would be fun. I was right. And next time I'm at a party, I might casually sidle next to someone and mention that 1,000! has 2,568 digits. I'll let you know their response.

Share

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

10 Comments

  1. What a quick and coool work Rick!

    It is interesting to see how to use SAS to get 200!. Actually, my titled "too big to be accurate" post
    (series) will discuss how to implement approximation methods to get these big numbers.

  2. Chris Hemedinger on

    A few years ago there was big news in the mathematical world when someone computed the value of pi to a record number of digits (again). I copied the value (in text format) to a USB thumb drive and had it with me during a cocktail party (really!).

    Lesson: "Hey, I've got the most precise calculation for pi here in my pocket" is not as impressive an icebreaker as you might think.

  3. Wonderful.

    My version of iml doesn't have the LFACT finction , so I used
    log(fact(n)). However, i couldn't get past 170!

    At 180!, my saslog said,

    802 f = BigF(180);
    ERROR: (execution) Invalid argument to function.
    . . . .

    NOTE: Exiting IML.
    NOTE: 8 workspace compresses.
    NOTE: PROCEDURE IML used (Total process time):
    real time 0.04 seconds
    cpu time 0.04 seconds

    So I guess I hit the limit of my laptop.

    Al

  4. It's not a laptop limit. I know LFACT is in SAS 9.2, so you must have an earlier version of SAS that doesn't contain LFACT. No problem. Since n! is a product, log(n!) is just a sum. You can write your own LFACT function, which I'll call LogFact in order to avoid confusion:

    start LogFact(n);
    return( sum(log(1:n)) );
    finish;

    Define and use the LogFact function instead of LFACT in the algorithm and you're good to go!

  5. How can I convert the result (FACT200) to the scientific notation to use it in calculations in the same program?

    thanks

    • Rick Wicklin

      I suppose this question is releated to the question you posted on the SAS Support Communities. The Support Communites are the right place to get an answer. I haven't weighed in because you've never said what you intend to do with this number. Factor it? Divide by it? Unless you are a researcher in number theory, numbers of this size usually indicate that you should change to a log scale. Statisticians use quantities like log-likelihood and log-determinants because numerical computations that overflow on the original scale succeed on a log scale.

  6. Michael Elliott on

    Have a question: Trying to compute the actual odds in the old, "infinite number of monkeys" saw. Given that the "average" Shakespearian work contains 205,679 letters or spaces for letters, and the standard keyboard gives you 93 options for filling them, what is 93X93X93...for 205,679 times?

    • Rick Wicklin

      93 raised to the 205,679th power is approximately the same as 10 raised to the 404,875th power, which is the number 1 followed by 404,875 zeros.
      In comparison, the age of the observable universe in seconds is about 4 x 10^17.

      The average person can write 2 or 3 zeros per second, meaning that this number would require between 37 and 56 hours to write it out. Another way to see how big it is: I can write about 40 zeros on a line of college-rule loose leaf paper and there are 33 lines on a page. This number would require 307 pages of loose-leaf paper to write it down.

  7. Pingback: On the number of bootstrap samples - The DO Loop

Leave A Reply

Back to Top