The IFN function versus the IF-THEN/ELSE statement in SAS


I have previously discussed how to define functions that safely evaluate their arguments and return a missing value if the argument is not in the domain of the function. The canonical example is the LOG function, which is defined only for positive arguments. For example, to evaluate the LOG function on a sequence of (possibly non-positive) values, you can use the following IF-THEN/ELSE logic:

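data Log;
input x @@;
if x > 0 then 
   logX = log(x);   /* evaluated only when x is positive */
else 
   logX = .;        /* nonpositive or missing x: assign a missing value */
datalines;
-1 1 2 . 0 10
;

proc print data=Log; run;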

On SAS discussion forums, I sometimes see questions from people who try to use the IFN function to accomplish the same logic. That is, in place of the IF-THEN/ELSE logic, they try to use the following one-liner in the DATA step:

logX = ifn(x>0, log(x), .);

Although this looks like the same logic, there is a subtle difference. All three arguments to the IFN function are evaluated BEFORE the function is called, and the results of the evaluation are then passed to the function. For example, if x= -1, then the SAS DATA step does the following:

  1. Evaluate the Boolean expression x>0. When x= -1, the expression evaluates to 0.
  2. Evaluate the second argument log(x). This happens regardless of the result of evaluating the first expression. When x= -1, the expression log(x) is invalid and the SAS log will report
    NOTE: Invalid argument to function LOG
    NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.
  3. Call the IFN function as logX = IFN(0, ., .), which results in assigning a missing value to logX.

In Step 2, SAS evaluates log(x) unconditionally for every value of x, which leads to out-of-domain errors when x is not positive. This is exactly the situation that the programmer was trying to avoid! In contrast, the IF-THEN/ELSE logic only evaluates log(x) when x is positive. Consequently, the SAS log is clean when you use the IF-THEN/ELSE statement.
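
To see the difference for yourself, the following sketch runs the IFN one-liner on the same data as the IF-THEN/ELSE example. The data set name LogIFN is chosen here only for illustration. The values of logX should match the IF-THEN/ELSE version, but the SAS log is expected to contain "Invalid argument to function LOG" notes for x = -1 and x = 0, and possibly a note about missing values for the missing x.

```sas
data LogIFN;                       /* hypothetical data set name */
input x @@;
/* log(x) is evaluated for EVERY x before IFN is called */
logX = ifn(x>0, log(x), .);
datalines;
-1 1 2 . 0 10
;

proc print data=LogIFN; run;
```

Compare the SAS log from this step with the log from the IF-THEN/ELSE step; the printed output alone does not reveal the difference.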

There are plenty of situations for which the IFN function (and its cousin, the IFC function) is useful, but for testing out-of-domain conditions, use IF-THEN/ELSE logic instead.
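
As a minimal sketch of a case where IFN and IFC are a reasonable fit, the following DATA step uses them when every argument is safe to evaluate for any value of x. The data set name Clip and the variable names capped and Category are hypothetical and chosen only for illustration.

```sas
data Clip;                                   /* hypothetical data set name */
length Category $8;                          /* declare the length explicitly for the IFC result */
input x @@;
capped   = ifn(x > 5, 5, x);                 /* both candidate values are valid for every x */
Category = ifc(x > 0, "Positive", "Other");  /* string constants are always safe to evaluate */
datalines;
-1 1 2 . 0 10
;

proc print data=Clip; run;
```

Because neither candidate value calls a function with a restricted domain, evaluating all arguments up front should not produce domain notes in the SAS log.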


About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of PROC IML and SAS/IML Studio. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

9 Comments

  1. Isn’t this a bug in the IFN function? Why does it evaluate the second condition when the first one is met? Wouldn't fixing the IFN function make more sense than dancing around its quirky behavior?

    • Rick Wicklin

      It is not a bug. When you call ANY function in SAS, ALL arguments are evaluated (if necessary). Then those numbers or character values are passed to the function. That is the way functions work. The IFN function doesn't get called until AFTER the logical statement (x>0) and the LOG function are evaluated.

      In the SAS/IML language, the CHOOSE function behaves similarly. See my 2011 blog post on the CHOOSE function.

  2. Peter Lancashire on

    This is normal behaviour for functions in most programming languages. That is why C has the ternary operator whose definition is that only one of the expressions is evaluated.

    It looks like this: logX = x>0 ? log(x) : .;

    How about extending SAS for all us C (and many other languages) programmers?

  3. Tim Simmons on

    In this case, at the cost of a second IFN, you could replace x in the log with something like ifn(x > 0, x, 1) so that the log never sees anything it can't use.

    • Rick Wicklin

      True. The most compact inline solution is probably logX = ifn(x>0, log(abs(x)), .), which results in a clean SAS Log that does not contain any notes about domain errors. However, the IF-THEN/ELSE statement is more efficient and easier to understand.

      • Excellent explanation of why not to use IFN as an attempt to avoid out-of-domain errors.

        In the interest of completeness, it seems the logX = ifn(x>0, log(abs(x)), .) approach could still pose an issue for the special case of x = 0. We could use a second IFN to get around that snag, such as logX = ifn(x>0, log(ifn(x>0, x, 1)), .), but this just reinforces how much more efficient and readable the IF-THEN/ELSE alternative is.

  4. Thanks, Rick. I've just reviewed the IFN and IFC functions. I like your final solution, because avoiding 'normally-occurring' errors and warnings is good.

  5. Edward Ballard on

    And here I didn't like IFC and IFN just because it looked too much like the Excel IF statement (which I have hated since the first time I was forced to use it). Nice to know that I have a slightly more valid reason other than readable code to avoid it.

    And the log(abs(x)) only solves one specific type of runtime error. CDF, INVCDF, PDF, and other functions have their own issues with differing valid ranges.
