I have previously discussed how to define functions that safely evaluate their arguments and return a missing value if the argument is not in the domain of the function. The canonical example is the LOG function, which is defined only for positive arguments. For example, to evaluate the LOG function on a sequence of (possibly non-positive) values, you can use the following IF-THEN/ELSE logic:
data Log;
input x @@;
if x > 0 then logX = log(x);
else logX = .;
datalines;
-1 1 2 . 0 10
;
proc print; run;
On SAS discussion forums, I sometimes see questions from people who try to use the IFN function to accomplish the same logic. That is, in place of the IF-THEN/ELSE logic, they try to use the following one-liner in the DATA step:
logX = ifn(x>0, log(x), .);
Although this looks like the same logic, there is a subtle difference. All three arguments to the IFN function are evaluated BEFORE the function is called, and the results of the evaluation are then passed to the function. For example, if x= -1, then the SAS DATA step does the following:
1. Evaluate the Boolean expression x>0. When x= -1, the expression evaluates to 0.
2. Evaluate the second argument log(x). This happens regardless of the result of evaluating the first expression. When x= -1, the expression log(x) is invalid and the SAS log will report
   NOTE: Invalid argument to function LOG
   NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.
3. Call the IFN function as logX = IFN(0, ., .), which results in assigning a missing value to logX.
In Step 2, SAS evaluates log(x) unconditionally for every value of x, which leads to out-of-domain errors when x is not positive. This is exactly the situation that the programmer was trying to avoid! In contrast, the IF-THEN/ELSE logic only evaluates log(x) when x is positive. Consequently, the SAS log is clean when you use the IF-THEN/ELSE statement.
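For comparison, here is a sketch of the IFN one-liner placed in a complete DATA step that reads the same data as the first example. The data set name LogIFN is arbitrary. When you run it, logX should contain the same values as before, but the SAS log should now include the invalid-argument note for the nonpositive values -1 and 0.

```sas
data LogIFN;
input x @@;
/* All three arguments are evaluated before IFN is called,
   so log(x) runs even when x <= 0 */
logX = ifn(x>0, log(x), .);
datalines;
-1 1 2 . 0 10
;
proc print; run;
```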
There are plenty of situations for which the IFN function (and its cousin, the IFC function) are useful, but for testing out-of-domain conditions, use IF-THEN/ELSE logic instead.
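As a minimal sketch of a situation where IFN and IFC are safe, the following DATA step passes only constants and existing variables as the candidate values, so evaluating every argument up front cannot trigger a domain error. The data set name (Labels) and the variable names (capped, group) are made up for this illustration.

```sas
data Labels;
length group $ 8;
input x @@;
/* Both candidate values are constants or existing variables,
   so evaluating them eagerly is harmless */
capped = ifn(x > 5, 5, x);                 /* cap values at 5 */
group  = ifc(x > 0, 'positive', 'other');  /* character result */
datalines;
-1 1 2 0 10
;
proc print; run;
```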
13 Comments
Isn’t this a bug in the IFN function? Why does it evaluate the second condition when the first one is met? Wouldn't fixing the IFN function make more sense than dancing around its quirky behavior?
It is not a bug. When you call ANY function in SAS, ALL arguments are evaluated (if necessary). Then those numbers or character values are passed to the function. That is the way functions work. The IFN function doesn't get called until AFTER the logical statement (x>0) and the LOG function are evaluated.
In the SAS/IML language, the CHOOSE function behaves similarly. See my 2011 blog post on the CHOOSE function.
Is it possible to use this function as a loop that ends once the condition is met? I am trying to use this in sas egp. If not, would you be able to suggest another function that does that?
I don't know what 'sas egp' is, but you can use the DO WHILE or DO UNTIL statements to loop until a condition is met.
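A minimal sketch of such a loop, assuming a made-up task of doubling a value until it exceeds 100, might look like this. DO UNTIL tests its condition at the bottom of each iteration, so the body always runs at least once.

```sas
data _null_;
x = 1;
/* Repeat the body until the condition becomes true */
do until (x > 100);
   x = 2*x;
end;
put x=;   /* writes x=128 to the SAS log */
run;
```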
This is normal behaviour for functions in most programming languages. That is why C has the ternary operator whose definition is that only one of the expressions is evaluated.
It looks like this: logX = x>0 ? log(x) : .;
How about extending SAS for all us C (and many other languages) programmers?
In this case, at the cost of a second IFN, you could replace x in the log with something like ifn(x > 0, x, 1) so that the log never sees anything it can't use.
True. The most compact inline solution is probably logX = ifn(x>0, log(abs(x)), .), which results in a clean SAS Log that does not contain any notes about domain errors. However, the IF-THEN/ELSE statement is more efficient and easier to understand.
Excellent explanation of why not to use IFN as an attempt to avoid out-of-domain errors.
In the interest of completeness, it seems the logX = ifn(x>0, log(abs(x)), .) approach could still pose an issue for the special case of x = 0. We could use a second IFN to get around that snag, such as logX = ifn(x>0, log(ifn(x>0, x, 1)), .), but this just reinforces how much more efficient and readable the IF-THEN/ELSE alternative is.
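For anyone who wants to try it, here is the nested-IFN expression in a complete DATA step, using the same data as the original post. This is only a sketch (the data set name LogNested is arbitrary). Because the inner IFN hands LOG the value 1 whenever x is not positive, including when x is 0 or missing, the SAS log should stay free of invalid-argument notes.

```sas
data LogNested;
input x @@;
/* The inner IFN substitutes 1 for any x that is not positive,
   so LOG only ever receives a positive argument */
logX = ifn(x>0, log(ifn(x>0, x, 1)), .);
datalines;
-1 1 2 . 0 10
;
proc print; run;
```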
Thanks, Rick. I've just reviewed the IFN and IFC functions. I like your final solution, because avoiding 'normally-occurring' errors and warnings is good.
And here I didn't like IFC and IFN just because they looked too much like the Excel IF statement (which I have hated since the first time I was forced to use it). Nice to know that I have a slightly more valid reason other than readable code to avoid them.
And the log(abs(x)) only solves one specific type of runtime error. CDF, INVCDF, PDF, and other functions have their own issues, each with a different valid range.
Exactly my point: Don't attempt to use IFN to circumvent an out-of-domain error.
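The same IF-THEN/ELSE pattern carries over to other functions. As a hedged illustration that is not taken from the discussion above, the following sketch guards the QUANTILE function for the normal distribution, assuming that probabilities strictly between 0 and 1 are the safe domain. The data set name (Quantiles) and the variable names (p, q) are hypothetical.

```sas
data Quantiles;
input p @@;
/* Assumption: only 0 < p < 1 is passed to QUANTILE;
   everything else (including a missing p) gets a missing result */
if 0 < p < 1 then q = quantile('Normal', p);
else q = .;
datalines;
-0.5 0 0.025 0.5 . 0.975 1 2
;
proc print; run;
```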
I ran into the same problem. Thank you for your explanation.