How does the IF-THEN statement in SAS treat a missing value?

Almost every programming language has an IF-THEN statement that branches according to whether a Boolean expression is true or false. In SAS, the IF-THEN (or IF-THEN/ELSE) statement evaluates an expression and branches according to whether the expression is nonzero (true) or zero (false). The basic syntax is

if numeric-expression then statement;
   else statement;

One of the interesting features of the SAS language is that it is designed to handle missing values. This brings up the question: What happens if SAS encounters a missing value in an IF-THEN expression? Does the IF-THEN expression treat the missing value as "true" and execute the THEN statement, or does it treat the missing value as "false" and execute the alternative ELSE statement (if it exists)?

The answer is fully documented, but let's run an example to demonstrate the SAS behavior:

data A;
input x @@;
if x then Expr="True "; 
     else Expr="False";
datalines;
1 0 .
;
proc print noobs; run;

Ah-ha! SAS interprets a missing value as "false." More correctly, here is an excerpt from the SAS documentation:

SAS evaluates the expression in an IF-THEN statement to produce a result that is either non-zero, zero, or missing. A non-zero and nonmissing result causes the expression to be true; a result of zero or missing causes the expression to be false.
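
As a small sketch of that rule (the data set name B and the particular values are my own choices for illustration, not taken from the documentation), the following DATA step runs the same test on negative, fractional, and special missing values. If the rule holds as quoted, every nonzero, nonmissing value should be labeled True, and 0, ., .A, and ._ should be labeled False:

```sas
data B;
/* loop over sample values, including the special missing values .A and ._ */
do x = 2, -1, 0.5, 0, ., .A, ._;
   if x then Expr="True ";
        else Expr="False";
   output;
end;
run;
proc print data=B noobs; run;
```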

This treatment of missing values is handled consistently by other SAS languages and in other conditional statements. For example, the CHOOSE function in the SAS/IML language is a vector alternative to the IF-THEN/ELSE statement, but it handles missing values by using the same rules:

proc iml;
x  = {1, 0, .};
Expr = choose(x,"True","False");
print x Expr;

The output is identical to the previous output from the DATA step and PROC PRINT.

If you do not want missing values to be treated as "false," then do not reference a variable directly, but instead use a Boolean expression in the IF-THEN statement. For example, in the following statement a missing value results in the THEN statement being executed, whereas all other numerical values continue to behave as expected:

if x^=0 then ...;
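
To see the difference side by side, here is a minimal sketch (the data set C and the variables Direct and Compare are hypothetical names chosen for this illustration) that applies both forms of the test to the same three values. The direct reference should label the missing value as False, whereas the comparison x^=0 should label it True, because a missing value is not equal to zero:

```sas
data C;
input x @@;
length Direct Compare $5;
/* direct reference: missing is treated as false */
if x    then Direct ="True"; else Direct ="False";
/* comparison: missing is not equal to 0, so the result is true */
if x^=0 then Compare="True"; else Compare="False";
datalines;
1 0 .
;
proc print data=C noobs; run;
```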

Have you encountered places in SAS where missing values are handled in a surprising way? Post your favorite example in the comments.



  1. Max
    Posted July 8, 2013 at 10:15 am

    The obvious "surprising way" is that SAS considers missing values to be less than any nonmissing numeric value (i.e., .<0 is true), which I'm sure screws up almost everyone's program at least once before they learn their lesson.

    Also, the missing() function is nice because it can handle either character or numeric.
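
    A quick sketch of both points (the variable names are made up for this illustration): the comparison x < 0 should evaluate to 1 when x is missing, and MISSING should return 1 for a missing numeric value and for a blank character value alike.

    ```sas
    data _null_;
    x = .;
    c = " ";
    LessThanZero = (x < 0);      /* missing sorts below every number */
    NumMissing   = missing(x);   /* numeric argument */
    CharMissing  = missing(c);   /* character argument */
    put LessThanZero= NumMissing= CharMissing=;
    run;
    ```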

  2. Jean Slosek
    Posted July 16, 2013 at 2:59 pm

    A very simple way to address this is to include, at the top of your code, a tight condition that specifies what to do when a missing value is encountered. This also works when you are using more than one variable, but for the example I'll keep it simple.


    Data revised;
    set mydata;
    length age_group $8;  /* age_group holds text labels, so declare it as character */
    if age = . or age gt 120 then age_group=' ';
    else if 0<=age<=5 then age_group='0-5 yrs';
    else if 6<=age<=18 then age_group='6-18 yrs';
    else if age gt 18 then age_group='19+ yrs';

    • Jean Slosek
      Posted July 16, 2013 at 3:01 pm

      Sorry I forgot to add the run; at end but I think that would be obvious to most.

      • Posted July 16, 2013 at 3:13 pm

        Thanks for the comment. Missing values and out-of-range values come up a lot. An alternative approach for your example is to use PROC FORMAT to define a user-defined format on AGE, rather than create a new variable.
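
        A minimal sketch of that alternative, assuming the same cutoffs as in the DATA step above (the format name agegrp is hypothetical). The OTHER keyword catches missing values and anything that falls outside the listed ranges:

        ```sas
        proc format;
        value agegrp
           0 - 5     = '0-5 yrs'
           6 - 18    = '6-18 yrs'
           18 <- 120 = '19+ yrs'    /* greater than 18, up to and including 120 */
           other     = ' ';         /* missing values and values not covered above */
        run;

        proc print data=mydata;
        format age agegrp.;         /* display AGE by group; the stored values are unchanged */
        run;
        ```

        Because the format only changes how AGE is displayed, the original values remain available for other analyses.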

  3. Dave Houg
    Posted July 16, 2013 at 4:09 pm

    Short memory tip: You have to get to one to be true. Sorta like dating, missing a date doesn't get you a date, being a big fat zero doesn't get you a date.

  4. David Pasta
    Posted July 17, 2013 at 12:55 am

    The problem with code like
    if age = . or age gt 120 then age_group=.;
    is that it doesn't handle other missing values such as .A or ._, so I greatly prefer
    if missing(age) or age gt 120 then age_group=.;

    I also object to the commonly-used
    if . < age < 0 then ...
    or the slightly more correct
    if .z < age < 0 then ...
    preferring the explicit
    if not missing(age) and age < 0 then ...
    which doesn't require arcane knowledge of the internal representation of missing values in SAS (including the ordering of special missing values).

    In short, if you want to test for missingness, do not test for equality to . but instead test for missingness.
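
    To make the difference concrete, here is a small sketch (the variable names are invented for the illustration) that evaluates each test when age holds the special missing value .A. It relies on the SAS sort order for missing values, in which ._ is smallest, followed by ., then .A through .Z, and all of them are smaller than any number. Under that ordering, the equality test and the "." range test should both misclassify .A, whereas MISSING and the ".z" range test should handle it:

    ```sas
    data _null_;
    age = .A;
    EqualsDot = (age = .);                      /* equality to . does not match .A */
    IsMissing = missing(age);                   /* detects every kind of missing value */
    NaiveNeg  = (. < age < 0);                  /* .A is greater than . and less than 0 */
    WithZ     = (.z < age < 0);                 /* .A is less than .Z, so the test fails */
    Explicit  = (not missing(age) and age < 0); /* no knowledge of ordering needed */
    put EqualsDot= IsMissing= NaiveNeg= WithZ= Explicit=;
    run;
    ```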

  5. Anders Sköllermo
    Posted July 18, 2013 at 6:25 am

    Hi! Please note the effect of negative values.
    data A;
    input x @@;
    if x then Expr="True ";
    else Expr="False";
    datalines;
    1 -1 0 .
    ;

    proc print noobs; run;

    gives the result

    x Expr
    1 True
    -1 True
    0 False
    . False

    Perhaps the line -1 True is a surprise to some people.

  6. Anders Sköllermo
    Posted July 18, 2013 at 6:36 am

    Hi! Here is a mistake I made recently. Question: how can you find it easily?
    This is not a SAS error; it is a programming mistake that is perhaps not easy to see if you are tired. The effect in this case is "radical".

    data A;
    input x @@;
    if x then; Expr="True "; /* The left-most semicolon added by mistake. */
    datalines;
    1 0 .
    ;

    proc print noobs; run;

    x Expr
    1 True
    0 True
    . True

    / Br Anders

  7. Mounika
    Posted July 29, 2013 at 6:51 am

    data A;
    input x @@;
    if not x then Expr="True ";
    else Expr="False";
    datalines;
    1 0 .
    ;
    proc print noobs; run;

    Here the answer is: False True True.

    Can't we say that SAS interprets a missing value as true?

    Please clarify.

  8. Posted July 29, 2013 at 7:02 am

    No. A missing value is treated as FALSE, as shown and explained in the blog post.
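
    To spell out why the example with NOT prints True for the missing value: the NOT operator negates the truth value of x. Because a missing value is false, NOT x is 1 (true) when x is missing, which is why the THEN statement runs. A small sketch (the data set D and the variable NotX are names made up here) displays the value of NOT x for the same data:

    ```sas
    data D;
    input x @@;
    NotX = not x;   /* logical negation: 1 when x is zero or missing, 0 otherwise */
    datalines;
    1 0 .
    ;
    proc print data=D noobs; run;
    ```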
