Two ways to specify how SAS displays missing values

In statistical tables in SAS, a dot (.) represents a numerical missing value. Although a dot is the default symbol in SAS, other languages use other symbols. The R language prints the symbol NA, which stands for "not available." The MATLAB language uses NaN ("Not a Number"). In Python, many programmers use the built-in None object, but there is some variation: the Pandas package uses both NaN and None for missing values, whereas the NumPy package uses NaN. Programmers also use these symbols when defining data, such as in the SAS assignment statement x = .;

In SAS you can control how missing values are displayed in tables:

  • The MISSING= system option, which you set on the global OPTIONS statement, enables you to specify a single character to use in SAS tables. Some popular choices are a blank (' '), the letter 'M', a hyphen ('-'), and an asterisk ('*').
  • The FORMAT procedure enables you to specify an arbitrary string to display for a missing value. Popular choices include "N/A", "Missing", and "Don't Know".

The MISSING system option

SAS supports many system options, which you can set by using the global OPTIONS statement. The MISSING option specifies a single character to use to display numeric missing values. You can use PROC OPTIONS to determine the value in your SAS session, and whether that value is the default SAS value or was set in a configuration file at system start-up:

/* find the current value for the system option MISSING */
proc options option=MISSING value;
run;

Option Value Information For SAS Option MISSING
    Value: .
    Scope: Default
    How option value set: Shipped Default

On my system, the value of the MISSING option is the dot. This is the default value as shipped in SAS software.

To demonstrate what this means, let's create some data that has missing values and use PROC PRINT to display the data in a table:

data Have;
input x @@;
datalines;
1 . 3 . 5
;
 
proc print data=Have; run;

Two observations contain missing values. These values are displayed by using a dot. The next section shows how to override the default and display a different symbol for a missing value.

Get and reset the symbol for a missing value

You can use the OPTIONS MISSING= statement to specify a new symbol for displaying missing values in tables. Before I overwrite a system default, I like to save a copy of the current option so that I can restore the option later, if I choose. This is especially important if you are writing a macro or program that will be used by others. It is considered "rude" to change someone's system options, so I suggest that you store the current option and then restore it before your macro or program exits.

You can use the GETOPTION function to get the value of a SAS system option. You can call the GETOPTION function in the SAS DATA step and use CALL SYMPUT to store the value in a macro variable, or you can skip the DATA step and wrap the call to GETOPTION in the %SYSFUNC macro function, as follows:

/* get the current value and save it in a macro variable */
%let CurrMissVal = %sysfunc(getoption(MISSING));
%put &=CurrMissVal;

CURRMISSVAL=.
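
For comparison, the DATA step route mentioned above might look like the following minimal sketch. The macro variable name CurrMissVal2 is hypothetical, chosen only so that it does not overwrite CurrMissVal. The sketch uses CALL SYMPUTX rather than CALL SYMPUT because SYMPUTX strips leading and trailing blanks from the stored value.

```sas
/* DATA step alternative: save the current MISSING value in a macro variable. */
/* CurrMissVal2 is a hypothetical name used only for this sketch.             */
data _null_;
   call symputx('CurrMissVal2', getoption('MISSING'));
run;

%put &=CurrMissVal2;
```

Both approaches store the same value. The %SYSFUNC version is shorter because it does not require a DATA step.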

You can then use the OPTIONS statement to set a new value for the MISSING option, such as a hyphen character. When you run PROC PRINT, the missing values are displayed by using a hyphen instead of a dot. After calling PROC PRINT, you can restore the previous value of the MISSING option, if you desire:

options MISSING='-';          /* override the current option (for one character) */
proc print data=Have; run;
options MISSING=&CurrMissVal; /* reset the previous option */

After changing the MISSING system option, missing values display as a hyphen. You can experiment with other symbols, such as an asterisk ('*'). The symbol is used to display numerical missing values in all SAS tables, even those created by statistical procedures.
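
The advice to save and restore the option inside a macro can be sketched as follows. This is only an illustration: the macro name PrintWithMissing and its parameters dsname and symbol are hypothetical and are not part of SAS. The sketch quotes the option value on the assumption that quoting keeps the OPTIONS statement valid when the saved value is a blank.

```sas
/* Hypothetical macro: print a data set by using a temporary MISSING symbol, */
/* then restore whatever value the caller had before the macro ran.          */
%macro PrintWithMissing(dsname, symbol);
   %local savedMissing;
   %let savedMissing = %sysfunc(getoption(MISSING));  /* save the caller's value */
   options MISSING="&symbol";                         /* temporary override      */
   proc print data=&dsname; run;
   options MISSING="&savedMissing";                   /* restore before exiting  */
%mend PrintWithMissing;

/* example call: display missing values in the Have data set as asterisks */
%PrintWithMissing(Have, *);
```

Because the saved value is stored in a %LOCAL macro variable, the macro does not create or overwrite a macro variable in the caller's session.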

Use a format to display missing values

It is often convenient to use a single character to represent a missing value in a statistical table. However, if you are creating a report for management or a client, you might want to use a more informative representation, such as 'N/A' or 'Missing'. You can use PROC FORMAT to define a custom format in which missing values are displayed by whatever character string you specify. For example, the following statements define a format in which missing values are displayed as 'N/A':

/* define a custom FORMAT to display numerical missing as 'N/A' */
proc format;
value MISSPRNT .='N/A';
run;
 
proc print data=Have; 
   format x MISSPRNT.;
run;

Notice that you must apply the format to every variable that contains a missing value. This might be inconvenient if your report contains many variables, although you can try using the statement format _numeric_ MISSPRNT.; In contrast, changing the global MISSING system option is much easier, and it changes the display of missing values for all numerical variables in the SAS session.
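
As a minimal sketch of the _NUMERIC_ approach, the following statements apply the format to every numeric variable in a small example. The data set Have2 and the variable y are hypothetical and were invented for this illustration. The format definition is repeated so that the example runs on its own.

```sas
/* repeat the format definition so that this example is self-contained */
proc format;
value MISSPRNT .='N/A';
run;

/* hypothetical data set that has two numeric variables */
data Have2;
input x y @@;
datalines;
1 10  . 20  3 .  . .  5 50
;

proc print data=Have2;
   format _numeric_ MISSPRNT.;   /* apply the format to every numeric variable */
run;
```

One caveat: a FORMAT statement replaces any format that is already associated with a variable for that step, so numeric variables that rely on other formats (for example, dates) would lose them in this PROC PRINT call.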

Summary

By default, SAS displays a dot (.) to represent a numerical missing value. You can use the OPTIONS MISSING= statement to specify a new character to use in SAS tables. Or, if you want to create a report in which missing values are represented by a string of characters (such as "N/A"), you can use PROC FORMAT to specify the string.

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
