In statistical tables in SAS, a dot (.) represents a numerical missing value. Although a dot is the default symbol in SAS, other languages use other symbols. The R language prints the symbol NA, which stands for "not available." The MATLAB language uses NaN ("Not a Number"). In Python, many programmers use the built-in None object, but there is some variation: the Pandas package uses both NaN and None for missing values, whereas the NumPy package uses NaN. Programmers also use these values when defining data, such as in the SAS statement x = .;
In SAS you can control how missing values are displayed in tables:
- The MISSING system option on the global OPTIONS statement enables you to specify a new character to use in SAS tables. Some popular choices are a blank (" "), the letter 'M', a hyphen ("-"), and an asterisk ("*").
- The FORMAT procedure enables you to specify an arbitrary string to display for a missing value. Popular choices include "N/A", "Missing", and "Don't Know".
The MISSING system option
SAS supports many system options, which you can set by using the global OPTIONS statement. The MISSING option specifies a single character to use to display numeric missing values. You can use PROC OPTIONS to determine the value on your session of SAS, and whether the value is the default SAS value or was set in a configuration file at system start-up:
    /* find the current value for the system option MISSING */
    proc options option=MISSING value;
    run;
    Option Value Information For SAS Option MISSING
        Value: .
        Scope: Default
        How option value set: Shipped Default
On my system, the value of the MISSING option is the dot. This is the default value as shipped in SAS software.
To demonstrate what this means, let's create some data that has missing values and use PROC PRINT to display the data in a table:
    data Have;
    input x @@;
    datalines;
    1 . 3 . 5
    ;

    proc print data=Have;
    run;
Two observations contain missing values. These values are displayed by using a dot. The next section shows how to override the default and display a different symbol for a missing value.
Get and reset the symbol for a missing value
You can use the OPTIONS MISSING= statement to specify a new symbol for displaying missing values in tables. Before I overwrite a system default, I like to save a copy of the current value so that I can restore it later, if I choose. This is especially important if you are writing a macro or program that will be used by others. It is considered "rude" to change someone's system options, so I suggest that you store the current value and then restore it before your macro or program exits.
You can use the GETOPTION function to get the value of a SAS system option. You can call the GETOPTION function in the SAS DATA step and use the CALL SYMPUT routine to store the value in a macro variable, or you can skip the DATA step and wrap the call to GETOPTION in the %SYSFUNC macro function, as follows:
    /* get the current value and save it in a macro variable */
    %let CurrMissVal = %sysfunc(getoption(MISSING));
    %put &=CurrMissVal;
    CURRMISSVAL=.
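For comparison, the DATA step alternative mentioned above might look like the following minimal sketch. The macro variable name CurrMissVal2 is a hypothetical name chosen for illustration, and the sketch uses CALL SYMPUTX (a variant of CALL SYMPUT that strips leading and trailing blanks) rather than CALL SYMPUT itself:

```sas
/* alternative sketch: get the option value in a DATA step */
data _null_;
   /* GETOPTION takes the option name as a character argument */
   optval = getoption('MISSING');
   /* CurrMissVal2 is a hypothetical macro variable name */
   call symputx('CurrMissVal2', optval);
run;

%put &=CurrMissVal2;
```

The %SYSFUNC approach is shorter, but the DATA step form can be convenient if you are already inside a DATA step for other reasons.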
You can then use the OPTIONS statement to set a new value for the MISSING option, such as a hyphen character. When you run PROC PRINT, the missing values are displayed by using a hyphen instead of a dot. After calling PROC PRINT, you can restore the previous value of the MISSING option, if you desire:
    options MISSING='-';             /* override the current option (for one character) */
    proc print data=Have;
    run;
    options MISSING=&CurrMissVal;    /* reset the previous option */
After changing the MISSING system option, missing values display as a hyphen. You can experiment with other symbols, such as an asterisk ('*'). The symbol is used to display numerical missing values in all SAS tables, even those created by statistical procedures.
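If you are writing a macro for others, the save-and-restore advice above can be packaged inside the macro itself. The following is only a sketch under stated assumptions: the macro name PrintWithMissing and its parameters dsname and symbol are hypothetical, and the option values are enclosed in double quotes on the assumption that the saved value does not itself contain a double quote.

```sas
/* hypothetical macro: print a data set by using a temporary missing symbol */
%macro PrintWithMissing(dsname, symbol=-);
   %local SavedMissing;
   /* save the caller's current value of the MISSING option */
   %let SavedMissing = %sysfunc(getoption(MISSING));

   options MISSING="&symbol";        /* temporarily override the option */
   proc print data=&dsname;
   run;

   options MISSING="&SavedMissing";  /* restore the caller's value */
%mend PrintWithMissing;

/* example call: display missing values as asterisks */
%PrintWithMissing(Have, symbol=*);
```

Because the macro restores the saved value before it exits, the caller's session should be left as it was found.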
Use a format to display missing values
It is often convenient to use a single character to represent a missing value in a statistical table. However, if you are creating a report for management or a client, you might want to use a more informative representation, such as 'N/A' or 'Missing'. You can use PROC FORMAT to define a custom format in which missing values are displayed by whatever character string you specify. For example, the following statements define a format in which missing values are displayed as 'N/A':
    /* define a custom FORMAT to display numerical missing as 'N/A' */
    proc format;
    value MISSPRNT .='N/A';
    run;

    proc print data=Have;
    format x MISSPRNT.;
    run;
Notice that you must apply the format to every variable that contains a missing value. This might be inconvenient if your report contains many variables, although you can try using the statement format _numeric_ MISSPRNT.; In contrast, changing the global MISSING system option is much easier, and it changes the display of missing values for all numerical variables in the SAS session.
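As a rough illustration of the _NUMERIC_ shortcut, the following sketch applies the format to every numeric variable at once. The data set Have2 and its variable y are hypothetical and exist only for this example; the format definition is repeated so that the block is self-contained.

```sas
/* hypothetical data set that has two numeric variables */
data Have2;
   input x y @@;
   datalines;
1 10  . 20  3 .  . .  5 50
;

proc format;
   value MISSPRNT .='N/A';
run;

/* apply the format to all numeric variables in one statement */
proc print data=Have2;
   format _numeric_ MISSPRNT.;
run;
```

The FORMAT statement affects only this PROC PRINT step, so other procedures continue to use whatever the MISSING system option specifies.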
Summary
By default, SAS displays a dot (.) to represent a numerical missing value. You can use the OPTIONS MISSING= statement to specify a new character to use in SAS tables. Or, if you want to create a report in which missing values are represented by a string of characters (such as "N/A"), you can use PROC FORMAT to specify the string.
Further Reading
- Bost, C. (2014) "What You’re Missing About Missing Values", Proceedings of the SAS® Global Forum 2014 Conference.
- Foley, M. (2005) "MISSING VALUES: Everything You Ever Wanted to Know", Proceedings of the WUSS Conference, 2005.