In SAS, the reserved keyword _NULL_ specifies a SAS data set that has no observations and no variables. When you specify _NULL_ as the name of an output data set, the output is not written. The _NULL_ data set is often used when you want to execute DATA step code that displays a result, defines a macro variable, writes a text file, or makes calls to the EXECUTE subroutine. In those cases, you are interested in the "side effect" of the DATA step and rarely want to write a data set to disk. This article presents six ways to use the _NULL_ data set. Because the _NULL_ keyword is used, no data set is created on disk.
#1. Use SAS as a giant calculator
You can compute a quantity in a DATA _NULL_ step and then use the PUT statement to output the answer to the SAS log. For example, the following DATA step evaluates the normal density function at x-0.5 when μ=1 and σ=2. The computation is performed twice: first using the built-in PDF function and again by using the formula for the normal density function. The SAS log shows that the answer is 0.193 in both cases.
data _NULL_; mu = 1; sigma = 2; x = 0.5; pdf = pdf("Normal", x, mu, sigma); y = exp(-(x-mu)**2 / (2*sigma**2)) / sqrt(2*constant('pi')*sigma**2); put (pdf y) (=5.3); run;
#2. Display characteristics of a data set
You can use a null DATA step to display characteristics of a data set. For example, the following DATA step uses the PUT statement to display the number of numeric and character variables in the Sashelp.Class data set. No data set is created.
data _NULL_; set Sashelp.Class; array char[*} $ _CHAR_; array num[*} _NUMERIC_; nCharVar = dim(char); nNumerVar = dim(num); put "Sashelp.Class: " nCharVar= nNumerVar= ; stop; /* stop processing after first observation */ run;
Sashelp.Class: nCharVar=2 nNumerVar=3
You can also store these values in a macro variable, as shown in the next section.
#3. Create a macro variable from a value in a data set
You can use the SYMPUT or SYMPUTX subroutines to create a SAS macro variable from a value in a SAS data set. For example, suppose you run a SAS procedure that computes some statistic in a table. Sometimes the procedure supports an option to create an output data that contains the statistic. Other times you might need to use the ODS OUTPUT statement to write the table to a SAS data set. Regardless of how the statistic gets in a data set, you can use a DATA _NULL_ step to read the data set and store the value as a macro variable.
The following statements illustrate this technique. PROC MEANS creates a table called Summary, which contains the means of all numerical variables in the Sashelp.Class data. The ODS OUTPUT statement writes the Summary table to a SAS data set called Means. The DATA _NULL_ step finds the row for the Height variable and creates a macro variable called MeanHeight that contains the statistic. You can use that macro variable in subsequent steps of your analysis.
proc means data=Sashelp.Class mean stackods; ods output Summary = Means; run; data _NULL_; set Means; /* use PROC CONTENTS to determine the columns are named Variable and Mean */ if Variable="Height" then call symputx("MeanHeight", Mean); run; %put &=MeanHeight;
For a second example, see the article "What is a factoid in SAS," which shows how to perform the same technique with a factoid table.
#4. Create macro variable from a computational result
Sometimes there is no procedure that computes the quantity that you want, or you prefer to compute the quantity yourself. The following DATA _NULL_ step counts the number of complete cases for the numerical variables in the Sashelp.Heart data. It then displays the number of complete cases and the percent of complete cases in the data. You can obtain the same results if you use PROC MI and look at the MissPattern table.
data _NULL_; set Sashelp.Heart end=eof nobs=nobs; NumCompleteCases + (nmiss(of _NUMERIC_) = 0); /* increment if all variables are nonmissing */ if eof then do; /* when all observations have been read ... */ PctComplete = NumCompleteCases / nobs; /* ... find the percentage */ put NumCompleteCases= PctComplete= PERCENT7.1; end; run;
#5. Edit a text file or ODS template "on the fly"
This is a favorite technique of Warren Kuhfeld, who is a master of writing a DATA _NULL_ step that modifies an ODS template. In fact, this technique is at the heart of the %MODSTYLE macro and the SAS macros that modify the Kaplan-Meier survival plot.
Although I am not as proficient as Warren, I wrote a blog post that introduces this template modification technique. The DATA _NULL_ step is used to modify an ODS template. It then uses CALL EXECUTE to run PROC TEMPLATE to compile the modified template.
#6. A debugging tool
All the previous tips use _NULL_ as the name of a data set that is not written to disk. It is a curious fact that you can use the _NULL_ data set in almost every SAS statement that expects a data set name!
For example, you can read from the _NULL_ data set. Although reading zero observations is not always useful, one application is to check the syntax of your SAS code. Another application is to check whether a procedure is installed on your system. For example, you can run the statements PROC ARIMA data=_NULL_; quit; to check whether you have access to the ARIMA procedure.
A third application is to use _NULL_ to suppress debugging output. During the development and debugging phase of your development, you might want to use PROC PRINT, PROC CONTENTS, and PROC MEANS to ensure that your program is working as intended. However, too much output can be a distraction, so sometimes I direct the debugging output to the _NULL_ data set where, of course, it magically vanishes! For example, the following DATA step subsets the Sashelp.Cars data. I might be unsure as to whether I created the subset correctly. If so, I can use PROC CONTENTS and PROC MEANS to display information about the subset, as follows:
data Cars; set Sashelp.Cars(keep=Type _NUMERIC_); if Type in ('Sedan', 'Sports', 'SUV', 'Truck'); /* subsetting IF statement */ run; /* FOR DEBUGGING ONLY */ %let DebugName = Cars; /* use _NULL_ to turn off debugging output */ proc contents data=&DebugName short; run; proc means data=&DebugName N Min Max; run;
If I don't want to this output (but I want the option to see it again later), I can modify the DebugName macro (%let DebugName = _NULL_;) so that the CONTENTS and MEANS procedures do not produce any output. If I do that and rerun the program, the program does not create any debugging output. However, I can easily restore the debugging output whenever I want.
In summary, the _NULL_ data set name is a valuable tool for SAS programmers. You can perform computations, create macro variables, and manipulate text files without creating a data set on disk. Although I didn't cover it in this article, you can use DATA _NULL_ in conjunction with ODS for creating customized tables and reports.
What is your favorite application of using the _NULL_ data set? Leave a comment.