The SAS® macro language gives SAS programmers versatility and efficiency in code development. It lets SAS users modularize their code into “write once, use many times” components and, in many cases, automatically generate data-driven SAS code.
Related blog post: Multi-purpose macro function for getting information about data sets
Macro language and macro processor
Generally, SAS software processes your SAS program step by step, first scanning it for macro language objects: macro variables referenced as &somename, and macros referenced as %somename. If any are found, SAS activates the macro processor, which resolves and substitutes those macro references according to the macro language syntax before SAS compiles and executes your programming steps.
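As a minimal sketch of this substitution (the macro variable name dsname is made up for illustration), the macro processor replaces &dsname with its text before the PROC step is compiled:

```sas
/* dsname is a hypothetical macro variable name used only in this sketch */
%let dsname = SASHELP.CARS;

/* The macro processor resolves &dsname first, so the SAS compiler
   receives: proc print data=SASHELP.CARS(obs=5); */
proc print data=&dsname(obs=5);
run;
```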
SAS macro language vs. SAS programming language
A SAS program usually consists of two, often interwoven, layers: a macro layer and a non-macro layer, each with its own syntax and its own timing of compilation and execution. In other words, SAS code is a combination of two distinct languages:
- SAS programming language (comprised of DATA steps, PROC steps and global statements such as LIBNAME, OPTIONS, TITLE etc.)
- SAS macro language (comprised of %LET, %IF, %DO, macro functions etc.), which is processed separately from, and before, the SAS programming language code is compiled and executed.
The difference between them is like the difference between cooking a meal and eating the meal. In this analogy meal=code, cooking=SAS macro language, eating=SAS programming language. A clear understanding of this difference is the key to becoming a successful SAS programmer.
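The timing difference can be seen in a small sketch like the one below. The macro variable name stage is an assumption made for illustration. The reference &stage inside the DATA step is resolved while the step is being compiled, whereas CALL SYMPUTX assigns its value only when the step executes:

```sas
/* stage is a hypothetical macro variable name used only in this sketch */
%let stage = macro time;

data _null_;
   /* Executes at run time, after &stage below has already been resolved */
   call symputx('stage', 'run time');
   /* Resolved during compilation, so this writes: Inside the step: macro time */
   put "Inside the step: &stage";
run;

/* The step has finished, so this writes: After the step: run time */
%put After the step: &stage;
```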
Two types of SAS macros
There are two distinct types of SAS macros:
- Macros that generate some SAS programming language code which can span across SAS statements or steps;
- Macros that generate some string values which can be used as part of SAS programming language code or data values, but they are not complete SAS statements or steps. This type does not generate any SAS executable code, just a value.
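To make the distinction concrete, here is a minimal sketch of one macro of each type. The macro names print_head and today_iso are hypothetical and are not part of SAS:

```sas
/* Type 1: generates a complete PROC step (hypothetical macro name) */
%macro print_head(ds, n=5);
   proc print data=&ds(obs=&n);
   run;
%mend print_head;

/* Type 2: generates only a text value, here today's date as yyyy-mm-dd
   (hypothetical macro name) */
%macro today_iso;
   %sysfunc(today(), yymmdd10.)
%mend today_iso;

/* The type 1 macro is invoked where a whole step is expected */
%print_head(SASHELP.CLASS)

/* The type 2 macro is invoked where a value is expected */
%let report_date = %today_iso;
%put &=report_date;
```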
What is a SAS macro function?
A SAS macro function is a SAS macro that generates a value. In other words, it is the type 2 macro described above. Like any SAS macro, a SAS macro function can have any number (zero or more) of positional and/or named parameters (arguments). SAS users may define their own macro functions, but in doing so you may not use any SAS language syntax; only SAS macro language syntax is allowed. You can use existing macro functions in your own macro function definition. One of the most powerful of these is the %SYSFUNC macro function, which brings a wealth of SAS language functions into the SAS macro language.
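The following sketch shows %SYSFUNC calling ordinary SAS language functions from macro code, first directly and then inside a small user-defined macro function. The macro name dsexists and the macro variable names are assumptions made for illustration:

```sas
/* Inside %SYSFUNC, character arguments are written without quotes */
%let today  = %sysfunc(today(), date9.);           /* optional format as 2nd argument */
%let nwords = %sysfunc(countw(alpha beta gamma));  /* counts blank-delimited words */
%put &=today &=nwords;

/* A hypothetical user-defined macro function built on %SYSFUNC:
   returns 1 if the data set exists, 0 otherwise */
%macro dsexists(ds);
   %sysfunc(exist(&ds))
%mend dsexists;

%put SASHELP.CARS exists: %dsexists(SASHELP.CARS);
```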
Sources of SAS macro functions
SAS macro functions may come from the following three sources.
1. Pre-built macro functions
Pre-built macro functions are part of the macro processor. They include %eval, %length, %quote, %scan, %str, %sysfunc, %upcase, etc. A complete list of the pre-built SAS macro functions is available in the SAS documentation.
2. Auto-call macro functions
Auto-call macros come in both types: some are type 1 (macros) and some are type 2 (macro functions), such as %cmpres, %left, %lowcase, %trim, %verify, etc. These macro functions supplement the pre-built macro functions. The main difference is that auto-call macro functions are written in the SAS macro language, just like user-defined macro functions, and are made available to you without your having to define or include them in your programs. They come with your SAS software installation and are usually pre-configured for you through the MAUTOSOURCE and SASAUTOS= system options. They may include several macro libraries, depending on the SAS products licensed at your site. For example, for my Base SAS installation the auto-call macro library is in the following folder:
C:\Program Files\SASHome\SASFoundation\9.4\core\sasmacro
Here is a selected list of auto-call macros provided with SAS software.
From the usage standpoint, you will not notice any difference between the pre-built and the auto-call macro functions. For example, macro function %upcase() is pre-built, while macro function %lowcase() is an auto-call macro function. They belong to entirely different families, but we use them as if they were complementary siblings.
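A quick way to see this is to call both side by side. This sketch assumes the default configuration, in which the auto-call facility is enabled:

```sas
/* Pre-built macro function, part of the macro processor */
%put %upcase(sashelp.cars);

/* Auto-call macro function, compiled from the auto-call library on first use */
%put %lowcase(SASHELP.CARS);

/* Check that the auto-call facility is on and where it searches */
%put %sysfunc(getoption(mautosource));
%put %sysfunc(getoption(sasautos));
```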
3. User-defined macro functions
Finally, there are user-defined macro functions that do not come with the SAS installation. These are the macro functions that you define on your own. Usually, they are kept separately from the auto-call macros, mainly to distinguish them from the SAS-supplied ones.
To enable access to your own SAS macro library in addition to the auto-call macro library (or libraries), you can use the INSERT= system option:
options insert=(sasautos="path_to_your_own_macro_library_folder");
Instead of replacing the SASAUTOS value, this option inserts an additional value into the existing SASAUTOS option as the first value, thus allowing you to tap into your own macro library first, and then also into pre-set SAS auto-call libraries.
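As a sketch of how this might look in practice, assume your macros live in a folder such as C:\my_macros (a hypothetical path), with each macro stored in a file named after it, for example nobs.sas containing the definition of %nobs:

```sas
/* Hypothetical folder; replace with the location of your own macro files */
options mautosource insert=(sasautos="C:\my_macros");

/* Display the resulting search path; your folder should now come first */
%put %sysfunc(getoption(sasautos));

/* The companion APPEND= option adds a folder at the end of the search path
   instead (also a hypothetical folder) */
options append=(sasautos="C:\my_shared_macros");
```

With INSERT=, a macro in your own folder is found before a same-named macro in the SAS-supplied libraries; with APPEND=, the SAS-supplied one is found first.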
Creating a user-defined macro function
Let’s consider the following example. Suppose we want to create a macro function that takes a data set name as an argument and returns a value equal to the number of observations in that data set.
We know that the following code calculates the number of observations in a data set:
data _null_;
   call symputx('NOBS',n);
   stop;
   set SASHELP.CARS nobs=n;
run;
%put &=NOBS;
/* SAS log: NOBS=428 */
Can we create a SAS macro function by enclosing this code in a macro? Something like this:
%macro nobs(dset=,result=);
   %global &result;
   data _null_;
      call symputx("&result",n);
      stop;
      set &dset nobs=n;
   run;
%mend nobs;
The answer is “No”. Yes, we created a valid macro; we can invoke this macro to produce the result:
%nobs(dset=SASHELP.CARS, result=NOBS);
%put &=NOBS;
/* SAS log: NOBS=428 */
But this is not a macro function. Remember that a type 2 macro does not generate any SAS programming language code, just a value. This macro, however, does generate SAS code, which assigns a value to the macro variable specified as the second argument (result=NOBS).
In order to create a valid macro function, our macro should not have any SAS language code in it – neither a DATA step, nor a PROC step. It may only be comprised of the SAS macro language code. Here it is:
%macro nobs(dset);
   %local dsid n;
   /* Open the data set; dsid is 0 if it cannot be opened */
   %let dsid = %sysfunc(open(&dset));
   %if &dsid %then
   %do;
      /* NLOBS = number of logical observations (excludes deleted ones) */
      %let n = %sysfunc(attrn(&dsid,nlobs));
      %let dsid = %sysfunc(close(&dsid));
   %end;
   %else %put %sysfunc(sysmsg());
   &n
%mend nobs;
When the macro processor executes this macro, the only object that gets passed to the SAS language compiler is the value shown in the line right before the %mend. This is the calculated number of observations (denoted by &n). It is the only thing visible to the SAS language compiler; the rest is macro language code, which is visible to and handled by the SAS macro processor.
IMPORTANT: When defining a SAS macro function, always use a %local statement to list ALL macro variables that are created in your macro, to ensure they will not accidentally overwrite same-named macro variables in the calling environment. You don’t need to declare macro parameters as %local, since they are always local automatically.
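The effect of omitting %local can be sketched with two tiny macros. The macro names leaky and safe and the variable name i are made up for illustration:

```sas
/* A macro variable in the calling (here: global) environment */
%let i = 100;

/* No %local: %let finds the existing outer variable i and overwrites it */
%macro leaky;
   %let i = 1;
%mend leaky;

/* With %local: i belongs to the macro, the outer i is untouched */
%macro safe;
   %local i;
   %let i = 1;
%mend safe;

%safe
%put After safe: &=i;    /* I=100 */

%leaky
%put After leaky: &=i;   /* I=1 */
```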
SAS macro functions usage examples
When a macro function is defined this way, wherever you place its invocation %nobs(SASHELP.CARS) in your SAS code it will be evaluated and replaced with the corresponding value (in this case it is number 428) by the SAS macro processor. That way you can avoid substandard hard-coding and make your SAS code dynamic and powerful. You can use macro functions in many SAS coding contexts. For example:
- Assignment statements for macro variable: %let NOBS=%nobs(SASHELP.CARS);
- Assignment statement in a DATA step: x = %nobs(SASHELP.CARS);
- As part of an expression in a DATA step: x = %nobs(SASHELP.CARS)/2 + 3;
- As a value of a DATA step do loop: do i=1 to %nobs(SASHELP.CARS);
- As a value of a macro do loop: %do i=1 %to %nobs(SASHELP.CARS);
- As part of condition in IF statement: if %nobs(SASHELP.CARS) > 500 then do;
And so on. It's important to note that arguments (parameters) to macro functions must be either SAS constants or macro expressions resolving to SAS constants. Since macro functions are resolved by the macro processor before the SAS program executes, their arguments cannot be DATA step variables, whose values exist only at run time. This is the key difference between macro functions and SAS functions.
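The sketch below illustrates the point. It repeats the %nobs definition from above so that it can run on its own, then contrasts a valid macro-time argument with a run-time alternative for the case where the data set name is held in a DATA step variable. The variable names ds, dsid, n, rc and msg are chosen for illustration:

```sas
/* %nobs as defined earlier in this post, repeated so the sketch is self-contained */
%macro nobs(dset);
   %local dsid n;
   %let dsid = %sysfunc(open(&dset));
   %if &dsid %then
   %do;
      %let n = %sysfunc(attrn(&dsid,nlobs));
      %let dsid = %sysfunc(close(&dsid));
   %end;
   %else %put %sysfunc(sysmsg());
   &n
%mend nobs;

/* Works: the argument is a macro expression that resolves to constant text */
%let ds = SASHELP.CLASS;
%put Observations in &ds: %nobs(&ds);

/* When the name is a DATA step variable value, the macro function cannot see it;
   calling the same SAS functions directly at run time is one alternative */
data _null_;
   length ds $41;
   do ds = 'SASHELP.CARS', 'SASHELP.CLASS';
      dsid = open(ds);
      if dsid then do;
         n  = attrn(dsid, 'NLOBS');
         rc = close(dsid);
         put ds= n=;
      end;
      else do;
         msg = sysmsg();
         put msg=;
      end;
   end;
run;
```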
Your thoughts?
Do you find this post useful? Do you use SAS macro functions? Can you suggest other usage examples? Please share with us in the Comments below.
Additional resources
- Passing comma-delimited values into SAS macros and macro functions
- CALL EXECUTE made easy for SAS data-driven programming
- Data-driven SAS macro loops
- How to Create Macro Variables and Use Macro Functions (SAS video tutorial)
14 Comments
Great article as always, Leonid!
nicely explained Leonid, I was reminded of type 2 diabetes when you were explaining the type 1 & type 2 macros. thanks
Thank you, Charu! I am not sure whether there is any connection between type 2 macros and type 2 diabetes, all I know is that exercises will help to improve both... 🙂
Great Article Leonid,
Thanks for sharing this.
Thank you, Kalind, I am glad you liked it.
After wrestling with macros for a while I have settled on three types distinguished by a naming convention:
%proc_something() is syntactically equivalent to a PROC and does execute.
%SAS_something() or %stmt_something() is syntactically equivalent to a statement or part of a statement, and therefore does not execute.
%something() or %fsomething() is a function and is syntactically equivalent to a string, number or identifier, and therefore does not execute.
The second two follow your rules.
That leads me to an idea: why not explicitly declare a macro to be a pure function (only macro code) or any collection of SAS code? The macro compiler could police this and the result would be more robust code. The obvious syntax is %macro /function name or %macro /sub name. Even better would be to declare the function return type, but the SAS language itself is vague about this.
Thank you, Peter, for your comment and the idea. I will ask our R&D to comment on it. Could you please provide a more detailed description of your idea / vision?
If I may, a small comment about "why not explicitly declare a macro to be a pure function (only macro code) or any collection of SAS code".
When you look at %macros, which are in their design wrappers for (any possible) SAS code, you can see that they are something far more general than just functions.
With the advent of the DoSubL() function, this distinction between "pure" and "not pure" macro code is less and less relevant. You can embed whole groups of data steps and procedures inside a macro and still have it work as "pure macro code". If you look at the example in my previous comment, the macro presented there can be executed inside another data step and it will behave like a "pure macro function" even though it contains data step code.
About "function return type": since everything the macro language digests is text (in fact, the text of our code), the result it returns is also only text (which is then compiled and eventually executed) 🙂
All the best
Bart
Thank you, Bart, for the follow-up. Your point is well taken. Still, in my view it makes sense to view macro function definition as a pure macro code. With or without dosubl(), all statements in SAS macro function definition must start with % (except that last calculated value that we pass back to SAS language compiler). That makes them "pure macro code".
The dosubl() function is not used in a macro function definition on its own; it can only be used within the %sysfunc() macro function, which keeps its usage within the definition of "pure macro code". My take on this is that using %sysfunc() in conjunction with the dosubl() function provides a back door to execute some SAS or even non-SAS (with proper API) code using a spawned SAS session and return to the macro processor before it passes its generated code to the SAS programming language compiler of the main session. But still, having the % sign in front of the %sysfunc(dosubl()) construct makes it macro language syntax.
Bart,
Could that macro use data set variables as parameter?
Say, I have a dataset DATASETS that has a variable DS with 2 level dataset names.
Could I then have something like:
data _null_;
   length d 8;
   set datasets;
   d = %nobs(dset=<use the DS variable>, result=NOBS);
   put d=;
run;
Thanks, Lex
Hi Lex!
With your example the setup would be probably something like this (I added line `options nonotes nosource %str(;)` to the macro to make it less "talkative"):
With such one the code you would like to run should be adjusted to the following form:
The Resolve() function executes macro and returns the value of `&&&result.`. Without the Resolve() the result of the `%nobs` would be just a static value, e.g. 42, which wouldn't give us a dynamic approach.
But frankly speaking, for your task I would use either a "classical" approach with the OPEN(), CLOSE(), and ATTRN() functions:
or I would use proc SQL and the `dictionary.tables`:
because, from my (and others') experience, `DoSubL()` works very well in a setup like `DoSUBL( DO-loop )` [a loop inside the dosubl] and is quite slow in a setup like `DO-loop( DoSubL() )` [the dosubl inside a loop]. And when you think about it, it's quite "expected" behaviour, since each time DoSubL() is executed a "side session" is invoked.
Hope it helps!
All the best
Bart
Thanks, Bart!
Very helpful.
The example was just an example. There are a lot of possibilities!
Lex
Leonid,
Great article! (as usual)
Two observations if I may.
1) I would add that one significant difference between 4GL and macrolanguage is that you have static vs. dynamic language.
2) DoSubL() to the rescue:
All the best
Bart
Thank you, Bart, for your very informative comment. I love learning from my readers! Frankly, I did not know that dosubl() function can be used with %sysfunc(). Now I know, and that opens up a whole new dimension in developing macro functions. While this post was written at the introductory level, the technique you brought up deserves separate consideration.