CALL EXECUTE made easy for SAS data-driven programming

8

In my prior posts (Data-driven SAS macro loops, Modifying variable attributes in all datasets of a SAS library, Automating the loading of multiple database tables into SAS tables), I presented various data-driven applications using SAS macro loops.

However, macro loops are not the only tools available in SAS for developing data-driven programs.

CALL EXECUTE is one of them. The CALL EXECUTE routine accepts a single argument that is a character string or character expression. The character expression is usually a concatenation of strings containing SAS code elements to be executed after they have been resolved. Components of the argument expression can be character constants, data step variables, macro variable reference, as well as macro references. CALL EXECUTE dynamically builds SAS code during DATA step iterations; that code executes after the DATA step’s completion outside its boundary. This makes a DATA step iterating through a driver table an effective SAS code generator similar to that of SAS macro loops.

However, the rather peculiar rules of the CALL EXECUTE argument resolution may make its usage somewhat confusing. Let’s straighten things out.

Argument string has no macro or macro variable reference

If an argument string to the CALL EXECUTE contains SAS code without any macro or macro variable references, that code is simply pushed out (of the current DATA step) and appended to a queue after the current DATA step. As the DATA step iterates, the code is appended to the queue as many times as there are iterations of the DATA step. After the DATA step completes, the code in the queue gets executed in the order of its creation (First In First Out).

The beauty of this process is that the argument string can be a concatenation of character constants (in single or double quotes) and SAS variables which get substituted with their values by CALL EXECUTE for each DATA step iteration. This will produce data-driven, dynamically generated SAS code just like an iterative SAS macro loop.

Let’s consider the following example. Say we need to load multiple Oracle tables into SAS tables.

Step 1. Creating a driver table

In order to make our process data-driven, let’s first create a driver table containing a list of the table names that needed to be extracted and loaded:

/* create a list of tables to extract & load */
libname parmdl '/sas/data/parmdata';
data parmdl.tablelist;
        length tname $8;
        input tname;
        datalines;
ADDRESS
ACCOUNT
BENEFIT
FINANCE
HOUSING
;

This program runs just once to create the driver table parmdl.tablelist.

Step 2. Loading multiple tables

Then, you can use the following data-driven program that runs each time you need to reload Oracle tables into SAS:

/* source ORACLE library */
libname oralib oracle path="xxx" schema="yyy" user="uuu"
 	PASSWORD="{SAS002}ABCDEFG12345678RTUR" access=readonly;
 
/* target SAS library */
libname sasdl '/sas/data/appdata';
 
/* driver table SAS library */
libname parmdl '/sas/data/parmdata';
 
data _null_;
   set parmdl.tablelist;
   call execute(cats(
      'data sasdl.',tname,';',
         'set oralib.',tname,';',
      'run;'));
run;

In order to concatenate the components of the CALL EXECUTE argument I used the cats() SAS function which returns a concatenated character string removing leading and trailing blanks.

When this program runs, the SAS log indicates that after the data _null_ step the following statements are added and executed:

NOTE: CALL EXECUTE generated line.
1   + data sasdl.ADDRESS;set oralib.ADDRESS;run;
2   + data sasdl.ACCOUNT;set oralib.ACCOUNT;run;
3   + data sasdl.BENEFIT;set oralib.BENEFIT;run;
4   + data sasdl.FINANCE;set oralib.FINANCE;run;
5   + data sasdl.HOUSING;set oralib.HOUSING;run;

In this example we use data _null_ step to loop through the list of tables (parmdl.tablelist) and for each value of the tname column a new data step gets generated and executed after the data _null_ step. The following diagram illustrates the process:

Diagram explaining CALL EXECUTE for SAS data-driven programming

Argument string has macro variable reference in double quotes

If an argument to the CALL EXECUTE has macro variable references in double quotes, they will be resolved by the SAS macro pre-processor during the DATA step compilation. Nothing unusual. For example, the following code will execute exactly as the above, and macro variable references &olib and &slib will be resolved to oralib and sasdl before CALL EXECUTE takes any actions:

%let olib = oralib;
%let slib = sasdl;
 
data _null_;
   set parmdl.tablelist;
   call execute (
      "data &slib.."!!strip(tname)!!';'
         "set &olib.."!!strip(tname)!!';'!!
      'run;'
   );
run;

Argument string has macro or macro variable reference in single quotes

Here comes the interesting part. If the argument to CALL EXECUTE has macro or macro variable references in single quotes, they still will be resolved before the code is pushed out of the DATA step, but not by the SAS macro pre-processor during the DATA step compilation as it was in the case of double quotation marks. Macro or macro variable references in single quotes will be resolved by CALL EXECUTE itself. For example, the following code will execute exactly as the above, but macro variable references &olib and &slib will be resolved by CALL EXECUTE:

%let olib = oralib;
%let slib = sasdl;
 
data _null_;
   set parmdl.tablelist;
   call execute('data &slib..'!!strip(tname)!!';'!!
                'set &olib..'!!strip(tname)!!';'!!
                'run;'
               );
run;

Timing considerations

CAUTION: If your macro contains some non-macro language constructs for assigning macro variables during run time, such as a CALL SYMPUT or SYMPUTX statement (in a DATA step) or an INTO clause (in PROC SQL), resolving those macro variable references by CALL EXECUTE will happen too soon, before your macro-generated code gets pushed out and executed. This will result in unresolved macro variables. Let’s run the following code that should extract Oracle tables into SAS tables as above, but also re-arrange column positions to be in alphabetical order:

%macro onetable (tblname);
   proc contents data=oralib.&tblname out=one(keep=name) noprint;
   run;
 
   proc sql noprint;
      select name into :varlist separated by ' ' from one;
   quit;
   %put &=varlist;
 
   data sasdl.&tblname;
      retain &varlist;
      set oralib.&tblname end=last nobs=n;
      if last then call symput('n',strip(put(n,best.)));
   run;
   %put Table &tblname has &n observations.;
%mend onetable;
 
data _null_;
   set parmdl.tablelist;
   call execute('%onetable('!!strip(tname)!!');');
run;

Predictably, the SAS log will show unresolved macro variable references, such as:

WARNING: Apparent symbolic reference VARLIST not resolved.
WARNING: Apparent symbolic reference N not resolved.
Table ADDRESS has &n observations.

SOLUTION: To avoid the timing issue when a macro reference gets resolved by CALL EXECUTE too soon, before macro variables have been assigned during macro-generated step execution, we can strip CALL EXECUTE of the macro resolution privilege. In order to do that, we can mask & and % characters using the %nrstr macro function, thus making CALL EXECUTE “macro-blind,” so it will push the macro code out without resolving it. In this case, macro resolution will happen after the DATA step where CALL EXECUTE resides. If an argument to CALL EXECUTE has a macro invocation, then including it in the %nrstr macro function is the way to go. The following code will run just fine:

data _null_;
   set parmdl.tablelist;
   call execute('%nrstr(%onetable('!!strip(tname)!!'));');
run;

When this DATA step runs, the SAS log indicates that the following statements are added and executed:

NOTE: CALL EXECUTE generated line.
1   + %onetable(ADDRESS);
2   + %onetable(ACCOUNT);
3   + %onetable(BENEFIT);
4   + %onetable(FINANCE);
5   + %onetable(HOUSING);

CALL EXECUTE argument is a SAS variable

The argument to CALL EXECUTE does not necessarily have to contain or be a character constant. It can be a SAS variable, a character variable to be precise. In this case, the behavior of CALL EXECUTE is the same as when the argument is a string in single quotes. It means that if a macro reference is part of the argument value it needs to be masked using the %nrstr() macro function in order to avoid the timing issue mentioned above.

In this case, the argument to the CALL EXECUTE may look like this:

arg = '%nrstr(%mymacro(parm1=VAL1,parm2=VAL2))';
call execute(arg);

Making CALL EXECUTE totally data-driven

In the examples above we used the tablelist driver table to retrieve values for a single macro parameter for each data step iteration. However, we can use a driver table not only to dynamically assign values to one or more macro parameters, but also to control which macro to execute in each data step iteration. The following diagram illustrates the process of totally data-driven SAS program:

Diagram explaining using CALL EXECUTE for SAS data-driven programming

Conclusion

CALL EXECUTE is a powerful tool for developing dynamic data-driven SAS applications. Hopefully, this blog post demonstrates its benefits and clearly explains how to avoid its pitfalls and use it efficiently to your advantage. I welcome your comments, and would love to hear your experiences with CALL EXECUTE.

Share

About Author

Leonid Batkhan

Leonid Batkhan, Ph.D. in Computer Science and Automatic Control Systems, has been a SAS user for more than twenty years. He came to work for SAS in 1995 and is currently a Senior Consultant with the SAS Federal Data Management and Business Intelligence Practice. During his career, Leonid has successfully implemented dozens of SAS applications and projects in various industries. All posts by Leonid Batkhan >>>

Related Posts

8 Comments

  1. Nice article!

    Sometimes I use this technique:

    /* _debug is a global macro variable */
    %if &_debug %then
    %do;
    filename _code "%sysfunc(pathname(savecode))/savedcode.sas";
    %end;
    %else %do;
    filename _code CATALOG "work.mycat.tmpsource.source";
    %end;

    data _null_;
    set mydata;
    file _code;
    ... put statements ...
    run;

    %include _code;

    filename _code clear;

    It is easier to review the generated code when things go wrong.

    Lex

    • Leonid Batkhan

      Thank you, Lex, for sharing your alternative approach. Could you please clarify why you store your generated SAS code in a file vs. catalog depending on a _debug mode (1/0)? Also, did you mean WORK when you placed savecode as an argument to the pathname() function?

      • It's for convenience. Sometimes you may want to review the code being generated. 'savecode' would be a library where I want this code to be saved. I could have used a complete or relative folder name, for example: filename _code "../code/savedcode.sas";
        Of course a relative path assumes that SAS starts in a specific folder.

        • Leonid Batkhan

          Thank you, Lex, for the clarification. I got it, you want to save your generated code in a location available to you outside of SAS as the WORK library will be wiped out when SAS session ends. But I am still not clear on storing your generated code in SAS catalog vs. file. Why do you prefer one way over another depending on debugging mode?

          • When I want to debug, I want to see the code outside of SAS. When I don't care about debugging, I store my code in a catalog, or I could have stored the code in a file in the WORK library. There is no strong preference here between file and catalog.

            • Leonid Batkhan

              Thank you again. As I understand it, file or catalog is not essential, what is important whether you store it in a permanent location or in a temporary location (e.g. WORK library) that gets wiped out at the end of SAS session.

  2. Hello Leonid, and thank you for covering my favorite toy in SAS! Look at this beauty, for example:
    filename dir pipe "dir /b &_data_raw.*.csv";

    data _null_;
    length filename $256;
    infile dir length= reclen;
    input filename varying256. reclen;

    call execute( cat( "PROC IMPORT DATAFILE= ""&_data_raw.", strip( filename ), """",
    " OUT= CRF.", scan( filename, 1, "." ),
    " DBMS= CSV REPLACE; GUESSINGROWS=3000; RUN;"
    )
    );
    run;

    The main obstacle to writing any generated code is, in my opinion, having a clear mental model of the code processing and understanding what data are available when. For example, if one puts CALL EXECUTE inside a macro, the macro is finished by the time of CALL EXECUTE() run so any additional SAS code generated by the macro will be read by SAS before any EXECUTE's code, so one cannot issue any finalization commands inside the macro, for example. You have another great example about SYMPUT'ting things too early for the macros to pick up.

    To help switching the mental focus, I typically write the generated code in ALL CAPS while the normal open code in lower case.

    BY groups help write more involved CALL EXECUTE code. For example, imagine you were provided a set of data with some silly suffixes attached to the variable names, e.g.
    FOO_111
    BAR_222
    while really all you need is FOO and BAR (true story). Now fun begins:

    data _null_;
    set sashelp.vcolumn;
    where libname = "CRF" and memtype= "DATA";
    by memname;

    if first.memname then
    call execute( cat( "DATA CRF.", left( memname ), ";",
    " SET CRF.", left( memname ), "; ",
    "RENAME "
    )
    );
    call execute( cat( name, " = ", scan( name, 1, "_" ), " " ) );

    if last.memname then
    call execute( "; RUN; " );
    run;

    so for each line in the dataset, it is just another BEFORE=AFTER pair for the normal RENAME operator what is preceded by DATA STEP opening for each new dataset and appended by semicolon and RUN at the dataset end.

    Another often-overlooked area to use CALL EXECUTE is COMPUTE blocks in PROC REPORT what might be worth a post on its own.

Leave A Reply

Back to Top