Macros and loops in the SAS/IML language


I am not a big fan of the macro language, and I try to avoid it when I write SAS/IML programs. I find that programs that contain many macros are hard to read and debug. Furthermore, the SAS/IML language supports loops and indexing, so many macro constructs can be replaced by standard SAS/IML syntax.

Nevertheless, many SAS customers use macro constructs as part of their daily SAS programming tasks, and that practice often continues when they write SAS/IML programs. A customer recently asked a question about the macro language that required knowledge of the way that macro variables are handled within a SAS/IML loop. This post shares my response.

Here's the crux of the customer's question. Run the following SAS/IML program and see if you can understand why it behaves as it does:

proc iml;
i = 7;
call symputx("j", i);    /* 1. Put value of i into macro variable j */
y1 = &j;                 /* 2. Assign y1 the value of &j            */
print y1;                /* success! */
 
y = j(1,4,.);
do i = 1 to ncol(y);     /* 3. Start processing the DO block of statements */
   call symputx("j", i); /* 4. Put value of i into macro variable j */
   y[i] = &j;            /* 5. Hmmmm, what does this do inside the loop? */
end;
print y;                 /* Not what you might expect? */

As the output shows, the first use of the macro variable (outside the DO loop) works as expected: y1 is 7. The second does not: every element of y is also 7. The customer wanted to know why the elements of y are not set to 1, 2, 3, 4 within the loop.

The key point to remember about macro variables is that SAS code never sees them. Macro variables are evaluated by the macro preprocessor at parse time, not at run time. The SAS/IML code never sees &j, only the constant value that the preprocessor substitutes for &j.

It is also important to remember that PROC IML is an interactive procedure. (The "I" in IML stands for interactive!) Each statement or block of statements is parsed as it is encountered, as opposed to the DATA step, which parses the entire program before beginning execution.

Let's examine the program step-by-step to understand why the first construct works but the second does not. The following steps refer to the numbers in the program comments:

  1. The value of the SAS/IML scalar i is copied (as text) into the macro variable j.
  2. The statement is encountered. The value of the macro variable j is substituted by the macro preprocessor. Then the statement is executed. The SAS/IML variable y1 is assigned the value 7.
  3. A DO loop is encountered by the SAS/IML parser. The parser finds the matching END statement and proceeds to parse the entire body of the loop in order to check for syntax errors. This parsing phase occurs exactly one time. Because the block of statements contains a macro variable, the macro preprocessor substitutes the value of the macro variable j, which is 7.
  4. For each iteration, the value of the SAS/IML scalar i is copied (as text) into the macro variable j.
  5. For each iteration, the ith element of the y vector is assigned the value 7. In particular, this statement does not contain a reference to the macro variable j.

To the casual reader of the program, it looks like &j will have a different value during each step of the iteration. But it doesn't. The expression &j is resolved at parse time. SAS/IML parses the entire body of the DO loop once, before any execution occurs, and at parse time the expression &j is 7.
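To make the substitution concrete, here is a sketch of the loop as the SAS/IML parser effectively receives it, assuming the macro variable j holds 7 when the DO block is parsed (as it does in the program above). It is an illustration of the explanation, not output captured from SAS:

```sas
proc iml;
y = j(1,4,.);
do i = 1 to ncol(y);
   call symputx("j", i);   /* still runs on every iteration and updates the macro variable */
   y[i] = 7;               /* the macro reference was replaced by the constant 7 before execution */
end;
print y;                   /* every element is 7 */
quit;
```

The CALL SYMPUTX statement does its job on each pass, but nothing in the loop body reads the macro variable again, so the updates have no effect on y.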

There is a way to get what the customer wants. The SYMGET function retrieves the value of a macro variable at run time. SYMGET returns a character string, so the NUM function is used to convert it to a number. Therefore, the following statements fill the vector y with the values 1 through 4:

do i = 1 to ncol(y);
   call symputx("j", i);
   y[i] = num(symget("j"));  /* get macro value at run time */
end;
print y;                     /* Yes! This is what we want! */
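For a side-by-side view of the two behaviors, the following sketch fills one vector by using a macro reference and another by using SYMGET inside the same loop. The vector names atParse and atRun are made up for this illustration, and the comments describe what the explanation above predicts:

```sas
proc iml;
call symputx("j", 7);                /* macro variable j is 7 before the loop is parsed */
atParse = j(1,4,.);
atRun   = j(1,4,.);
do i = 1 to ncol(atRun);
   call symputx("j", i);             /* update the macro variable on each iteration */
   atParse[i] = &j;                  /* resolved once, when the DO block is parsed */
   atRun[i]   = num(symget("j"));    /* looked up again on each iteration */
end;
print atParse, atRun;                /* atParse is all 7s; atRun is 1 2 3 4 */
quit;
```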

For me, this blog post emphasizes three facts:

  • Always remember that macro substitution is done by a preprocessor, which operates at parse time.
  • The SAS/IML language parses an entire block of statements (between the DO and END statements) one time before executing the block.
  • Mixing macro code and SAS/IML statements can be confusing and hard to debug. When you have the option, use SAS/IML language features instead of relying on macro language constructs.
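As an illustration of the last point, here is a minimal macro-free sketch. It assumes that the goal is simply to fill the vector with the values of the loop index; the customer's real program presumably did more than this, so treat it as a pattern rather than a drop-in replacement:

```sas
proc iml;
y = j(1,4,.);
do i = 1 to ncol(y);
   y[i] = i;          /* use the loop index directly; no macro variable is involved */
end;
print y;

z = 1:4;              /* the index creation operator builds the same row vector without a loop */
print z;
quit;
```

Because everything here is ordinary SAS/IML syntax, there is no difference between what is parsed and what is executed.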

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

Comments

  1. Great explanation of a difficult concept to grasp. The overall theme of confusion between macro pre-processing and run time execution also exists with consumers of the Data Step, so this entry is useful for SAS users outside of PROC IML as well.

    Thanks!