Understanding local and global variables in the SAS/IML language

The TV show Cheers was set in a bar "where everybody knows your name." Global knowledge of a name is appealing for a neighborhood pub, but not for a programming language. Most programming languages enable you to define functions that have local variables: variables whose names are known only inside the function. This article describes local and global variables in the SAS/IML language.

Scope: the environment where everybody knows your name

One of the features of the SAS/IML language is that you can create your own user-defined functions or subroutines that extend the capabilities of SAS. For brevity, this article discusses functions, but the same ideas apply to user-defined subroutines.

In computer science, the scope of a variable is the "context... in which a variable name... is valid and can be used." A variable inside a function usually has local scope, which means that the variable’s name is known inside the function, but not outside. Furthermore, modifying a local variable does not affect any variable outside the function. For example, the following SAS/IML program defines a variable y inside a function. The local variable is created when the function executes and vanishes when the function exits. Although a variable outside the function is also named y, the outer variable is not affected by running the function:

proc iml; 
start F1(x); /* a function with local variables */
   y = 2*x;                       /* y is local */
   print y[label="y inside function (local)"]; 
   return(1); 
finish;
 
y = 0; t=1:5;
v = F1(t);
print y[label="y outside function"];

The scope of the variables is shown in the following diagram. Three variables are known to the main program: y, t, and v. Inside the function, two names are known: x and y. The local variable named y is not related to the variable y in the main scope. They have the same name, but they live in different scopes.

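There are two ways to enable a variable inside a function to affect variables outside the function: pass the outside variable as an argument to the function, or declare the variable in the GLOBAL clause of the function definition. The next two sections describe each approach.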

Sometimes SAS/IML programmers ask whether there is a third alternative: a variable that is shared between several functions but is not global to the entire program. The answer is no. The SAS/IML language does not support namespaces or variables that are global to a namespace.

Parent variables: sharing the memory, but not the name

Because the SAS/IML language passes arguments by reference, modifying one of the function's arguments changes the value of the matrix that was passed in. The following example illustrates this:

start F2(x); /* a function that modifies its argument */
   x = 2*x;                       /* x is an argument */
   print x[label="x inside module (argument)"]; 
   return(2); 
finish;
 
y = 0; t=1:5;
v = F2(t);
print t[label="t at main scope"];

Notice that the variable t at main scope has a different name from the parameter x inside the module, but both variables share the same memory. Because x is an argument, changing x inside the function also changes t. See my previous article for more details about passing by reference. The behavior of the F2 function is summarized in the following diagram.
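If the caller's matrix must stay unchanged, one common pattern is to copy the argument into a local variable and modify only the copy. The following is a minimal sketch that assumes an active PROC IML session, as in the examples above. The names F2Copy, z, s, and w are hypothetical and chosen only for illustration:

```sas
/* assumes an active PROC IML session; F2Copy, z, s, w are illustrative names */
start F2Copy(x);   /* a function that leaves its argument alone */
   z = x;          /* z is local: a copy of the argument */
   z = 2*z;        /* modify the copy, not x */
   return(z);
finish;

s = 1:5;
w = F2Copy(s);
print s[label="s at main scope (unchanged)"];
print w[label="value returned by F2Copy"];
```

Because the assignment is to the local variable z rather than to the argument x, the vector s still contains 1, 2, ..., 5 after the call, and w contains the doubled values.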

Global variables: sharing the memory and the name

A variable has global scope if you include it in the GLOBAL clause of the START statement. A variable that has global scope can be read or modified inside the module. It corresponds to a variable of the same name that exists at main scope. The following example illustrates this:

start F3(x) GLOBAL(y);  /* a function that has a global variable */
   y = 2*x;                                       /* y is global */
   print y[label="y inside module (global)"]; 
   return(3); 
finish;
 
y = 0; t=1:5;
v = F3(t);
print y[label="y at main scope"];

In this example, the y vector is changed from within the F3 function because y is declared to be a global variable. The following diagram illustrates the behavior of the F3 function.
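Because the language has no namespaces, a value that several functions need to share must be a global variable. The following is a minimal sketch that assumes an active PROC IML session. The names g_scale, SetScale, and ApplyScale are hypothetical, and the g_ prefix is only a naming convention that makes global variables easy to spot:

```sas
/* assumes an active PROC IML session; all names here are illustrative */
g_scale = 1;                          /* create the variable at main scope */

start SetScale(s) GLOBAL(g_scale);    /* subroutine that writes the global */
   g_scale = s;
finish;

start ApplyScale(x) GLOBAL(g_scale);  /* function that reads the global */
   return( g_scale * x );
finish;

run SetScale(3);                      /* g_scale is now 3 at main scope */
z = ApplyScale(1:4);                  /* z = 3*(1:4) */
print g_scale z;
```

Both modules list g_scale in their GLOBAL clause, so they refer to the same main-scope variable. Any other module that lists the name can also read or change it, which is the price of not having namespaces.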

The role of global variables in SAS/IML programs

Although global variables are discouraged in computer science courses, they serve an important purpose in SAS/IML programming. Namely, when you write an optimization program in the SAS/IML language, the function that is optimized (called the objective function) must have exactly one argument. The optimizer modifies that argument vector until the objective function reaches an optimal value. Any other parameters to the objective function must be specified as global variables. The global variables are parameters that are held constant during the optimization process.
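As an illustration of that pattern, the following sketch minimizes a simple quadratic objective with the NLPNR (Newton-Raphson) subroutine. It is a standalone program under stated assumptions, not a recommended template: the names SqDist and g_target and the target values are hypothetical, and derivatives are left to the routine's finite-difference approximations:

```sas
proc iml;
/* objective function: squared distance from x to a fixed target.
   x is the only argument; the target is passed through the GLOBAL clause */
start SqDist(x) GLOBAL(g_target);
   d = x - g_target;
   return( ssq(d) );          /* scalar: sum of squared differences */
finish;

g_target = {1 2};             /* parameter held constant during optimization */
x0  = {0 0};                  /* initial guess */
opt = {0 0};                  /* opt[1]=0: minimize; opt[2]=0: no printed output */
call nlpnr(rc, xOpt, "SqDist", x0, opt);
print rc, xOpt;
quit;
```

The optimizer varies only x. The value of g_target stays fixed, so the reported solution should be close to the target vector. To solve the same problem for a different target, change g_target at main scope and call the optimizer again.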

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

6 Comments

  1. Rick:

    My students struggle with scope in SAS/IML, so this article will be useful to them. However, your final (global) example has a typo: your F3 function defines y=3*x, but your output and diagram depict the multiplier of x to be 2.

