Find the location of a run-time error in a user-defined function in SAS

Every programmer makes errors. Therefore, learning to debug a program is an important part of learning to program. Another skill is learning to decipher cryptic error messages, which can be as hard to interpret as hieroglyphs. One helpful skill is learning to navigate a "traceback" error. A traceback error message is displayed when there is a sequence of function calls that are involved in the error. This article discusses traceback errors in user-defined functions. In SAS, you can define functions in PROC FCMP and in PROC IML. PROC IML provides traceback information, but the DATA step does not.

Traceback in SAS IML for functions that are defined in the program

I am going to start with PROC IML because it enables you to define functions in the same program that runs the functions. This is in contrast to PROC FCMP and the DATA step, where you must first define and store the functions by using PROC FCMP and later load and use the functions in the DATA step. As you will see, PROC IML can provide more information about errors in functions that are defined locally.

The following program defines two functions. The first one is named MOD1. On Line 3, the function calls the second function, which is MOD2, and passes in the same parameter that was sent to MOD1. In MOD2, the function passes the parameter to the LOG function on Line 4. Outside of the functions, the program first calls MOD1 with the parameter +1, which does not cause an error. The program then calls MOD1 with the parameter -1, which causes an error inside the MOD2 function because LOG(-1) is an invalid call.

proc iml;
/* COLUMNS:
0        1         2         3         4
1234567890123456789012345678901234567890  */
start mod1(x);      /* Line 1 */
  y=1;              /* Line 2 */
  w=mod2(x);        /* Line 3 */
  sum = x + y + w;
  return sum;
finish;
 
/* this function produces an ERROR if x <= 0 */
start mod2(x);      /* Line 1 */
  a=1;              /* Line 2 */
  b=2;              /* Line 3 */
  c=log(x);         /* Line 4 */
  d=4;
  return a+b+c+d;
finish;
 
q1 = mod1(1);    /* no error */
q2 = mod1(-1);   /* produces an error */

Let's run this code through SAS Studio and display the SAS log that results:

84   proc iml;
NOTE: IML Ready
85   /* COLUMNS:
86   0        1         2         3         4
87   1234567890123456789012345678901234567890  */
88   start mod1(x);
88 !                     /* Line 1 */
89     y=1;
89 !                     /* Line 2 */
90     w=mod2(x);
90 !                     /* Line 3 */
91     sum = x + y + w;
92     return sum;
93   finish;
NOTE: Module MOD1 defined.
94   
95   /* this function produces as ERROR if x <= 0 */
96   start mod2(x);
96 !                     /* Line 1 */
97     a=1;
97 !                     /* Line 2 */
98     b=2;
98 !                     /* Line 3 */
99     c=log(x);
99 !                     /* Line 4 */
100    d=4;
101    return a+b+c+d;
102  finish;
NOTE: Module MOD2 defined.
103  
104  q1 = mod1(1);
104!                  /* no error */
105  q2 = mod1(-1);
ERROR: (execution) Invalid argument to function.           [Error msg: Line 1]
 operation : LOG at line 99 column 8                       [Error msg: Line 2]
 operands  : x                                             [Error msg: Line 3]
x      1 row       1 col     (numeric)                     [Error msg: Line 4]
        -1                                                 [Error msg: Line 5]
 statement : ASSIGN at line 99 column 3                    [Error msg: Line 6]
 traceback : module MOD2 at line 99 column 3               [Error msg: Line 7]
             module MOD1 at line 90 column 3               [Error msg: Line 8]
NOTE: Paused in module MOD2.                               [Error msg: Line 9]

The ERROR message appears just after Line 105 in the program source. (Use OPTIONS SOURCE if you do not see the source code in the log!) The bracketed text ("[Error msg: Line 1]") was added by me and does not appear in the actual SAS log. That added text corresponds with the explanations in the following list:

  1. There is a run-time error in an argument to a function.
  2. The function's name is LOG. The error occurs on line 99 of the program in the SOURCE of the log. This will not correspond to lines in your editor because SAS Studio submits hidden source code prior to submitting your program.
  3. The argument to the LOG function is a variable (or symbol) named x.
  4. The x symbol is a 1x1 numeric matrix.
  5. The value of the x matrix is -1.
  6. The error occurs during an assignment statement, which means it is of the form c=LOG(x).
  7. The error occurs in the module MOD2 on Line 99. This is the first line of the traceback.
  8. The previous module was called from the module MOD1. The call is on Line 90.
  9. Because of the error, the module MOD2 paused and did not return.

These lines tell the complete story of why and where the error occurred in the program. Specifically, items 6, 7, and 8 provide the traceback (sometimes called a trackback) for the sequence of calls that are involved in the error. The error message tells you that the call to LOG failed, that LOG is called inside MOD2, and that MOD2 is called inside MOD1. This is important information for debugging the problem because it tells you to investigate MOD1 and examine what parameters are used to call MOD1.
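Once the traceback has pointed you to the failing statement, one possible response is to guard the call. The following is a minimal sketch, not part of the original program: it assumes that returning a missing value is acceptable when the argument to LOG is not positive. The guard in MOD2 is hypothetical, and the right fix for your program might be different (for example, validating the parameter in MOD1 instead).

```sas
proc iml;
start mod1(x);
  y=1;
  w=mod2(x);
  sum = x + y + w;
  return sum;
finish;

/* hypothetical guarded variant of MOD2: returns a missing value
   instead of calling LOG with a nonpositive argument */
start mod2(x);
  a=1;
  b=2;
  if x <= 0 then return( . );   /* assumption: a missing result is acceptable */
  c=log(x);
  d=4;
  return a+b+c+d;
finish;

q1 = mod1(1);    /* same result as before */
q2 = mod1(-1);   /* LOG is never called; the missing value propagates */
print q1 q2;
quit;
```

With this guard, the call mod1(-1) should no longer trigger the error because the LOG call is skipped. The remaining examples in this article use the original, unguarded definition of MOD2.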

How tracebacks change for stored IML functions

Suppose instead that you store these functions and load and call them later. Often the modules have been stored in a previous SAS session, or even long ago by someone else at your company. In that case, SAS cannot provide as much information about the traceback. For example, it cannot provide tips such as "line 99" or "line 90" because that information is not stored (and it wouldn't be helpful to you anyway because the log is long gone).

Let's see what the traceback looks like if we store/load these functions and call them in the same way:

...
store module=(mod1 mod2);        /* store for future use */
QUIT;
 
proc iml;
load module=(mod1 mod2);         /* load from storage */
q1 = mod1(1);                    /* no error */
q2 = mod1(-1);                   /* produces an error */

Because SAS IML stores the function definitions in a compiled form, the error message has less information. The log can no longer report the line number of the statement in the source code that creates the error. However, it can still report a traceback and give you information about the reason for the error relative to the first line of each function definition. The relative line number is called the offset in the error message:

106  proc iml;
NOTE: IML Ready
107  load module=(mod1 mod2);
NOTE: Opening storage library WORK.IMLSTOR
108  q1 = mod1(1);
109  q2 = mod1(-1);
ERROR: (execution) Invalid argument to function.            [Error msg: Line 1]
 operation : LOG at offset   4 column   8                   [Error msg: Line 2]
 operands  : x                                              [Error msg: Line 3]
x      1 row       1 col     (numeric)                      [Error msg: Line 4]
        -1                                                  [Error msg: Line 5]
 statement : ASSIGN at offset   4 column   3                [Error msg: Line 6]
 traceback : module MOD2 at offset   4 column   3           [Error msg: Line 7]
             module MOD1 at offset   3 column   3           [Error msg: Line 8]
NOTE: Paused in module MOD2.                                [Error msg: Line 9]

The error message is almost the same, but now offsets are used instead of absolute line numbers:

  1. There is a run-time error in an argument to a function.
  2. The function's name is LOG. The error occurs on the fourth line of some function.
  3. The argument to the LOG function is a variable (or symbol) named x.
  4. The x symbol is a 1x1 numeric matrix.
  5. The value of the x matrix is -1.
  6. The error occurs during an assignment statement, which means it is of the form c=LOG(x).
  7. The error occurs on the fourth line of the MOD2 function.
  8. The previous module was called from the third line of the MOD1 function.
  9. Because of the error, the module MOD2 paused and did not return.

In structure, the two messages are very similar. Only the way that locations are reported differs. In the first example, absolute line numbers were used to identify the statements that were involved in the error. In the second example, relative line numbers (offsets) were used instead. To use an offset, you must go back to the original module definition and count down from the START statement until you reach the line that is involved in the error.

No tracebacks for FCMP functions

You can repeat this example for functions that are defined by using PROC FCMP and that are called from a DATA step program. In the following example, I named the functions FUNC1 and FUNC2, but otherwise the structure of the program is the same:

/* Nest functions in FCMP */
/* you can also define a function that depends on parameters */
proc fcmp outlib=work.funcs.traceb;
/* COLUMNS:
0        1         2         3         4
1234567890123456789012345678901234567890  */
   function func1(x);   /* Line 1 */
      y=1;              /* Line 2 */
      w=func2(x);       /* Line 3 */
      sum=x + y + w;
      return( sum );
   endsub;
 
   /* this function produces an ERROR if x <= 0 */
   function func2(x);   /* Line 1 */
      a=1;              /* Line 2 */
      b=2;              /* Line 3 */
      c=log(x);         /* Line 4 */
      d=4;
      return( a+b+c+d );
   endsub;
quit;
options cmplib=work.funcs;   /* DATA step will look here for unresolved functions */
 
data Test;
input x @@;
y = func1(x);
datalines;
3 2 1 0
;

The DATA step calls the FUNC1 function for several values of x. The call will fail when x = 0. When you run the SAS program, the log shows certain information about the error and the traceback. Again, I will augment the log by adding bracketed text that corresponds to items in a list.

NOTE: Function func2 saved to work.funcs.traceb.
NOTE: Function func1 saved to work.funcs.traceb.
NOTE: PROCEDURE FCMP used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
 
106  options cmplib=work.funcs;   /* DATA step will look here for unresolved functions */
107  
108  data Test;
109  input x @@;
110  y = func1(x);
111  datalines;
ERROR: An invalid argument is used in the function call in 
       function 'func2' in statement number 4 at line 7 column 8.    [Error msg: Line 1]
       The statement was:                                            [Error msg: Line 2]
    0     (7:8)      c = LOG( x=0 )                                  [Error msg: Line 3]
ERROR: Exception occurred during subroutine call.                    [Error msg: Line 4]
RULE:----+----1----+----2----+----3----+----4----+----5              [Error msg: Line 5]             
112  3 2 1 0                                                         [Error msg: Line 6]
x=0 y=. _ERROR_=1 _N_=4                                              [Error msg: Line 7]

  1. The first ERROR message tells you that there was an invalid argument in the function FUNC2. The error is in statement number 4. If you look at the definition of the function, you see that the fourth statement is c=log(x). However, the message also says 'line 7', which you might find confusing. The line number is bigger than the statement number because FCMP classifies comments as statements. You can learn more by examining how SAS stores FCMP functions: run the statements proc print data=work.funcs; run; to see the association between line numbers and statements.
  2. The second line is a label.
  3. The third line tells you that the error occurs in the call C=LOG(x) when x=0.
  4. The fourth line tells you that an exception (a domain error in a function causes a floating point exception) occurred during a subroutine call. I think it is referring to the FUNC2 function, but I am not sure.
  5. The fifth line is a column ruler. The RULE appears when the log is going to show you data in the DATALINES block of a DATA step. It helps you count column numbers in case you are reading data from certain columns.
  6. The sixth line is the DATALINES data that was read when the error occurred.
  7. The seventh line shows the values of the DATA step variables when the error occurred. The variables x and y are the data set variables. The _ERROR_ and _N_ variables are automatic variables.
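
The PROC PRINT step that is mentioned in item 1 can be submitted as a standalone step. This sketch assumes that the PROC FCMP step above has already run in the same SAS session, so that the WORK.FUNCS data set exists. The exact columns in the listing might differ between SAS releases, so no output is shown here.

```sas
/* display how SAS stores the FCMP functions in the OUTLIB= data set;
   assumes WORK.FUNCS was created by the earlier PROC FCMP step */
proc print data=work.funcs;
run;
```

Scan the listing for the rows that belong to FUNC2 to relate the reported line number to the statement c=log(x).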

Do you see anything that is missing? Yes! There is no traceback information. The error statements do not tell you that the error occurred while calling FUNC2 from FUNC1. It doesn't mention FUNC1 at all. If you have multiple functions that each call FUNC2, you might not know which call resulted in the error.
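
Because the log does not identify the caller, one possible workaround is to write your own "breadcrumb" to the log before the nested call. The following is a minimal sketch under stated assumptions, not a feature of FCMP: the PUT statement and the condition x <= 0 are additions that are not in the original program, and the sketch assumes that PUT output from an FCMP function is written to the log when the function is called from a DATA step.

```sas
proc fcmp outlib=work.funcs.traceb;
   function func1(x);
      y=1;
      /* hypothetical breadcrumb: record the caller before the nested call */
      if x <= 0 then put "FUNC1 is calling FUNC2 with x =" x;
      w=func2(x);
      sum=x + y + w;
      return( sum );
   endsub;

   /* unchanged from the original example */
   function func2(x);
      a=1;
      b=2;
      c=log(x);
      d=4;
      return( a+b+c+d );
   endsub;
quit;
options cmplib=work.funcs;   /* DATA step will look here for unresolved functions */

data Test;
input x @@;
y = func1(x);
datalines;
3 2 1 0
;
```

If the assumption holds, the breadcrumb line should appear in the log near the ERROR message for x = 0, which tells you which caller passed the invalid argument. If several functions call FUNC2, each caller would need its own breadcrumb.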

Summary

This article shows how to interpret a run-time error in the SAS log. This is an important skill for SAS programmers to learn. If the error occurs in a chain of nested function calls, SAS might output a traceback. A traceback reports the source of the error and also the functions that are involved in the chain of function calls that resulted in the error. This can be useful for debugging errors.

PROC IML provides traceback information for user-defined functions. You get more information if the functions are defined "inline" (as opposed to stored and loaded), so if you are debugging a nasty bug, you might try to define the relevant functions at the top of your IML program.

Unfortunately, the situation is not so nice for FCMP functions. As far as I know, you do not get any traceback information for errors that occur in nested function calls. You do, however, get information about the source of the error in the lowest-level function.

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
