How to read data set variables into SAS/IML vectors


One of the first skills that a beginning SAS/IML programmer learns is how to read data from a SAS data set into SAS/IML vectors. (Alternatively, you can read data into a matrix.) The beginner is sometimes confused about the syntax of the READ statement: do you specify the names of the variables in the data set, or the names of the SAS/IML vectors that you are trying to create?

The answer is "yes." By default, SAS/IML creates vectors that have the same name as the data set variables that you specify on the READ statement. For example, if you want to read variables from the Sashelp.Class data set, use the VAR clause on the READ statement to specify the variable names, like so:

proc iml;
use Sashelp.Class;
read all var {sex height weight};

This code snippet creates three vectors (Sex, Height, and Weight) that each contain 19 rows, which is the number of observations in the data set. However, the question that I've been asked is "What exactly is that thingy between the curly braces?" Is it a list of vector names? Is it something else?

How the SAS/IML language treats character arrays

The answer is "It is a character vector that specifies the names of variables in the data set." The confusion arises because the "thingy between curly braces" doesn't have any quotation marks. However, in the SAS/IML language, unquoted character values inside curly braces are converted to upper-case strings. In other words, the following two statements are equivalent:

c = {sex height weight};    /* converted to upper case strings */
c = {"SEX" "HEIGHT" "WEIGHT"};

The first statement does not contain quotation marks, but the parser recognizes that it is an array of character values. Therefore, each character string is converted to upper case before it is stored in the vector c.
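A quick way to check this behavior is to compare the vectors directly. The following is a minimal sketch, written as a stand-alone PROC IML session; the names c1, c2, c3, same12, and same13 are made up for illustration. It assumes that character comparisons in SAS/IML are case sensitive, so the mixed-case quoted vector should not match the unquoted one.

```sas
proc iml;
c1 = {sex height weight};          /* unquoted: stored as upper-case strings */
c2 = {"SEX" "HEIGHT" "WEIGHT"};    /* quoted, already upper case */
c3 = {"Sex" "Height" "Weight"};    /* quoted: case is preserved as typed */

same12 = all(c1 = c2);             /* should be 1: every element matches */
same13 = all(c1 = c3);             /* should be 0: the case differs */
print c1, c3, same12 same13;
quit;
```

Only the unquoted form is converted; a quoted string keeps whatever case you type.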

Specifying variable names

Because SAS variable names are not case-sensitive, SAS software doesn't care how you specify the names of data set variables. Upper case, lower case, mixed case... it's all the same to SAS. Consequently, the original READ statement is equivalent to the following statement:

read all var {"SEX" "Height" "WeIgHt"}; /* names are not case sensitive */

In either case, the READ statement creates three vectors that have the same names as specified on the VAR clause.
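Because the VAR clause only needs a character vector of names, the names can also be stored in a matrix first and passed by name. The sketch below is a stand-alone session that shows this, along with the INTO clause for reading the same variables into a single matrix, as mentioned at the start of the article. The names varNames, X, xNames, n, and p are hypothetical and chosen only for this example.

```sas
proc iml;
varNames = {"Height" "Weight"};     /* character vector of data set variable names */
use Sashelp.Class;
read all var varNames;              /* creates the vectors Height and Weight */
read all var varNames into X[colname=xNames];  /* one matrix; xNames receives the column names */
close Sashelp.Class;

n = nrow(Height);                   /* number of observations read */
p = ncol(X);                        /* one column for each variable read */
print n p, xNames;
quit;
```

Storing the names in a vector is convenient when the list of variables is built at run time rather than typed as a literal.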

Since all specifications read the same variables, you might wonder what label appears when the PRINT statement is used to display a vector. Upper case? Mixed case? The answer is that the PRINT statement uses the same case as the name of the variable in the data set. For this example, the variables in the Sashelp.Class data are mixed case with a leading upper-case letter, as shown in the output of the following PRINT statement:

print sex height weight; /* print in same case as in data set */

For more details on how to read data into SAS/IML variables, see my article "Reading SAS data sets."


About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

2 Comments

  1. Alain Le Page on

    Hello Dr Wicklin,

    Imagine that I need to read two values from a SAS table and then add those two values. I wish to repeat this operation because I have many SAS tables to read.
    At the end, I will have the following matrix:

    v11 v12 v13
    v21 v22 v23
    v31 v32 v33
    .......and so on

    I have stored those values in an 18 x 3 array, but when I write this array to a data table, SAS puts all the values on one line instead of keeping the original structure.
    Do you have any suggestions for how to achieve this task?

    Regards,
