Reading SAS data sets


Often, the first step of a SAS/IML program is to use the USE, READ, and CLOSE statements to read data from a SAS data set into a vector or matrix. There are several ways to read data:

  • Read variables into vectors of the same name.
  • Read one or more variables of the same type (numeric or character) into a matrix.
  • Read all variables of a certain type into a matrix.

The Sashelp.Class data set is distributed with SAS software. The data set contains 19 observations, two character variables (NAME and SEX), and three numeric variables (AGE, HEIGHT, and WEIGHT). The following statements start PROC IML and open the data set for reading:

proc iml;
use Sashelp.Class;

Read Variables into Vectors

You can read a specified set of variables into vectors that have the same name as the variables. For example, the following statement creates the column vectors Sex and Height:

/** read specific variables into vectors **/
read all var {Sex Height};

The ALL keyword specifies which observations to read, not which variables: it requests every observation in the data set. Consequently, the preceding statement reads all 19 observations for the variables SEX and HEIGHT into the column vectors Sex and Height.

You can also specify the variable names in a character matrix:

varNames = {"Age" "Weight"};
read all var varNames;

The preceding READ statement reads all observations for the variables AGE and WEIGHT into the column vectors Age and Weight.
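
As a quick check, you can confirm the shape and contents of the vectors that were just created. The following is a minimal sketch that assumes the PROC IML session is still running, that Sashelp.Class is still open, and that the READ statements above have been submitted. The name dims is used only for illustration.

```sas
/* assumes Age and Weight were created by the READ statement above */
dims = nrow(Age) || ncol(Age);          /* a column vector has 19 rows and 1 column */
print dims[colname={"nRows" "nCols"}];

/* display the first five observations of each vector side by side */
print (Age[1:5]) (Weight[1:5]);
```

Because each variable is stored in its own vector, you can subscript or summarize one variable without touching the others.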

Read Variables into a Matrix

For regression and other multivariate analyses, it is convenient to read data from a set of variables into the columns of a matrix. Use the INTO keyword for this. For example, if XVarNames is a character matrix that contains the names of explanatory variables and YVarName is a character matrix that contains the name of a response variable, then the following statements read the specified variables into the matrices x and y:

XVarNames = {"Age" "Height"};
YVarName = {"Weight"};
read all var XVarNames into x;
read all var YVarName into y;
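
To illustrate why this layout is convenient, the following sketch computes least squares estimates for a regression of WEIGHT on AGE and HEIGHT. It assumes that x and y were created by the READ statements above. The names n, A, and b are chosen for illustration, and solving the normal equations directly is only one simple approach, not necessarily the most numerically robust one.

```sas
/* assumes x (19 x 2) and y (19 x 1) were created by the READ statements above */
n = nrow(x);
A = j(n, 1, 1) || x;                    /* design matrix: intercept column, then Age and Height */
b = solve(t(A)*A, t(A)*y);              /* solve the normal equations (A`A) b = A`y */
print b[rowname=("Intercept" || XVarNames)];
```

Because the explanatory variables already occupy the columns of x, adding an intercept requires only a single horizontal concatenation.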

Read All Numeric (or Character) Variables

If you want to read all numeric variables into a matrix, you can use the _NUM_ keyword. You can get the names of the numeric variables by specifying the COLNAME= option after the matrix name, as follows:

read all var _NUM_ into z[colname=NumNames];

The preceding statement reads the numeric variables AGE, HEIGHT, and WEIGHT into the matrix z and also creates a character matrix, NumNames, that contains the names of those variables. (To read all character variables, use the _CHAR_ keyword.)
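
The variable names returned by the COLNAME= option can be reused as column headers when you print the matrix. The following sketch assumes that z and NumNames were created by the READ statement above and that the data set is still open. The names c and CharNames are illustrative.

```sas
/* print the first three rows of z, labeled with the variable names */
print (z[1:3, ])[colname=NumNames];

/* read all character variables (NAME and SEX) into a character matrix */
read all var _CHAR_ into c[colname=CharNames];
print CharNames;
```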

When you are finished reading the data, remember to close the data set by using the CLOSE statement.
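
Putting the pieces together, a complete USE, READ, and CLOSE sequence might look like the following sketch. It is a standalone program that starts its own PROC IML session, so run it separately from the statements above.

```sas
proc iml;
use Sashelp.Class;                            /* open the data set for reading */
read all var {Sex Height};                    /* vectors named Sex and Height */
read all var _NUM_ into z[colname=NumNames];  /* all numeric variables in one matrix */
close Sashelp.Class;                          /* close the data set when finished */

print NumNames;                               /* the matrices remain available after CLOSE */
quit;
```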

For further details, see the "Language Reference" chapter in the SAS/IML User's Guide.


About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of PROC IML and SAS/IML Studio. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
