Random access: How to read specific observations in SAS/IML software

This article shows how to randomly access data in a SAS data set by using the READ POINT statement in SAS/IML software. I have previously discussed how to use the READ NEXT and READ CURRENT statements to sequentially access each observation in a SAS data set from PROC IML.

Reading a Specified Observation: The READ POINT Statement

In SAS/IML software, you can directly access rows of data by using the READ POINT statement. (The usage is similar to the POINT= option in the SET statement of the SAS DATA step, except in the SAS/IML language you do not need to use a STOP statement.) The value after the POINT keyword can be a scalar or a matrix of values.
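For comparison, the DATA step usage mentioned above can be sketched as follows. This is a minimal illustration, not part of the original program: the output data set name (Subset), the ObsNum variable, and the list of row numbers are assumptions chosen for the example. Because the POINT= option bypasses the usual end-of-file detection, the STOP statement is what ends the step.

```sas
/** DATA step analog: read selected observations with POINT= **/
data Subset;                           /** hypothetical output data set **/
   do r = 139, 250, 80, 388, 185;      /** illustrative row numbers **/
      set sashelp.cars(keep=Make Model EngineSize) point=r;
      ObsNum = r;                      /** the POINT= variable is not written, so copy it **/
      output;
   end;
   stop;                               /** required: without STOP the step loops forever **/
run;
```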

In the following program, the USE statement opens the Sashelp.Cars data set for reading. The DO loop iterates five times. At each iteration, the variable r contains a valid row number. The READ POINT statement reads the value of the EngineSize variable for the specified observation into a scalar SAS/IML variable with the same name. You could then do something with that observation, such as predict the gas consumption for that vehicle.

proc iml;
/** show how to use READ POINT **/
p = {139, 250, 80, 388, 185};
use sashelp.cars; 
do i = 1 to 5; 
   r = p[i];
   read point r var {EngineSize};
   /** compute with this observation **/
end;

It is not strictly necessary to create the temporary variable, r. The READ statement could be written more concisely as follows:

   read point (p[i]) var {EngineSize};

Reading All Specified Observations at the Same Time

Actually, there is no need for the DO loop in the previous program. The SAS/IML language accepts a vector of values for the argument to the POINT option. In other words, you can ask for the values of the observations enumerated in p with a single statement:

/** get a vector of values **/
read point p var {EngineSize};
print p EngineSize;
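The same idea extends to several variables at once. The following sketch assumes that you want the selected rows in a single matrix, so it uses the INTO clause of the READ statement. The matrix name X and the choice of the Weight variable are illustrative assumptions. The CLOSE statement releases the data set when you are finished.

```sas
proc iml;
/** read two variables for the selected rows into one matrix **/
p = {139, 250, 80, 388, 185};
varNames = {"EngineSize" "Weight"};    /** illustrative choice of variables **/
use sashelp.cars;
read point p var varNames into X;      /** X has one row per element of p **/
close sashelp.cars;
print p X[colname=varNames];
quit;
```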

In conclusion, the READ POINT statement enables random access to observations in a data set. This means that you can read the observations in any order. This article also shows that you can read the observations one at a time within a loop, or all at once with a single statement. In general, random access is not as efficient as sequential access of a data set, so sequential access is preferred for most applications.
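As a rough sketch of the sequential alternative, and assuming that the variable fits in memory, you can read every observation once and then extract the rows of interest by subscripting the resulting vector. The names x and y are placeholders for this example.

```sas
proc iml;
/** sequential read, then select rows in memory **/
p = {139, 250, 80, 388, 185};
use sashelp.cars;
read all var {EngineSize} into x;      /** one sequential pass over the data **/
close sashelp.cars;
y = x[p];                              /** subscript by the row numbers **/
print p y;
quit;
```

For a small data set such as Sashelp.Cars, either approach is fine. The READ POINT approach is more attractive when the data set is large and you need only a few observations.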

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
