A frequently performed task in data analysis is identifying all the observations in a data set that satisfy certain conditions. For example, you might want to identify all of the female patients in your study or to identify all patients whose systolic blood pressure is greater than 140 mm Hg.

Novice programmers often use loops to find observations that satisfy a criterion. Don't do it! In SAS/IML software, there is almost never a good reason to loop over all observations. Instead, use the LOC function.

The following statements use the LOC function to identify the patients mentioned earlier:

```
f = loc(gender = "female");
highBP = loc(systolic > 140);
```

The LOC function is the most useful function in the SAS/IML language that DATA step programmers have never heard of. The LOC function finds the location of nonzero elements in a vector or matrix. Using the LOC function is much faster than writing a loop.
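To see what "the location of nonzero elements" means in practice, here is a minimal sketch that uses a small made-up vector. The names `v`, `idx`, `b`, `big`, and `bigLoop` are illustrative and are not part of the examples in this article. A comparison such as `v > 1` produces a vector of zeros and ones, and LOC returns the positions of the ones. The loop is included only for contrast.

```
proc iml;
v = {0 3 0 -1 2};              /** made-up data for illustration **/

idx = loc(v);                  /** positions of nonzero elements: 2 4 5 **/
b = (v > 1);                   /** 0/1 indicator vector: 0 1 0 0 1 **/
big = loc(b);                  /** same as loc(v > 1): 2 5 **/

/** the loop that LOC replaces (shown for contrast only) **/
bigLoop = j(1, ncol(v), .);    /** preallocate space for the indices **/
count = 0;
do i = 1 to ncol(v);
   if v[i] > 1 then do;
      count = count + 1;
      bigLoop[count] = i;
   end;
end;
bigLoop = bigLoop[1:count];    /** count is 2 here; keep the filled positions: 2 5 **/

print idx, big, bigLoop;
quit;
```

The one-line LOC call and the ten-line loop produce the same indices, which is the main argument for preferring LOC.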

The LOC function returns a row vector that contains the indices of the elements that satisfy the specified condition. You can use the indices to subset the data.
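One detail is worth handling when you subset: if no element satisfies the condition, LOC returns an empty matrix, and using an empty matrix as a subscript is an error. The following sketch uses made-up blood-pressure values (the names `systolic`, `idx`, and `high` are illustrative) and checks the number of columns before subsetting:

```
proc iml;
systolic = {118, 152, 135, 147};     /** made-up values, in mm Hg **/

idx = loc(systolic > 140);           /** row vector of indices: 2 4 **/
if ncol(idx) > 0 then do;            /** is there at least one match? **/
   high = systolic[idx];             /** subset of values: 152, 147 **/
   print idx, high;
end;
else print "No observations satisfy the condition.";
quit;
```

If you change the cutoff to a value that no observation exceeds, such as 200, the ELSE branch runs instead of the subsetting statements.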

### Forming Subsets

For example, suppose you have data for some famous witches and wizards in literature:

```
data MagicUsers;
infile datalines dsd;
length Name $11 Profession $7 Source $20;
input Name Profession Power Source;
datalines;
Morgana, Witch, 7, Arthurian Legend
Merlin, Wizard, 10, Arthurian Legend
Gryffindor, Wizard, 8, Harry Potter Books
Hufflepuff, Witch, 8, Harry Potter Books
Ravenclaw, Witch, 8, Harry Potter Books
Slytherin, Wizard, 8, Harry Potter Books
Glinda, Witch, 5, Oz Books
Elphaba, Witch, 6, Oz Books
Diggs, Wizard, 1, Oz Books
;
```

The LOC function can help you determine which witch is which. The following statements determine which names correspond to witches:

```
proc iml;
use MagicUsers;
read all var {Name Profession Power};
close MagicUsers;

ndx = loc(Profession="Witch");   /** find the indices for witches **/
Names = Name[ndx];               /** subset the names of the witches **/
```

Similarly, if you want to compute the average power of these witches, you can form a subset of the Power variable and compute the mean power of the witches in the subset:

```
WitchPower = Power[ndx];         /** subset the power variable **/
AvgPower = WitchPower[:];        /** compute the subset's average **/
print ndx[colname=Names], AvgPower;
```
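The same pattern extends to compound criteria. As a sketch, the following statements find the witches whose power exceeds 6 by combining two comparisons with the elementwise AND operator (`&`). To keep the example self-contained, the vectors are typed in directly rather than read from the MagicUsers data set. The cutoff of 6 and the names `strong`, `StrongNames`, `StrongPower`, and `AvgStrong` are chosen only for illustration.

```
proc iml;
/** same values as the MagicUsers data, entered as column vectors **/
Name = {"Morgana", "Merlin", "Gryffindor", "Hufflepuff", "Ravenclaw",
        "Slytherin", "Glinda", "Elphaba", "Diggs"};
Profession = {"Witch", "Wizard", "Wizard", "Witch", "Witch",
              "Wizard", "Witch", "Witch", "Wizard"};
Power = {7, 10, 8, 8, 8, 8, 5, 6, 1};

/** both conditions must hold: indices 1 4 5 **/
strong = loc((Profession = "Witch") & (Power > 6));

if ncol(strong) > 0 then do;          /** guard against an empty result **/
   StrongNames = Name[strong];        /** Morgana, Hufflepuff, Ravenclaw **/
   StrongPower = Power[strong];       /** 7, 8, 8 **/
   AvgStrong = StrongPower[:];        /** 23/3, about 7.67 **/
   print StrongNames StrongPower, AvgStrong;
end;
quit;
```

Using the OR operator (`|`) in place of `&` would instead select observations that satisfy either condition.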

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.