I often use the SAS/IML language for simulating data with certain known properties. In fact, I'm writing a book called *Simulating Data with SAS*. When I simulate repeated measurements (sometimes called *replicated data*), I often want to generate an ID variable that identifies which measurement is associated with which subject in a simulated study.

Depending on the analysis that you are conducting, there are two ways to structure the values in the ID variable: sorted by subject or sorted by "time."

To be specific, suppose that you have four patients in a study. Some measurement (for example, their weight) is taken every week for three weeks. You can order the data according to either of the following columns:

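| Sorted by patient | Sorted by week |
|:-:|:-:|
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 4 |
| 2 | 1 |
| 2 | 2 |
| 3 | 3 |
| 3 | 4 |
| 3 | 1 |
| 4 | 2 |
| 4 | 3 |
| 4 | 4 |

In the preceding table, each column lists the patient ID for 12 measurements. The first column would be appropriate for data that are sorted by patient. The second column would be used for data that are sorted by week.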

### Create ID vectors in SAS/IML software

One way to create ID vectors in SAS/IML software is to use the REPEAT function. The REPEAT function creates a matrix from an input vector by repeating the vector a specified number of times horizontally and vertically. For example, the expression `T(1:N)` is a column vector with `N` elements. You can create a matrix with `N` rows and `k` columns as follows:

    proc iml;
    N = 4;  k = 3;                /* N subjects, k repeated measurements */
    r = repeat(T(1:N), 1, k);     /* N x k matrix; row i contains the value i */
    print r;

Because PROC IML stores matrices in row-major order, the COLVEC function reads the matrix row by row, so each subject's ID appears `k` times in succession. You can therefore call the COLVEC function to create a column vector that contains the ID values sorted by subject, as shown in the first column of the table above. The following SAS/IML function encapsulates this idea:

    /* IDs sorted by subject: 1,1,...,1,2,2,...,2,...,N,N,...,N */
    start ReplID(N, numRepl);
       return( colvec(repeat(T(1:N), 1, numRepl)) );
    finish;

    Subject = ReplID(4, 3);
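To see what row-major storage means here, it can help to flatten both the matrix and its transpose. The following is a small sketch rather than part of the function above; the names `bySubject` and `byWeek` are illustrative only:

```sas
proc iml;
N = 4;  k = 3;
r = repeat(T(1:N), 1, k);    /* N x k matrix; row i contains the value i */
bySubject = colvec(r);       /* reads r row by row:    1,1,1,2,2,2,3,3,3,4,4,4 */
byWeek    = colvec(T(r));    /* reads T(r) row by row: 1,2,3,4,1,2,3,4,1,2,3,4 */
print r;
print bySubject byWeek;
quit;
```

Flattening `r` keeps each subject's repeated values together, whereas flattening the transpose cycles through the subjects once per repetition.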

In a similar way, if you do NOT transpose the expression `1:N`, then the REPEAT function returns a row vector that cycles through the values `1,2,...,N,1,2,...,N,...`. Once again, you can use the COLVEC function to convert that sequence of values into a column vector, as shown in the second column of the table above. The following SAS/IML function encapsulates this idea:

    /* IDs in repeating blocks: 1,2,...,N,1,2,...,N,... */
    start ReplIDBlock(N, numRepl);
       return( colvec(repeat(1:N, 1, numRepl)) );
    finish;

    Time = ReplIDBlock(4, 3);
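In practice, an ID column is usually paired with a companion index that records the week of each measurement. One way to get it, sketched here under the assumption of four patients and three weeks, is to call the other function with the arguments swapped. The variable names `Subject1`, `Week1`, `Subject2`, and `Week2` are hypothetical and are used only for this illustration:

```sas
proc iml;
start ReplID(N, numRepl);
   return( colvec(repeat(T(1:N), 1, numRepl)) );
finish;
start ReplIDBlock(N, numRepl);
   return( colvec(repeat(1:N, 1, numRepl)) );
finish;

N = 4;  k = 3;                    /* 4 patients, 3 weekly measurements */

/* Layout 1: rows sorted by patient */
Subject1 = ReplID(N, k);          /* 1,1,1,2,2,2,3,3,3,4,4,4 */
Week1    = ReplIDBlock(k, N);     /* 1,2,3,1,2,3,1,2,3,1,2,3 */

/* Layout 2: rows sorted by week */
Subject2 = ReplIDBlock(N, k);     /* 1,2,3,4,1,2,3,4,1,2,3,4 */
Week2    = ReplID(k, N);          /* 1,1,1,1,2,2,2,2,3,3,3,3 */

print Subject1 Week1 Subject2 Week2;
quit;
```

In both layouts, every (patient, week) pair appears exactly once; only the row order differs.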

I find that I use the first approach more often than the second.