Ten "one-liners" that create test matrices for statistical programmers


You've had a long day. You've implemented a custom algorithm in the SAS/IML language. But before you go home, you want to generate some matrices and test your program.

If you are like me, you prefer a short statement; one line would be best. However, you also want the flexibility to create large matrices to test the performance of the algorithm. And you'd like the matrices to be simple so that you can determine whether the algorithm is working and debug it if necessary.

Last week's article on how to generate a large correlation matrix caused me to think about other quick and easy ways to generate matrices of an arbitrary size. Over the years I've developed favorite "test matrices" that I use to test algorithms, to write examples for my blog, or to answer questions on a discussion forum.

Here are ten one-line statements that generate numerical and character matrices of an arbitrary size. Most matrices have n rows and p columns. However, some functions generate square matrices, which are n x n.

  1. Constant matrix: The J function enables you to create a matrix with constant values. The syntax is
    const = j(n, p, value);
  2. Constant by row or column: The ROW and COL functions enable you to create a matrix where the ith row or column has the value i. The syntax (shown for rows; substitute COL for ROW to number the columns) is
    rows = row(j(n, p));
  3. Sequential and nonrepeating: The SHAPE function in conjunction with the index creation operator (:) enables you to create a matrix that contains sequential values. SHAPE fills the matrix row by row; the SHAPECOL function fills it column by column. The syntax is
    seq = shape(1:n*p, n);
  4. Periodic: The MOD function returns the remainder when a number is divided by some value. When combined with the previous example, you obtain matrices for which the values repeat in a periodic manner. The syntax is
    mod = mod(shape(1:n*p, n), value);
  5. Symmetric matrices: The SQRSYM and SQRVECH functions enable you to create square symmetric matrices from a vector of values. The syntax is
    sym = sqrsym(1:n*(n+1)/2);
  6. Diagonally banded: The TOEPLITZ function enables you to create square banded matrices. The syntax is
    band = toeplitz(n:1);
  7. Magic squares: The MAGIC function (which requires SAS/IML 12.3) creates square matrices for which the sums of the rows, columns, and diagonals are equal. The syntax is
    magic = magic(n);
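  8. Random integers: The SAMPLE function enables you to generate a random sample from a finite set. By default, sampling is with replacement. The second argument, p//n, requests n samples of size p, so the result is an n x p matrix. The syntax is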
    samp = sample(1:value, p//n);
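  9. Random values from a standard distribution: The RANDFUN function enables you to generate a matrix of random values from a standard distribution. The first argument, n//p, gives the number of rows and columns, in that order (the reverse of the SAMPLE convention). The syntax is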
    rand = randfun(n//p, "Normal");
  10. Character matrices: Many of the previous matrices can be modified to create character matrices. For any integer matrix, you can apply the MOD function to create integers in the range 1–26 that can be mapped to letters. The index creation operator (:) supports letters, so you can create a vector of all uppercase letters by using the syntax "A":"Z". Also, the SAMPLE function can create a random sample directly from any set of character values. The syntax is
    sampC = sample("A":"Z", p//n);
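
To make the deterministic one-liners concrete, here is a small sketch with dimensions tiny enough to check by hand. The names per and m are my own choices for this illustration, and the values in the comments are what I would expect from the fill conventions (SHAPE and SQRSYM work by rows, SHAPECOL and SQRVECH by columns), not output copied from a SAS session.

```sas
proc iml;
n = 2;  p = 3;  value = 3;

seq  = shape(1:n*p, n);        /* filled by rows:    {1 2 3, 4 5 6} */
seq2 = shapecol(1:n*p, n);     /* filled by columns: {1 3 5, 2 4 6} */
per  = mod(seq, value);        /* remainders 0..2:   {1 2 0, 1 2 0} */
rows = row(seq);               /* row numbers:       {1 1 1, 2 2 2} */
cols = col(seq);               /* column numbers:    {1 2 3, 1 2 3} */

m = 3;                         /* size of the square examples */
symL = sqrsym(1:m*(m+1)/2);    /* lower triangle sequential by rows: {1 2 4, 2 3 5, 4 5 6} */
symU = sqrvech(1:m*(m+1)/2);   /* upper triangle sequential by rows: {1 2 3, 2 4 5, 3 5 6} */
band = toeplitz(m:1);          /* constant diagonals: {3 2 1, 2 3 2, 1 2 3} */

print seq, seq2, per, rows, cols, symL, symU, band;
quit;
```

Because every entry is predictable, a wrong value in the output of your own algorithm is easy to trace back to a specific row and column.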

The following SAS/IML program generates examples of each of these matrices. You can change the values of n and p to create matrices of any size.

proc iml;
call randseed(12345);
n = 4;
p = 6;
value = 3;
/* constant and sequential matrices */
const = j(n, p, value);
rows = row(j(n, p));
cols = col(j(n, p));
seq = shape(1:n*p, n);
seq2 = shapecol(1:n*p, n);
mod = mod(shape(1:n*p, n), value);
/* square matrices */
symU = sqrvech(1:n*(n+1)/2);
symL = sqrsym(1:n*(n+1)/2);
band = toeplitz(n:1);
magic = magic(n);
/* random matrices */
samp = sample(1:value, p//n);
rand = randfun(n//p, "Normal");
/* character matrices */
letters = "A":"Z";
sampC = sample(letters, p//n);
idx = 1 + mod(0:n*p-1, ncol(letters));  /* values 1-26 */
modC = shape(letters[idx], n);
print const, rows, cols, seq, seq2, mod,
      symU, symL, band, magic, samp, rand, 
      sampC, modC;
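
Test matrices are most useful when you know what properties they should have. The following is a minimal sketch, not a definitive test harness, that checks two of those properties and then times a computation on a larger matrix. The variable names (isSymL, isMagic, target, elapsed, and so on) are hypothetical, and the matrix product is only a stand-in for whatever algorithm you actually want to time.

```sas
proc iml;
n = 4;

/* symmetric test matrices should equal their transposes */
symL = sqrsym(1:n*(n+1)/2);
symU = sqrvech(1:n*(n+1)/2);
isSymL = all(symL = t(symL));
isSymU = all(symU = t(symU));

/* rows, columns, and both diagonals of an n x n magic square
   built from 1..n*n should each sum to n(n*n + 1)/2 */
M = magic(n);
target  = n*(n*n + 1)/2;
rowSums = M[ ,+];                      /* column vector of row sums */
colSums = M[+, ];                      /* row vector of column sums */
diagSum = trace(M);                    /* main diagonal */
antiSum = sum(vecdiag(M[ ,n:1]));      /* anti-diagonal: reverse the columns first */
isMagic = all(rowSums = target) & all(colSums = target)
          & (diagSum = target) & (antiSum = target);
print isSymL isSymU isMagic;

/* time a placeholder computation on a larger test matrix */
n = 1000;
A = toeplitz(n:1);
t0 = time();
B = A * A;                             /* replace with the algorithm under test */
elapsed = time() - t0;
print elapsed;
quit;
```

If the one-liners behave as described above, each of the three flags should be 1. The elapsed time depends on your hardware, so treat it as a relative measure when you compare runs with different values of n.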

Do you have a favorite matrix that you use to test your programs? Leave a comment.


About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
