Children in primary school learn that every positive number has a real square root.
The number *x* is a square root of *s* if *x*^{2} = *s*.

Did you know that matrices can also have square roots? For certain matrices S, you can find another matrix X such that X*X = S. To give a very simple example, if S = a*I is a multiple of the identity matrix and a > 0, then X = ±sqrt(a)*I is a square root matrix.
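Here is a quick numerical sketch of that simple example, using Python/numpy purely for illustration:

```python
import numpy as np

a = 9.0
S = a * np.eye(3)            # S = a*I with a > 0
X = np.sqrt(a) * np.eye(3)   # X = sqrt(a)*I

# X*X recovers S, so X (and likewise -X) is a square root of S
print(np.allclose(X @ X, S))
print(np.allclose((-X) @ (-X), S))
```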

### A matrix square root

I'm going to restrict this article to real numbers, so from now on when I say "a number" I mean a real number and when I say "a matrix" I mean a real matrix. Negative numbers do not have real square roots, so it is not surprising that not every matrix has a square root. For example, the negative identity matrix (-I) does not have a square root matrix.

All positive numbers have square roots, and mathematicians, who love to generalize everything, have defined a class of matrices with properties that are reminiscent of positive numbers. They are called *positive definite matrices*, and they arise often in statistics because every nonsingular covariance and correlation matrix is symmetric and positive definite (SPD).
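You can check the SPD property numerically: a matrix is SPD when it equals its transpose and all its eigenvalues are positive. A small numpy sketch with a made-up correlation matrix:

```python
import numpy as np

# A (nonsingular) correlation matrix is symmetric positive definite
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 1.0]])

print(np.allclose(R, R.T))                 # symmetric
print(bool((np.linalg.eigvalsh(R) > 0).all()))  # all eigenvalues positive
```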

It turns out that if S is a symmetric positive definite matrix, then there exists a unique SPD matrix X (called the square root matrix) such that X^{2} = S. For a proof, see Golub and Van Loan, 3rd edition, 1996, p. 149.
Furthermore, the following iterative algorithm converges quadratically to the square root matrix (*ibid*, p. 571):

- Let *X*_{0} = *I* be the first approximation.
- Apply the formula *X*_{1} = (*X*_{0} + *S***X*_{0}^{-1}) / 2, where *X*^{-1} is the inverse matrix of *X*. The matrix *X*_{1} is a better approximation to sqrt(S) than *X*_{0}.
- Apply the formula *X*_{n+1} = (*X*_{n} + *S***X*_{n}^{-1}) / 2 iteratively until the process converges. Convergence is achieved when a matrix norm ∥*X*_{n+1} – *X*_{n}∥ is as small as you like.
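A minimal numpy sketch of these steps (my own illustrative translation; the matrix S is a small SPD example I made up):

```python
import numpy as np

def sqrtm_newton(S, max_iters=100, eps=1e-6):
    """Babylonian/Newton iteration X <- (X + S*inv(X))/2, starting at X0 = I."""
    X = np.eye(S.shape[0])
    for _ in range(max_iters):
        X_next = 0.5 * (X + S @ np.linalg.inv(X))
        if np.linalg.norm(X_next - X) < eps:   # converged
            return X_next
        X = X_next
    raise RuntimeError("did not converge")

S = np.array([[4.0, 1.0],
              [1.0, 3.0]])      # a small SPD matrix (illustrative)
X = sqrtm_newton(S)
print(np.allclose(X @ X, S, atol=1e-5))
```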

The astute reader will recognize this algorithm as the matrix version of the Babylonian method for computing a square root. As I explained last week, this iterative method implements Newton's method to find the roots of the (matrix) function f(X) = X*X - S.
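To sketch the connection: the derivative of f at X_{n} is the linear map E ↦ X_{n}E + EX_{n}, so a full Newton step solves a matrix equation for the update E. Under the assumption that X_{n} commutes with S (which holds when X_{0} = I), the equation has a closed-form solution:

```latex
f(X) = X^2 - S, \qquad
X_n E + E X_n = S - X_n^2 .
\]
If $X_n$ commutes with $S$, then $E = \tfrac{1}{2}(X_n^{-1} S - X_n)$
solves this equation, and the Newton update becomes
\[
X_{n+1} = X_n + E = \tfrac{1}{2}\left(X_n + S X_n^{-1}\right).
```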

### Compute a matrix square root in SAS

In SAS, the SAS/IML matrix language is used to carry out matrix computations. To illustrate the square root algorithm, let S be the 7x7 Toeplitz matrix that is generated by the vector {4 3 2 1 0 -1 -2}. I have previously shown that this Toeplitz matrix (and others of this general form) is SPD. The following SAS/IML program implements the iterative procedure for computing the square root of an SPD matrix:

```
proc iml;
/* Given an SPD matrix S, compute the square root matrix X such that X*X = S */
start sqrtm(S, maxIters=100, epsilon=1e-6);
   X = I(nrow(S));                    /* initial starting matrix */
   do iter = 1 to maxIters while( norm(X*X - S) > epsilon );
      X = 0.5*(X + S*inv(X));         /* Newton's method converges to square root of S */
   end;
   if norm(X*X - S) <= epsilon then
      return( X );
   else
      return( . );
finish;

S = toeplitz(4:-2);                   /* 7x7 SPD example */
X = sqrtm(S);                         /* square root matrix */
print X[L="sqrtm(S)" format=7.4];
```

The output shows the square root matrix. If you multiply this matrix with itself, you get the original Toeplitz matrix. Notice that the original matrix and the square root matrix can contain negative elements, which shows that "positive definite" is different from "has all positive entries."
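As a cross-check outside of SAS, the same computation can be reproduced with numpy (an illustrative sketch; the Toeplitz matrix is built directly from the generating vector):

```python
import numpy as np

# 7x7 symmetric Toeplitz matrix generated by {4 3 2 1 0 -1 -2}
v = [4, 3, 2, 1, 0, -1, -2]
S = np.array([[v[abs(i - j)] for j in range(7)] for i in range(7)], dtype=float)

X = np.eye(7)                          # initial guess X0 = I
for _ in range(100):                   # Newton iteration
    X = 0.5 * (X + S @ np.linalg.inv(X))
    if np.linalg.norm(X @ X - S) < 1e-6:
        break

print(np.allclose(X @ X, S, atol=1e-4))  # X*X recovers the Toeplitz matrix
print(bool((X < 0).any()))               # the square root also has negative entries
```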

### Functions of matrices

The square root algorithm can be thought of as a mapping that takes an SPD matrix and produces its square root matrix. Therefore it is an example of a function of a matrix. Nick Higham (2008) has written a book, *Functions of Matrices: Theory and Computation*, which covers many other matrix functions (exp(), log(), cos(), sin(), ...) and how to compute them accurately. SAS/IML supports the EXPMATTRIX function, which computes the exponential function of a matrix.
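For symmetric matrices there is a particularly clean way to define such functions: apply the scalar function to the eigenvalues. A small numpy sketch (my illustration of the spectral definition, not IML syntax):

```python
import numpy as np

def funm_sym(S, f):
    """Apply a scalar function f to a symmetric matrix S via S = Q*diag(lam)*Q^T."""
    lam, Q = np.linalg.eigh(S)
    return Q @ np.diag(f(lam)) @ Q.T

S = np.array([[4.0, 1.0],
              [1.0, 3.0]])            # a small SPD matrix (illustrative)
X = funm_sym(S, np.sqrt)              # matrix square root
E = funm_sym(S, np.exp)               # matrix exponential

print(np.allclose(X @ X, S))          # the spectral square root satisfies X*X = S
```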

The square root algorithm in this post is a simplified version of a more robust algorithm that has better numerical properties. Higham (1986), "Newton's Method for the Matrix Square Root" cautions that Newton's iteration can be numerically unstable for certain matrices. (For details, see p. 543, Eqns 3.11 and 3.12.) Higham suggests an alternate (but similar) routine (p. 544) that is only slightly more expensive but has improved stability properties.
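One well-known stable method in this family is the Denman-Beavers coupled iteration, which Higham analyzes; I am not claiming it is the exact routine on p. 544, but it illustrates the idea of coupling the iterates to improve stability. It iterates on a pair (Y, Z) that converges to sqrt(S) and sqrt(S)^{-1}:

```python
import numpy as np

def sqrtm_db(S, max_iters=100, eps=1e-6):
    """Denman-Beavers iteration: Y -> sqrt(S), Z -> inv(sqrt(S))."""
    Y = S.astype(float).copy()
    Z = np.eye(S.shape[0])
    for _ in range(max_iters):
        Y_next = 0.5 * (Y + np.linalg.inv(Z))
        Z_next = 0.5 * (Z + np.linalg.inv(Y))
        Y, Z = Y_next, Z_next
        if np.linalg.norm(Y @ Y - S) < eps:
            return Y
    raise RuntimeError("did not converge")

S = np.array([[4.0, 1.0],
              [1.0, 3.0]])            # a small SPD matrix (illustrative)
Y = sqrtm_db(S)
print(np.allclose(Y @ Y, S, atol=1e-5))
```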

I think it is very cool that the ancient Babylonian algorithm for computing square roots of numbers can be generalized to compute the square root of a matrix. However, notice that there is an interesting difference. In the Babylonian algorithm, you are permitted to choose any positive number to begin the square root algorithm. For matrices, the initial guess must commute with the matrix S (Higham, 1986, p. 540). Therefore a multiple of the identity matrix is the safest choice for an initial guess.