A matrix is a convenient way to store an array of numbers. However, often you need to extract certain elements from a matrix. The SAS/IML language supports two ways to extract elements: by using subscripts or by using indices. Use subscripts when you are extracting a rectangular portion of a
Author
Sometimes you are writing a program that needs to find out whether a particular SAS product (like SAS/ETS, SAS/QC, or SAS/OR) is licensed. I was reminded of this fact when I wrote last week's blog post about how to create a map with PROC SGPLOT. Although the SGPLOT procedure is
Did you know that you can use the POLYGON statement in PROC SGPLOT to draw a map? The graph at the left shows the 48 contiguous states of the US, overlaid with markers that indicate the locations of major cities. The plot was created by using the POLYGON statement, which
In many procedures, the ID statement is used to identify observations by specifying an identifying variable, such as a name or a patient ID. In many regression procedures, you can specify multiple ID variables, and all variables are copied into output data sets that contain observation-wise statistics such as predicted
How much does this big pumpkin weigh? One of the cafeterias at SAS invited patrons to post their guesses on an internal social network at SAS. There was no prize for the correct guess; it was just a fun Halloween-week activity. I recognized this as an opportunity to apply the
In SAS, the DATA step and PROC SQL support mnemonic logical operators. The Boolean operators AND, OR, and NOT are used for evaluating logical expressions. The comparison operators are EQ (equal), NE (not equal), GT (greater than), LT (less than), GE (greater than or equal), and LE (less than or
Statistical programmers often need to evaluate complicated expressions that contain square roots, logarithms, and other functions whose domain is restricted. Similarly, you might need to evaluate a rational expression in which the denominator of the expression can be zero. In these cases, it is important to avoid evaluating a function
Sometimes I can't remember where I put things. If I lose my glasses or garden tools, I am out of luck. But when I can't remember where I put some data, I have SAS to help me find it. When I can remember the name of the data set, my
Every year near Halloween I write a trick-and-treat article in which I demonstrate a simple programming trick that is a real treat to use. This year's trick features two of my favorite functions, the CUSUM function and the LAG function. By using these function, you can compute the rows of
The FREQ procedure in SAS supports computing exact p-values for many statistical tests. For small and mid-sized problems, the procedure runs very quickly. However, even though PROC FREQ uses efficient methods to avoid unnecessary computations, the computational time required by exact tests might be prohibitively expensive for certain tables. If
Did you know that the FREQ procedure in SAS can compute exact p-values for more than 20 statistical tests and statistics that are associated with contingency table? Mamma mia! That's a veritable smorgasbord of options! Some of the tests are specifically for one-way tables or 2 x 2 tables, but many apply
How do you simulate a contingency table that has a specified row and column sum? Last week I showed how to simulate a random 2 x 2 contingency table when the marginal frequencies are specified. This article generalizes to random r x c frequency tables (also called cross-tabulations) that have the same marginal row
A recent question posted on a discussion forum discussed storing the strictly upper-triangular portion of a correlation matrix. Suppose that you have a correlation matrix like the following: proc iml; corr = {1.0 0.6 0.5 0.4, 0.6 1.0 0.3 0.2, 0.5 0.3 1.0 0.1, 0.4 0.2 0.1 1.0}; Every correlation
When modeling and simulating data, it is important to be able to articulate the real-life statistical process that generates the data. Suppose a friend says to you, "I want to simulate two random correlated variables, X and Y." Usually this means that he wants data generated from a multivariate distribution,
This article shows how to visualize a surface in SAS. You can use the SURFACEPLOTPARM statement in the Graph Template Language (GTL) to create a surface plot. But don't worry, you don't need to know anything about GTL: just copy the code in this article and replace the names of
Suppose that you are tabulating the eye colors of students in a small class (following Friendly, 1992). Depending upon the ethnic groups of these students, you might not observe any green-eyed students. How do you put a 0 into the table that summarizes the number of students who have each
You can use SAS to generate random integers between 1–10 or in the range 1–100. This article shows how to generate random integers as easily as Excel does. I was recently talking with some SAS customers and I was asked "Why can't SAS create an easy way to generate random
In a previous post I described how to simulate random samples from an urn that contains colored balls. The previous article described the case where the balls can be either of two colors. In that csae, all the distributions are univariate. In this article I examine the case where the
If not for probability theory, urns would appear only in funeral homes and anthologies of British poetry. But in probability and statistics, urns are ever present and contain colored balls. The removal and inspection of colored balls from an urn is a classic way to demonstrate probability, sampling, variation, and
You've had a long day. You've implemented a custom algorithm in the SAS/IML language. But before you go home, you want to generate some matrices and test your program. If you are like me, you prefer a short statement—one line would be best. However, you also want the flexibility to
Occasionally a SAS statistical programmer will ask me, "How can I construct a large correlation matrix?" Often they are simulating data with SAS or developing a matrix algorithm that involves a correlation matrix. Typically they want a correlation matrix that is too large to input by hand, such as a
Dear Rick, I have a data set with 1,001 numerical variables. One variable is the response, the others are explanatory variable. How can I read the 1,000 explanatory variables into an IML matrix without typing every name? That's a good question. You need to be able to perform two sub-tasks:
Last week I discussed ordinary least squares (OLS) regression models and showed how to illustrate the assumptions about the conditional distribution of the response variable. For a single continuous explanatory variable, the illustration is a scatter plot with a regression line and several normal probability distributions along the line. The
I've previously written about how to generate a sequence of evenly spaced points in an interval. Evenly spaced data is useful for scoring a regression model on an interval. In the previous articles the endpoints of the interval were hard-coded. However, it is common to want to evaluate a function
A friend who teaches courses about statistical regression asked me how to create a graph in SAS that illustrates an important concept: the conditional distribution of the response variable. The basic idea is to draw a scatter plot with a regression line, then overlay several probability distributions along the line,
Statistical programmers often have to use the results from one SAS procedure as the input to another SAS procedure. Because ODS enables you to you to create a SAS data set from any ODS table or graph, it is easy to obtain a data set that contains the value of
In my previous blog post, I showed how you can use SAS to program a "weaving" algorithm that takes an image, cuts it into strips, and weaves the strips together to create mathematical art. I used matrices and heat maps for the computations and visualization. At the end of the
An artist friend of mine recently created a beautiful abstract image and described the process on her blog. She says that "after painting my initial square, I cut it into strips and split them down the middle, then wove them together.... I had no idea when I started piecing these
The title of this blog post might seem strange, but I occasionally need to compute the number of digits in a number, usually because I am trying to stuff an integer value into a string. Each time, I have to derive the formula from scratch, so I am writing this
Ugh! Your favorite regression procedure just printed a warning to the SAS log. Something is wrong, and your attempt to fit a model to the data has not succeeded. A typical message is "WARNING: The validity of the model fit is questionable," perhaps followed by some additional diagnostic messages about