If you obtain data from web sites, social media, or other unstandardized data sources, you might not know the form of dates in the data. For example, the US Independence Day might be represented as "04JUL1776", "07/04/1776", "Jul 4, 1776", or "July 4, 1776." Fortunately, the ANYDTDTE informat makes it

## Tag: **Reading and Writing Data**

In the SAS/IML language, you can read data from a SAS data set into a set of vectors (each with their own name) or into a single matrix. Beginning programmers might wonder about the advantages of each approach. When should you read data into vectors? When should you read data

Dear Rick, I have a data set with 1,001 numerical variables. One variable is the response, the others are explanatory variable. How can I read the 1,000 explanatory variables into an IML matrix without typing every name? That's a good question. You need to be able to perform two sub-tasks:

I often blog about the usefulness of vectorization in the SAS/IML language. A one-sentence summary of vectorization is "execute a small number of statements that each analyze a lot of data." In general, for matrix languages (SAS/IML, MATLAB, R, ...) vectorization is more efficient than the alternative, which is to

Many people know that the SAS/IML language enables you to read data from and write results to multiple SAS data sets. When you open a new data set, it is a good programming practice to close the previous data set. But did you know that you can have two data

Do you have dozens (or even hundreds) of SAS data sets that you want to read into SAS/IML matrices? In a previous blog post, I showed how to iterate over a series of data sets and analyze each one. Inside the loop, I read each data set into a matrix

One of my favorite features of SAS/IML 12.1 (released with 9.3m2) is that the USE and CLOSE statements support reading data set names that are specified in a SAS/IML matrix. The IMLPlus language in SAS/IML Studio has supported this syntax since the early 2000s, so I am pleased that this

SAS has several kinds of special data sets whose contents are organized according to certain conventions. These special data sets are marked with the TYPE= data set attribute. For example, the CORR procedure can create a data set with the TYPE=CORR attribute. You can decipher the structure of the data

A SAS/IML user on a discussion forum was trying to read data into a SAS/IML matrix, but the data was so large that it would not fit into memory. (Recall that SAS/IML matrices are kept in RAM.) After a few questions, it turned out that the user was trying to

Did you know that you can index into SAS/IML matrices by using unique strings that you assign via the MATTRIB statement? The MATTRIB statement associates various attributes to a matrix. Usually, these attributes are only used for printing, but you can also use the ROWNAME= and COLNAME= attributes to subset

Many SAS procedures can produce ODS statistical graphics as naturally as they produce tables. Did you know that it is possible to obtain the numbers underlying an ODS statistical graph? This post shows how. Suppose that a SAS procedure creates a graph that displays a curve and that you want

I have blogged about three different SAS/IML techniques that iterate over categories and process the observations in each category. The three techniques are as follows: Use a WHERE clause on the READ statement to read only the observations in the ith category. This is described in the article "BY-group processing

One of the first skills that a beginning SAS/IML programmer learns is how to read data from a SAS data set into SAS/IML vectors. (Alternatively, you can read data into a matrix). The beginner is sometimes confused about the syntax of the READ statement: do you specify the names of

Covariance, correlation, and distance matrices are a few examples of symmetric matrices that are frequently encountered in statistics. When you create a symmetric matrix, you only need to specify the lower triangular portion of the matrix. The VECH and SQRVECH functions, which were introduced in SAS/IML 9.3, are two functions

The SAS/IML READ statement has a few convenient features for reading data from SAS data sets. One is that you can read all variables into vectors of the same names by using the _ALL_ keyword. The following DATA steps create a data set called Mixed that contains three numeric and

I got an email asking the following question: In the following program, I don't know how many variables are in the data set A. However, I do know that the variable names are X1–Xk for some value of k. How can I read them all into a SAS/IML matrix when

Being able to reshape data is a useful skill in data analysis. Most of the time you can use the TRANSPOSE procedure or the SAS DATA step to reshape your data. But the SAS/IML language can be handy, too. I only use PROC TRANSPOSE a few times per year, so

This article shows how to randomly access data in a SAS data set by using the READ POINT statement in SAS/IML software. I have previously discussed how to use the READ NEXT and READ CURRENT statements to sequentially access each observation in a SAS data set from PROC IML. Reading

The most common way to read observations from a SAS data set into SAS/IML matrices is to read all of the data at once by using the ALL clause in the READ statement. However, the READ statement also has options that do not require holding all of the observations in

In a previous post, I showed how to read data from a SAS data set into SAS/IML matrices or vectors. This article shows the converse: how to use the CREATE, APPEND, and CLOSE statements to create a SAS data set from data stored in a matrix or in vectors. Creating

Statistical programmers can be creative and innovative. But when it comes to choosing names of variables, often x1, x2, x3,... works as well as any other choice. In this blog post, I have two tips that are related to constructing variable names of the form x1, x2,..., xn. Both tips

As Cat Truxillo points out in her recent blog post, some SAS procedures require data to be in a "long" (as opposed to "wide") format. Cat uses a DATA step to convert the data from wide to long format. Although there is nothing wrong with this approach, I prefer to

Sample covariance matrices and correlation matrices are used frequently in multivariate statistics. This post shows how to compute these matrices in SAS and use them in a SAS/IML program. There are two ways to compute these matrices: Compute the covariance and correlation with PROC CORR and read the results into

Often, the first step of a SAS/IML program is to use the USE, READ, and CLOSE statements to read data from a SAS data set into a vector or matrix. There are several ways to read data: Read variables into vectors of the same name. Read one or more variables

My mother taught me to put things away when I'm finished using them. She doesn't use a computer, but if she did, I know that she'd approve of this tip from my book: Tip: Always close your files and data sets when you are finished reading or writing them. In