Move beyond spreadsheets to data mining, forecasting, optimization – and more

Two-letter initials: Which are the most common?

A colleague related the following story: He was taking notes at a meeting that was attended by a fairly large group of people (about 20). As each person made a comment or presented information, he recorded the two-letter initials of the person who spoke. After the meeting was over, he

Sampling from the multivariate normal distribution

SAS/IML software is often used for sampling and simulation studies. For simulating data from univariate distributions, the RANDSEED and RANDGEN subroutines suffice to sample from a wide range of distributions. (I use the terms "sampling from a distribution" and "simulating data from a distribution" interchangeably.) For multivariate simulations, the IMLMLIB

Computing probabilities can be tricky. And if you are a statistician and you get them wrong, you feel pretty foolish. That's why I like to run a quick simulation just to make sure that the numbers that I think are correct are, in fact, correct. My last post of 2010

When sharks attack!

The Junk Chart blog discusses problems with a chart which (poorly) presents statistics on the prevalence of shark attacks by different species. Here is the same data presented by overlaying two bar charts by using the SGPLOT procedure. I think this approach works well because the number of deaths is

New Year's resolutions for my blog

It's a New Year and I'm ready to make some resolutions. Last year I launched this blog with my Hello, World post in which I said: In this blog I intend to discuss, describe, and disseminate ideas related to statistical programming with the SAS/IML language.... I will present tips and

Automating the Great Christmas Gift Exchange

In many families, siblings draw names so that each family member and spouse gives and receives exactly one present. This year there was a little bit of controversy when a family member noticed that once again she was assigned to give presents to me. This post includes my response to

In Defense of Outliers

If outliers could scream, would we be so cavalier about removing them from our history, and excluding them from our statistical forecasting models? Well, maybe we would – if they screamed all the time, and for no good reason. (This sentiment is adapted from my favorite of the many Deep

United against insurance fraud

This year, SAS joined the Coalition Against Insurance Fraud (CAIF). Not long ago I had a chance to attend my first CAIF meeting and I was very impressed with what I saw and heard. Fraud continues to be a thorn in the side of the insurance industry. The most recent

Is the index to my book abnormally long?

When I finished writing my book, Statistical Programming with SAS/IML Software, I was elated. However, one small task still remained. I had to write the index. How Long Should an Index Be? My editor told me that SAS Press would send the manuscript to a professional editor who would index

The module that vanished

Recently, I needed to detect whether a matrix consists entirely of missing values. I wrote the following module: proc iml; /** Module to detect whether all elements of a matrix are missing values. Works for both numeric and character matrices. Version 1 (not optimal) **/ start isMissing(x); if type(x)='C' then

How to find and fix programming errors

NOTE: SAS stopped shipping the SAS/IML Studio interface in 2018. The references in this article to IMLPlus and SAS/IML Studio are no longer relevant. There are three kinds of programming errors: parse-time errors, run-time errors, and logical errors. It doesn't matter what language you are using (SAS/IML, MATLAB, R, C/C++,

SAS/IIF Forecasting Research Grants

While insufficiently endowed to be called a "get rich quick" scheme, here is a good way to pocket an extra \$5,000 for your holiday shopping budget, and contribute to the body of forecasting knowledge. For the ninth straight year, SAS announces funding of two \$5,000 research grants to be awarded

Converting between correlation and covariance matrices

Both covariance matrices and correlation matrices are used frequently in multivariate statistics. You can easily compute covariance and correlation matrices from data by using SAS software. However, sometimes you are given a covariance matrix, but your numerical technique requires a correlation matrix. Other times you are given a correlation matrix,

Computing covariance and correlation matrices

Sample covariance matrices and correlation matrices are used frequently in multivariate statistics. This post shows how to compute these matrices in SAS and use them in a SAS/IML program. There are two ways to compute these matrices: Compute the covariance and correlation with PROC CORR and read the results into

Placing the digits of a number into a vector

I enjoy reading about the Le Monde puzzles (and other topics!) at Christian Robert's blog. Recently he asked how to convert a number with s digits into a numerical vector where each element of the vector contains the corresponding digit (by place value). For example, if the number is 4321,

Shorthand notation for row and column operations

The SAS/IML language enables you to perform matrix-vector computations. However, it also provides a convenient "shorthand notation" that enables you to perform elementwise operations on rows or columns in a natural way. You might know that the SAS/IML language supports subscript reduction operators to compute basic rowwise or columnwise quantities.

Forecasting's Eternal Questions

I'm back in the office after two enjoyable days at the Internet Summit in Raleigh, NC. (I hadn't seen that many nerds since the family reunion on my dad's side.) Among the many good sessions was one about building your blog audience by making the blog more search-friendly. The

How to interpret SAS/IML error messages

Errors. We all make them. After all, “to err is human.” Or, as programmers often say, “To err is human, but to really foul things up requires a computer” (Farmer’s Almanac, 1978). This post describes how to interpret error messages from PROC IML that appear in the SAS log. The

Launching SAS/IML Studio

I give many presentations and workshops on how to use SAS/IML Studio, and more than once I have been asked about how to launch the program. Sometimes the inquiry hints at mild frustration, such as last week's "How do I RUN the \$%#@# THING!!!!" The email I got this week

Resampling and simulating my grocery bills

In a previous post, I used statistical data analysis to estimate the probability that my grocery bill is a whole-dollar amount such as \$86.00 or \$103.00. I used three weeks' grocery receipts to show that the last two digits of prices on items that I buy are not uniformly distributed.

Regression coefficients for orthogonal polynomials

In a previous post, I discussed computing regression coefficients in different polynomial bases and showed how the coefficients change when you change the basis functions. In particular, I showed how to convert the coefficients computed in one basis to coefficients computed with respect to a different basis. It turns out