The SAS/IML language has several functions for finding the unions, intersections, and differences between sets. In fact, two of my favorite utility functions are the UNIQUE function, which returns the unique elements in a matrix, and the SETDIF function, which returns the elements that are in one vector and not
Author
Many geeky mathematical people celebrate "pi day" on March 14, because the date is written 3/14 in the US, which is evocative of the decimal representation of π = 3.14.... Most people are familiar with the decimal representation of π. The media occasionally reports on a new computational tour-de-force that
Last week I showed how to find parameters that maximize the integral of a certain probability density function (PDF). Because the function was a PDF, I could evaluate the integral by calling the CDF function in SAS. (Recall that the cumulative distribution function (CDF) is the integral of a PDF.)
On most Mondays I blog about a function, programming technique, or resource that is useful for programmers who are getting started with SAS software. Recently I learned that my colleagues in the SAS education division have been hard at work developing a series of short videos that explain basic tasks
SAS programmers use the SAS/IML language for many different tasks. One important task is computing an integral. Another is optimizing functions, such as maximizing a likelihood function to find parameters that best fit a set of data. Last week I saw an interesting problem that combines these two important tasks.
A colleague sent me an interesting question: What is the best way to abort a SAS/IML program? For example, you might want to abort a program if the data is singular or does not contain a sufficient number of observations or variables. As a first attempt would be to try
My last blog post described three ways to add a smoothing spline to a scatter plot in SAS. I ended the post with a cautionary note: From a statistical point of view, the smoothing spline is less than ideal because the smoothing parameter must be chosen manually by the user.
Like many SAS programmers, I use the Statistical Graphics (SG) procedures to graph my data in SAS. To me, the SGPLOT and SGRENDER procedures are powerful, easy to use, and produce fabulous ODS graphics. I was therefore surprised when a SAS customer told me that he continues to use the
My previous post described how to use the "missing response trick" to score a regression model. As I said in that article, there are other ways to score a regression model. This article describes using the SCORE procedure, a SCORE statement, the relatively new PLM procedure, and the CODE statement.
A fundamental operation in statistical data analysis is to fit a statistical regression model on one set of data and then evaluate the model on another set of data. The act of evaluating the model on the second set of data is called scoring. One of first "tricks" that I
Although I currently work as a statistician, my original training was in mathematics. In many mathematical fields there is a result that is so profound that it earns the name "The Fundamental Theorem of [Topic Area]." A fundamental theorem is a deep (often surprising) result that connects two or more
One of my favorite new features of SAS/IML 12.1 enables you to define functions that contain default values for parameters. This is extremely useful when you want to write a function that has optional arguments. Example: Centering a data vector It is simple to specify a SAS/IML module with a
Finding the root (or zero) of a function is an important computational task because it enables you to solve nonlinear equations. I have previously blogged about using Newton's method to find a root for a function of several variables. I have also blogged about how to use the bisection method
Last week I showed three ways to sample with replacement in SAS. You can use the SAMPLE function in SAS/IML 12.1 to sample from a finite set or you can use the DATA step or PROC SURVEYSELECT to extract a random sample from a SAS data set. Sampling without replacement
Randomly choosing a subset of elements is a fundamental operation in statistics and probability. Simple random sampling with replacement is used in bootstrap methods (where the technique is called resampling), permutation tests and simulation. Last week I showed how to use the SAMPLE function in SAS/IML software to sample with
Prime numbers are strange beasts. They exhibit properties of both randomness and regularity. Recently I watched an excellent nine-minute video on the Numberphile video blog that shows that if you write the natural numbers in a spiral pattern (called the Ulam spiral), then there are certain lines in the pattern
With each release of SAS/IML software, the language provides simple ways to carry out tasks that previously required more effort. In 2010 I blogged about a SAS/IML module that appeared in my book Statistical Programming with SAS/IML Software, which was written by using the SAS/IML 9.2. The blog post showed
I began 2014 by compiling a list of 13 popular articles from my blog in 2013. Although this "People's Choice" list contains many articles that I am proud of, it did not include all of my favorites, so I decided to compile an "Editor's Choice" list. The blog posts on
Late last month, while many of us were sipping eggnog and decking halls with boughs of holly, SAS released the 13.1 version of its analytical products. Readers of Maura Stokes' newsletter, SAS Statistics and Operations Research News (Nov 2013), have already been alerted to new features in products such as
Vector languages such as SAS/IML, MATLAB, and R are powerful because they enable you to use high-level matrix operations (matrix multiplication, dot products, etc) rather than loops that perform scalar operations. In general, vectorized programs are more efficient (and therefore run faster) than programs that contain loops. For an example
In 2013 I published 110 blog posts. Some of these articles were more popular than others, often because they were linked to from a SAS newsletter such as the SAS Statistics and Operations Research News. In no particular order, here are some of my most popular posts from 2013, organized
O Christmas tree, O Christmas tree, Last year a fractal made thee! O Christmas tree, O Christmas tree, A heat map can display thee! O tree of green, adorned with lights! A trunk of brown, the rest is white. O Christmas tree, O Christmas tree, A heat map can display
Recently a SAS/IML programmer asked a question regarding how to perform matrix arithmetic when some of the data are in vectors and other are in matrices. The programmer wanted to add the following matrices: The problem was that the numbers in the first two matrices were stored in vectors. The
A heat map is a graphical representation of a matrix that uses colors to represent values in the matrix cells. Heat maps often reveal the structure of a matrix. There are three common applications of visualizing matrices with heat maps: Visualizing a correlation or covariance matrix reveals relationships between variables.
When learning a new language, it is important to learn to interpret error messages that come from the language's parser or compiler. Three years ago I blogged about how to interpret SAS/IML error messages. However, many questions have been posted to the SAS/IML Support Community that indicate that some people
As the International Year of Statistics comes to a close, I've been reflecting on the role statistics plays in our modern society. Of course, statistics provides estimates, forecasts, and the like, but to me the great contribution of statistics is that it enables us to deal with uncertainty in a
Each year my siblings choose names for a Christmas gift exchange. It is not unusual for a sibling to pick her own name, whereupon the name is replaced into the hat and a new name is drawn. In fact, that "glitch" in the drawing process was a motivation for me
If you write an n x p matrix from PROC IML to a SAS data set, you'll get a data set with n rows and p columns. For some applications, it is more convenient to write the matrix in a "long format" with np observations and three columns. The first
For several years, there has been interest in calling R from SAS software, primarily because of the large number of special-purpose R packages. The ability to call R from SAS has been available in SAS/IML since 2009. Previous blog posts about R include a video on how to call R
When I call R from within the SAS/IML language, I often pass parameters from SAS into R. This feature enables me to write general-purpose, reusable, modules that can analyze data from many different data sets. I've previously blogged about how to pass values to SAS procedures from PROC IML by