Normal, Poisson, exponential—these and other "named" distributions are used daily by statisticians for modeling and analysis. There are four operations that are used often when you work with statistical distributions. In SAS software, the operations are available by using the following four functions, which are essential for every statistical programmer

# Author

I received the following email: Dear Dr. Wicklin, Why doesn't SYMPUT work in IML? In the DATA step, I can say CALL SYMPUT("MyMacro", 5) but this doesn't work in IML! Frustrated Dear Frustrated, The SYMPUT subroutine does work in SAS/IML software! However, the second argument to SYMPUT must be a

I previously wrote about using SAS/IML for nonlinear optimization, and demonstrated optimization by maximizing a likelihood function. Many well-known optimization algorithms require derivative information during the optimization, including the conjugate gradient method (implemented in the NLPCG subroutine) and the Newton-Raphson method (implemented in the NLPNRA method). You should specify analytic

A popular use of SAS/IML software is to optimize functions of several variables. One statistical application of optimization is estimating parameters that optimize the maximum likelihood function. This post gives a simple example for maximum likelihood estimation (MLE): fitting a parametric density estimate to data. Which density curve fits the

To celebrate the first anniversary of Statistical Programming with SAS/IML Software, you can now download the SAS/IML tip sheets (also called "cheat sheets") that I created for the book. At conferences, SAS Press displays these tip sheets next to my book. They have been very popular. Download these SAS/IML cheat

I've noticed that a lot of people want to be able to draw bar charts with confidence intervals. This topic is a frequent posting on the SAS/GRAPH and ODS Graphics Discussion Forum and on the SAS-L mailing list. Consequently, this post describes how to add errors bars to a bar

When you misspell a word on your mobile device or in a word-processing program, the software might "autocorrect" your mistake. This can lead to some funny mistakes, such as the following: I hate Twitter's autocorrect, although changing "extreme couponing" to "extreme coupling" did make THAT tweet more interesting. [@AnnMariaStat] When

SAS has several ways to round a number to an integer. You can round a number up, round it down, or round it to the nearest integer. If your data contain both positive and negative values, you can also round numbers toward zero, or away from zero. The functions that

Birds migrate south in the fall. Squirrels gather nuts. Humans also have behavioral rituals in the autumn. I change the batteries in my smoke detectors, I switch my clocks back to daylight standard time, and I turn the mattress on my bed. The first two are relatively easy. There's even

I previously wrote about an intriguing math puzzle that involves 5-digit numbers with certain properties. This post presents my solution in the SAS/IML language. It is easy to generate all 5-digit perfect squares, but the remainder of the problem involves looking at the digits of the squares. For this reason,

A few sharp-eyed readers questioned the validity of a technique that I used to demonstrate two ways to solve linear systems of equations. I generated a random n x n matrix and then proceeded to invert it, seemingly without worrying about whether the matrix even has an inverse! I responded to the

I was intrigued by a math puzzle posted to the SAS Discussion Forum (from New Scientist magazine). The problem is repeated below, but I have added numbers in brackets because I am going to comment on each clue: [1] I have written down three different 5-digit perfect squares, which [2]

I showed a SAS/IML customer a debugging tip, and she said that I should blog about it because she had never seen it before. The tip is very simple: inside of a DO loop, use the MOD function to selectively print the values of variables. Recall that the expression MOD(a,b)

In my previous post, I blogged about how to sample from a finite mixture distribution. I showed how to simulate variables from populations that are composed of two or more subpopulations. Modeling a response variable as a mixture distribution is an active area of statistics, as judged by many talks

Sometimes a population of individuals is modeled as a combination of subpopulations. For example, if you want to model the heights of individuals, you might first model the heights of males and females separately. The height of the population can then be modeled as a combination of the male and

The other day I encountered a SAS Knowledge Base article that shows how to count the number of missing and nonmissing values for each variable in a data set. However, the code is a complicated macro that is difficult for a beginning SAS programmer to understand. (Well, it was hard

Last week I showed a graph of the number of US births for each day in 2002, which shows a strong day-of-the-week effect. The graph also shows that the number of births on a given day is affected by US holidays. This blog post looks closer at the holiday effect.

Polynomials are used often in data analysis. Low-order polynomials are used in regression to model the relationship between variables. Polynomials are used in numerical analysis for numerical integration and Taylor series approximations. It is therefore important to be able to evaluate polynomials in an efficient manner. My favorite evaluation technique

My friend Chris posted an analysis of the distribution of birthdays for 236 of his Facebook friends. He noted that more of his friends have birthdays in April than in September. The numbers were 28 for April, but only 25 for September. As I reported in my post on "the

You can extend the capability of the SAS/IML language by writing modules. A module is a user-defined function. You can define a module by using the START and FINISH statements. Many people, including myself, define modules at the top of the SAS/IML program in which they are used. You can

Do you know someone who has a birthday in mid-September? Odds are that you do: the middle of September is when most US babies are born, according to data obtained from the National Center for Health Statistics (NCHS) Web site (see Table 1-16). There's an easy way to remember this

Looping is essential to statistical programming. Whether you need to iterate over parameters in an algorithm or indices in an array, a loop is often one of the first programming constructs that a beginning programmer learns. Today is the first anniversary of this blog, which is named The DO Loop,

My elderly mother enjoys playing Scrabble®. The only problem is that my father and most of my siblings won't play with her because she beats them all the time! Consequently, my mother is always excited when I visit because I'll play a few Scrabble games with her. During a recent

I previously showed how to generate random numbers in SAS by using the RAND function in the DATA step or by using the RANDGEN subroutine in SAS/IML software. These functions generate a stream of random numbers. (In statistics, the random numbers are usually a sample from a distribution such as

One of the highly visible changes in SAS 9.3 is the fact that the old LISTING destination is no longer the default destination for ODS output. Instead, the HTML destination is the default. One positive consequence of this is that ODS graphics and tables are interlaced in the output. Another

Exploring correlation between variables is an important part of exploratory data analysis. Before you start to model data, it is a good idea to visualize how variables related to one another. Zach Mayer, on his Modern Toolmaking blog, posted code that shows how to display and visualize correlations in R.

You can generate a set of random numbers in SAS that are uniformly distributed by using the RAND function in the DATA step or by using the RANDGEN subroutine in SAS/IML software. (These same functions also generate random numbers from other common distributions such as binomial and normal.) The syntax

NOTE: SAS stopped shipping the SAS/IML Studio interface in 2018. It is no longer supported, so this article is no longer relevant. When I write SAS/IML programs, I usually do my development in the SAS/IML Studio environment. Why? There are many reasons, but the one that I will discuss today

Some people search the Internet for a set of topics and then use the number of search results ("hits") for each topic to rank the relative popularity of the topics. At the 2011 Joint Statistical Meetings (JSM), I had the opportunity to attend several talks by statisticians from Google and

I've previously described ways to solve systems of linear equations, A*b = c. While discussing the relative merits of the solving a system for a particular right hand side versus solving for the inverse matrix, I made the assertion that it is faster to solve a particular system than it