Blogs

Blogs

Author

Rick Wicklin RSS
Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

Rick WicklinSeptember 27, 2011 0

A math puzzle: 5-digit squares with certain properties

I was intrigued by a math puzzle posted to the SAS Discussion Forum (from New Scientist magazine). The problem is repeated below, but I have added numbers in brackets because I am going to comment on each clue: [1] I have written down three different 5-digit perfect squares, which [2]

Read More

Rick WicklinSeptember 26, 2011 0

Using the MOD function as a debugging tool

I showed a SAS/IML customer a debugging tip, and she said that I should blog about it because she had never seen it before. The tip is very simple: inside of a DO loop, use the MOD function to selectively print the values of variables. Recall that the expression MOD(a,b)

Read More

Rick WicklinSeptember 23, 2011 0

Modeling Finite Mixtures with the FMM Procedure

In my previous post, I blogged about how to sample from a finite mixture distribution. I showed how to simulate variables from populations that are composed of two or more subpopulations. Modeling a response variable as a mixture distribution is an active area of statistics, as judged by many talks

Read More

Rick WicklinSeptember 21, 2011 0

Generate a random sample from a mixture distribution

Sometimes a population of individuals is modeled as a combination of subpopulations. For example, if you want to model the heights of individuals, you might first model the heights of males and females separately. The height of the population can then be modeled as a combination of the male and

Read More

Programming Tips

Count the number of missing values in each variable in SAS

Rick WicklinSeptember 19, 2011 0

Count the number of missing values for each variable

The other day I encountered a SAS Knowledge Base article that shows how to count the number of missing and nonmissing values for each variable in a data set. However, the code is a complicated macro that is difficult for a beginning SAS programmer to understand. (Well, it was hard

Read More

Rick WicklinSeptember 16, 2011 0

The effect of holidays on US births

Last week I showed a graph of the number of US births for each day in 2002, which shows a strong day-of-the-week effect. The graph also shows that the number of births on a given day is affected by US holidays. This blog post looks closer at the holiday effect.

Read More

Rick WicklinSeptember 14, 2011 0

Evaluate polynomials efficiently by using Horner's scheme

Polynomials are used often in data analysis. Low-order polynomials are used in regression to model the relationship between variables. Polynomials are used in numerical analysis for numerical integration and Taylor series approximations. It is therefore important to be able to evaluate polynomials in an efficient manner. My favorite evaluation technique

Read More

Rick WicklinSeptember 13, 2011 0

The birthday controversy: Are more people born in April or September?

My friend Chris posted an analysis of the distribution of birthdays for 236 of his Facebook friends. He noted that more of his friends have birthdays in April than in September. The numbers were 28 for April, but only 25 for September. As I reported in my post on "the

Read More

Rick WicklinSeptember 12, 2011 0

Storing and loading modules

You can extend the capability of the SAS/IML language by writing modules. A module is a user-defined function. You can define a module by using the START and FINISH statements. Many people, including myself, define modules at the top of the SAS/IML program in which they are used. You can

Read More

Rick WicklinSeptember 9, 2011 0

The most likely birthday in the US

Do you know someone who has a birthday in mid-September? Odds are that you do: the middle of September is when most US babies are born, according to data obtained from the National Center for Health Statistics (NCHS) Web site (see Table 1-16). There's an easy way to remember this

Read More

Programming Tips

Schematic flow of a DO loop in SAS

Rick WicklinSeptember 7, 2011 0

Loops in SAS

Looping is essential to statistical programming. Whether you need to iterate over parameters in an algorithm or indices in an array, a loop is often one of the first programming constructs that a beginning programmer learns. Today is the first anniversary of this blog, which is named The DO Loop,

Read More

Rick WicklinSeptember 2, 2011 0

Visualizing Scrabble games

My elderly mother enjoys playing Scrabble®. The only problem is that my father and most of my siblings won't play with her because she beats them all the time! Consequently, my mother is always excited when I visit because I'll play a few Scrabble games with her. During a recent

Read More

Rick WicklinAugust 31, 2011 0

Random number streams in SAS: How do they work?

I previously showed how to generate random numbers in SAS by using the RAND function in the DATA step or by using the RANDGEN subroutine in SAS/IML software. These functions generate a stream of random numbers. (In statistics, the random numbers are usually a sample from a distribution such as

Read More

Programming Tips

Rick WicklinAugust 29, 2011 0

How to clear the output window in SAS 9.3

One of the highly visible changes in SAS 9.3 is the fact that the old LISTING destination is no longer the default destination for ODS output. Instead, the HTML destination is the default. One positive consequence of this is that ODS graphics and tables are interlaced in the output. Another

Read More

Rick WicklinAugust 26, 2011 0

Visualizing correlations between variables in SAS

Exploring correlation between variables is an important part of exploratory data analysis. Before you start to model data, it is a good idea to visualize how variables related to one another. Zach Mayer, on his Modern Toolmaking blog, posted code that shows how to display and visualize correlations in R.

Read More

Programming Tips

Rick WicklinAugust 24, 2011 0

How to generate random numbers in SAS

You can generate a set of random numbers in SAS that are uniformly distributed by using the RAND function in the DATA step or by using the RANDGEN subroutine in SAS/IML software. (These same functions also generate random numbers from other common distributions such as binomial and normal.) The syntax

Read More

Rick WicklinAugust 22, 2011 0

Multithreaded = more productive

NOTE: SAS stopped shipping the SAS/IML Studio interface in 2018. It is no longer supported, so this article is no longer relevant. When I write SAS/IML programs, I usually do my development in the SAS/IML Studio environment. Why? There are many reasons, but the one that I will discuss today

Read More

Rick WicklinAugust 19, 2011 0

Estimating popularity based on Google searches: Why it's a bad idea

Some people search the Internet for a set of topics and then use the number of search results ("hits") for each topic to rank the relative popularity of the topics. At the 2011 Joint Statistical Meetings (JSM), I had the opportunity to attend several talks by statisticians from Google and

Read More

Rick WicklinAugust 17, 2011 0

Solving linear systems: Which technique is fastest?

I've previously described ways to solve systems of linear equations, A*b = c. While discussing the relative merits of the solving a system for a particular right hand side versus solving for the inverse matrix, I made the assertion that it is faster to solve a particular system than it

Read More

Rick WicklinAugust 15, 2011 0

Complex assignment statements: CHOOSE wisely

This article describes the SAS/IML CHOOSE function: how it works, how it doesn't work, and how to use it to make your SAS/IML programs more compact. In particular, the CHOOSE function has a potential "gotcha!" that you need to understand if you want your program to perform as expected. What

Read More

Rick WicklinAugust 12, 2011 0

Side-by-side bar plots in SAS 9.3

When I was at the Joint Statistical Meetings (JSM) last week, a SAS customer asked me whether it was possible to use the SGPLOT procedure to produce side-by-side bar charts. The answer is "yes" in SAS 9.3, thanks to the new GROUPDISPLAY= option on the VBAR and HBAR statements. For

Read More

Rick WicklinAugust 10, 2011 0

Do you really need to compute that matrix inverse?

The SAS/IML language provides two functions for solving a nonsingular nxn linear system A*x = c: The INV function numerically computes the inverse matrix, A-1. You can use this to solve for x: Ainv = inv(A); x = Ainv*c;. The SOLVE function numerically computes the particular solution, x, for a

Read More

Rick WicklinAugust 8, 2011 0

Please Eat My Dear Aunt Sally's Italian Lasagna

In the SAS/IML language, the index creation operator (:) is used to construct a sequence of integer values. For example, the expression 1:7 creates a row vector with seven elements: 1, 2, ..., 7. It is important to know the precedence of matrix operators. When I was in grade school,

Read More

Rick WicklinAugust 5, 2011 0

Using Newton's method to find the zero of a function

I've previously discussed how to find the root of a univariate function. This article describes how to find the root (zero) of a function of several variables by using Newton's method. There have been many papers, books, and dissertations written on the topic of root-finding, so why am I blogging

Read More

Rick WicklinAugust 3, 2011 0

Finding the root of a univariate function

At the SAS/IML Support Community, a SAS/IML programmer recently asked how to find "the root of a complicated equation." That's a huge question, and many papers and books have been written on the topic of root-finding, also known as finding the zeros of a function. Everyone has favorite techniques for

Read More

Rick WicklinAugust 1, 2011 0

Options for Printing a Matrix

A matrix is an array of numbers or character strings. When I print a matrix, I usually want to see only the data. However, sometimes it is helpful to add row or column headings that indicate the names of variables or labels for rows. A simple example is count data

Read More

Rick WicklinJuly 29, 2011 0

Computing an ROC curve from basic principles

In a previous blog post, I showed how to use the LOGISTIC procedure to construct a receiver operator characteristic (ROC) curve in SAS. That same day, Charlie H. blogged about how to use the DATA step to construct an ROC curve from basic principles. It has been a long time

Read More

Rick WicklinJuly 27, 2011 0

Add a diagonal line to a scatter plot: The easy way

I've written about how to add a diagonal line to a scatter plot by using the SGPLOT procedure in SAS 9.2. The main idea (use the VECTOR statement) is easy enough, but writing a program that handles a line with any slope requires some additional effort. But now SAS 9.3

Read More

Rick WicklinJuly 25, 2011 0

A simple signum function

The other day I needed to compute the signum function for each element of a matrix. If x is a real number, then the sgn(x) is -1 when x<0, 1 when x>0, and 0 when x=0. I wrote a SAS/IML module that contains a compact little expression: proc iml; start

Read More

Rick WicklinJuly 22, 2011 0

Simulating the Coupon Collector's Problem

I recently blogged about how many times, on average, you must roll a die until you see all six faces. This question is a special case of the coupon collector's problem. My son noted that the expected value (the mean number of rolls) is not necessarily the best statistic to

Read More

Previous 1 … 44 45 46 47 48 … 52 Next