Blogs

Tag: Statistical Programming

Advanced Analytics

Rick Wicklin, March 14, 2014

For pi day: A continued fraction expansion of pi

Many geeky mathematical people celebrate "pi day" on March 14, because the date is written 3/14 in the US, which is evocative of the decimal representation of π = 3.14.... Most people are familiar with the decimal representation of π. The media occasionally reports on a new computational tour-de-force that

Advanced Analytics

Rick Wicklin, February 24, 2014

Three ways to add a smoothing spline to a scatter plot in SAS

Like many SAS programmers, I use the Statistical Graphics (SG) procedures to graph my data in SAS. To me, the SGPLOT and SGRENDER procedures are powerful, easy to use, and produce fabulous ODS graphics. I was therefore surprised when a SAS customer told me that he continues to use the

Advanced Analytics | Learn SAS

Rick Wicklin, February 19, 2014

Techniques for scoring a regression model in SAS

My previous post described how to use the "missing response trick" to score a regression model. As I said in that article, there are other ways to score a regression model. This article describes using the SCORE procedure, a SCORE statement, the relatively new PLM procedure, and the CODE statement.

Advanced Analytics | Learn SAS

Rick Wicklin, February 17, 2014

The missing value trick for scoring a regression model

A fundamental operation in statistical data analysis is to fit a statistical regression model on one set of data and then evaluate the model on another set of data. The act of evaluating the model on the second set of data is called scoring. One of the first "tricks" that I

Advanced Analytics

Rick Wicklin, January 21, 2014

The best articles of 2013: Twelve posts from The DO Loop that merit a second look

I began 2014 by compiling a list of 13 popular articles from my blog in 2013. Although this "People's Choice" list contains many articles that I am proud of, it did not include all of my favorites, so I decided to compile an "Editor's Choice" list. The blog posts on

Advanced Analytics

Rick Wicklin, January 7, 2014

13 popular articles from 2013

In 2013 I published 110 blog posts. Some of these articles were more popular than others, often because they were linked to from a SAS newsletter such as the SAS Statistics and Operations Research News. In no particular order, here are some of my most popular posts from 2013, organized

Advanced Analytics

Rick Wicklin, December 4, 2013

Secret Santa: What is the probability that someone pulls her own name from a hat?

Each year my siblings choose names for a Christmas gift exchange. It is not unusual for a sibling to pick her own name, whereupon the name is replaced into the hat and a new name is drawn. In fact, that "glitch" in the drawing process was a motivation for me

Advanced Analytics

Rick Wicklin, November 25, 2013

Twelve advantages to calling R from the SAS/IML language

For several years, there has been interest in calling R from SAS software, primarily because of the large number of special-purpose R packages. The ability to call R from SAS has been available in SAS/IML since 2009. Previous blog posts about R include a video on how to call R

Advanced Analytics

Rick Wicklin, November 20, 2013

Write a reusable SAS/IML module that passes values to R

When I call R from within the SAS/IML language, I often pass parameters from SAS into R. This feature enables me to write general-purpose, reusable modules that can analyze data from many different data sets. I've previously blogged about how to pass values to SAS procedures from PROC IML by

Advanced Analytics | Learn SAS

Rick Wicklin, September 30, 2013

Generate combinations in SAS

Last week I described how to generate permutations in SAS. A related concept is the "combination." In probability and statistics, a combination is a subset of k items chosen from a set that contains N items. Order does not matter, so although the ordered triplets (B, A, C) and (C,

Advanced Analytics

Rick Wicklin, September 25, 2013

Compute contours of the bivariate normal CDF

This is the last post in my recent series of articles on computing contours in SAS. Last month a SAS customer asked how to compute the contours of the bivariate normal cumulative distribution function (CDF). Answering that question in a single blog post would have resulted in a long article,

Advanced Analytics | Learn SAS

Rick Wicklin, September 23, 2013

Generate permutations in SAS

I've written several articles that show how to generate permutations in SAS. In the SAS DATA step, you can use the ALLPERM subroutine to generate all permutations of a DATA step array that contains a small number (18 or fewer) of elements. In addition, the PLAN procedure enables you to generate

Advanced Analytics

Rick Wicklin, July 26, 2013

How to choose parameters so that a distribution has a specified mean and variance

The truncated normal distribution TN(μ, σ, a, b) is the distribution of a normal random variable with mean μ and standard deviation σ that is truncated on the interval [a, b]. I previously blogged about how to implement the truncated normal distribution in SAS. A friend wanted to simulate data

Advanced Analytics

Rick Wicklin, June 24, 2013

Count the number of unique rows in a matrix

How do you count the number of unique rows in a matrix? The simplest algorithm is to sort the data and then iterate down the rows, comparing each row with the previous row. However, this algorithm has two shortcomings: it physically sorts the data (which means that the original locations

Advanced Analytics

Rick Wicklin, June 5, 2013

Using simulation to compute a power curve

Last week I showed how to use simulation to estimate the power of a statistical test. I used the two-sample t test to illustrate the technique. In my example, the difference between the means of two groups was 1.2, and the simulation estimated a probability of 0.72 that the t

Advanced Analytics

Rick Wicklin, May 30, 2013

Using simulation to estimate the power of a statistical test

The power of a statistical test measures the test's ability to detect a specific alternative hypothesis. For example, educational researchers might want to compare the mean scores of boys and girls on a standardized test. They plan to use the well-known two-sample t test. The null hypothesis is that the

Advanced Analytics

Rick Wicklin, April 29, 2013

Understanding local and global variables in the SAS/IML language

The TV show Cheers was set in a bar "where everybody knows your name." Global knowledge of a name is appealing for a neighborhood pub, but not for a programming language. Most programming languages enable you to define functions that have local variables: variables whose names are known only inside

Advanced Analytics

Rick Wicklin, April 24, 2013

How to overlay a custom density curve on a histogram in SAS

I've previously described how to overlay two or more density curves on a single plot. I've also written about how to use PROC SGPLOT to overlay custom curves on a graph. This article describes how to overlay a density curve on a histogram. For common distributions, you can overlay a

Advanced Analytics

Rick Wicklin, April 8, 2013

Point/Counterpoint: Where should you put ODS SELECT and ODS OUTPUT statements?

ODS statements are global SAS statements. As such, you can put them anywhere in your SAS program. For maximum readability, many SAS programmers agree that most ODS statements should appear outside procedures in "open" SAS code. For example, most programmers agree that the following statements should appear outside of procedures:

Advanced Analytics

Rick Wicklin, March 27, 2013

How to compute the distance between observations in SAS

In statistics, distances between observations are used to form clusters, to identify outliers, and to estimate distributions. Distances are used in spatial statistics and in other application areas. There are many ways to define the distance between observations. I have previously written an article that explains Mahalanobis distance, which is

Advanced Analytics

Rick Wicklin, March 20, 2013

Understanding ridge regression in SAS

Someone recently asked a question on the SAS Support Communities about estimating parameters in ridge regression. I answered the question by pointing to a matrix formula in the SAS documentation. One of the advantages of the SAS/IML language is that you can implement matrix formulas in a natural way. The

Advanced Analytics

Rick Wicklin, March 13, 2013

The case of spilled coffee and the regression intercept

Argh! I've just spilled coffee on output that shows the least squares coefficients for a regression model that I was investigating. Now the parameter estimate for the intercept is completely obscured, although I can still see the parameter estimates for the coefficients of the continuous explanatory variable. What can I

Advanced Analytics

Rick Wicklin, March 11, 2013

Construct normal data from summary statistics

Last week there was an interesting question posted to the "Stat-Math Statistics" group on LinkedIn. The original question was a little confusing, so I'll state it in a more general form: A population is normally distributed with a known mean and standard deviation. A sample of size N is drawn

Advanced Analytics

Rick Wicklin, February 6, 2013

Find variables common to multiple data sets

Last week the SAS Training Post blog posted a short article on an easy way to find variables in common to two data sets. The article used PROC CONTENTS (with the SHORT option) to print out the names of variables in SAS data sets so that you can visually determine

Advanced Analytics

Rick Wicklin, January 16, 2013

Generate binary outcomes with varying probability

A while ago I saw a blog post on how to simulate Bernoulli outcomes when the probability of generating a 1 (success) varies from observation to observation. I've done this often in SAS, both in the DATA step and in the SAS/IML language. For example, when simulating data that satisfied

Advanced Analytics | Learn SAS

Rick Wicklin, January 3, 2013

12 Tips for SAS Statistical Programmers

It's the start of a new year. Have you made a resolution to be a better data analyst? A better SAS statistical programmer? To learn more about multivariate statistics? What better way to start the New Year than to read (or re-read!) the top 12 articles for statistical programmers from

Advanced Analytics

Rick Wicklin, November 28, 2012

Computing the nearest correlation matrix

Frequently someone will post a question to the SAS Support Community that says something like this: I am trying to do [statistical task] and SAS issues an error and reports that my correlation matrix is not positive definite. What is going on and how can I complete [the task]? The statistical

Advanced Analytics

Rick Wicklin, November 7, 2012

Constructing block matrices with applications to mixed models

The other day I was constructing covariance matrices for simulating data for a mixed model with repeated measurements. I was using the SAS/IML BLOCK function to build up the "R-side" covariance matrix from smaller blocks. The matrix I was constructing was block-diagonal and looked like this: The matrix represents a

Advanced Analytics

Rick Wicklin, October 31, 2012

Compute the log-determinant of a matrix

The determinant of a matrix arises in many statistical computations, such as in estimating parameters that fit a distribution to multivariate data. For example, if you are using a log-likelihood function to fit a multivariate normal distribution, the formula for the log-likelihood involves the expression log(det(Σ)), where Σ is the

Advanced Analytics

Rick Wicklin, October 10, 2012

Playing "craps" with unfair dice

Last week I wrote a SAS/IML program that computes the odds of winning the game of craps. I noted that the program remains valid even if the dice are not fair. For convenience, here is a SAS/IML function that computes the probability of winning at craps, given the probability vector
