Rick Wicklin, Author at The DO Loop

Programming Tips

Rick WicklinApril 18, 2018 3

The sweep operator: A fundamental operation in regression

The sweep operator performs elementary row operations on a system of linear equations. The sweep operator enables you to build regression models by "sweeping in" or "sweeping out" particular rows of the X`X matrix. As you do so, the estimates for the regression coefficients, the error sum of squares, and

English

Programming Tips

Rick WicklinApril 16, 2018 0

Random permutations without duplicates

A colleague and I recently discussed how to generate random permutations without encountering duplicates. Given a set of n items, there are n! permutations My colleague wants to generate k unique permutations at random from among the total of n!. Said differently, he wants to sample without replacement from the

English

Learn SAS | Programming Tips

Rick WicklinApril 11, 2018 4

Find the unique rows of a numeric matrix

Sometimes it is important to ensure that a matrix has unique rows. When the data are all numeric, there is an easy way to detect (and delete!) duplicate rows in a matrix. The main idea is to subtract one row from another. Start with the first row and subtract it

English

Work & Life at SAS

Rick WicklinApril 9, 2018 4

Taking in. Giving back.

When we breathe, we breathe in and breathe out. If we choose only one or the other, the results are disastrous. The same principle applies to professional growth and development. Whether we are programmers, statisticians, teachers, students, or writers, we benefit from taking in and giving back. We "take in"

English

Programming Tips

Rick WicklinApril 4, 2018 3

Distance correlation

Correlation is a statistic that measures how closely two variables are related to each other. The most popular definition of correlation is the Pearson product-moment correlation, which is a measurement of the linear relationship between two variables. Many textbooks stress the linear nature of the Pearson correlation and emphasize that

English

Learn SAS | Programming Tips

Rick WicklinApril 2, 2018 5

The chi-square test: An example of working with rows and columns in SAS

As a general rule, when SAS programmers want to manipulate data row by row, they reach for the SAS DATA step. When the computation requires column statistics, the SQL procedure is also useful. When both row and column operations are required, the SAS/IML language is a powerful addition to a

English

Analytics | Data Visualization | Programming Tips

Euclidean and L1 distances between observations and a target value for standardized data

Rick WicklinMarch 28, 2018 3

Find the distances between observations and a target value

Suppose you want to find observations in multivariate data that are closest to a numerical target value. For example, for the students in the Sashelp.Class data set, you might want to find the students whose (Age, Height, Weight) values are closest to the triplet (13, 62, 100). The way to

English

Analytics | Data Visualization

Rick WicklinMarch 26, 2018 3

A zipper plot for visualizing coverage probability in simulation studies

Simulation studies are used for many purposes, one of which is to examine how distributional assumptions affect the coverage probability of a confidence interval. This article describes the "zipper plot," which enables you to compare the coverage probability of a confidence interval when the data do or do not follow

English

Analytics | Learn SAS

Rick WicklinMarch 21, 2018 2

The conjugate gradient method

I often claim that the "natural syntax" of the SAS/IML language makes it easy to implement an algorithm or statistical formula as it appears in a textbook or journal. The other day I had an opportunity to test the truth of that statement. A SAS programmer wanted to implement the

English

Learn SAS | Programming Tips

Rick WicklinMarch 19, 2018 4

Compute with combinations: Maximize a function over combinations of variables

About once a month I see a question on the SAS Support Communities that involves what I like to call "computations with combinations." A typical question asks how to find k values (from a set of p values) that maximize or minimize some function, such as "I have 5 variables,

English

Blogs

Blogs

Author