Blogs

Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is the author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

Analytics | Data Visualization | Learn SAS

Rick Wicklin | November 4, 2019

How to interpret graphs in a principal component analysis

Understanding multivariate statistics requires mastery of high-dimensional geometry and concepts in linear algebra such as matrix factorizations, basis vectors, and linear subspaces. Graphs can help to summarize what a multivariate analysis is telling us about the data. This article looks at four graphs that are often part of a principal

Read More

Learn SAS | Programming Tips

Rick Wicklin | October 30, 2019

Compute the first or last day of a month or year

Every year at Halloween, I post an article that shows a SAS trick that is a real treat. This article shows how to use the INTNX function to find dates that are related to a specified date. The INTNX function is a sweet treat, indeed. I previously wrote an article

Read More

Learn SAS | Programming Tips

Rick Wicklin | October 28, 2019

Use regular expressions to specify variable names in SAS

A common task in SAS programming is to specify a list of variables that satisfy some pattern. You can specify lists for the KEEP= or DROP= data set options, and you can use lists of variables on many SAS statements such as the VAR and MODEL statements. Although SAS has

Read More

Programming Tips

Rick Wicklin | October 23, 2019

Perform matrix computations when the matrices don't fit in memory

In response to a recent article about how to compute the cosine similarity of observations, a reader asked whether it is practical (or even possible) to perform these types of computations on data sets that have many thousands of observations. The problem is that the cosine similarity matrix is an

Read More

Analytics | Data Visualization | Learn SAS

Rick Wicklin | October 21, 2019

Compute and visualize binomial proportions in SAS

Computing rates and proportions is a common task in data analysis. When you are computing several proportions, it is helpful to visualize how the rates vary among subgroups of the population. Examples of proportions that depend on subgroups include: mortality rates for various types of cancers; incarceration rates by race

Read More

Analytics | Data Visualization

Rick Wicklin | October 16, 2019

Visualize a regression with splines

The EFFECT statement is supported by more than a dozen SAS/STAT regression procedures. Among other things, it enables you to generate spline effects that you can use to fit nonlinear relationships in data. Recently there was a discussion on the SAS Support Communities about how to interpret the parameter estimates

Read More

Analytics | Learn SAS

Rick Wicklin | October 14, 2019

Compute the geometric mean for many variables in SAS

I recently wrote about how to use PROC TTEST in SAS/STAT software to compute the geometric mean and related statistics. This prompted a SAS programmer to ask a related question. Suppose you have dozens (or hundreds) of variables and you want to compute the geometric mean of each. What is

Read More

Analytics | Data Visualization

Rick Wicklin | October 9, 2019

What statistic should you use to display error bars for a mean?

In a previous article, I mentioned that the VLINE statement in PROC SGPLOT is an easy way to graph the mean response at a set of discrete time points. I mentioned that you can choose three options for the length of the "error bars": the standard deviation of the data,

Read More

Data Visualization | Learn SAS

Rick Wicklin | October 7, 2019

Graph the mean response versus time in SAS

It is always great to read an old paper or blog post and think, "This task is so much easier in SAS 9.4!" I had that thought recently when I stumbled on a 2007 paper by Wei Cheng titled "Graphical Representation of Mean Measurement over Time." A substantial portion of

Read More

Analytics | Programming Tips

Rick Wicklin | October 2, 2019

Compute the geometric mean, geometric standard deviation, and geometric CV in SAS

I frequently see questions on SAS discussion forums about how to compute the geometric mean and related quantities in SAS. Unfortunately, the answers to these questions are sometimes confusing or even wrong. In addition, some published papers and web sites that claim to show how to calculate the geometric mean

Read More

Programming Tips

Rick Wicklin | September 30, 2019

What is a geometric mean?

There are several different kinds of means. They all try to find an average value from among a set of numbers. Although the most popular mean is the arithmetic mean, the geometric mean can be useful for problems in statistics, finance, and biology. A common application of the geometric mean

Read More

Analytics | Programming Tips

Rick Wicklin | September 25, 2019

Solve many optimization problems

One of the strengths of the SAS/IML language is its flexibility. Recently, a SAS programmer asked how to generalize a program in a previous article. The original program solved one optimization problem. The reader said that she wants to solve this type of problem 300 times, each time using a

Read More

Learn SAS

Rick Wicklin | September 23, 2019

Use the SHORT option in Base SAS procedures to reduce output

Although I do not typically blog about undocumented SAS options, I'll make an exception this time. For many years, I have known that the CONTENTS and COMPARE procedures support the BRIEF and SHORT options, but I always forget which option goes with which procedure. For the record, here are the

Read More

Data Visualization

Rick Wicklin | September 18, 2019

4 ways to visualize the density of bivariate data

In a scatter plot that displays many points, it can be important to visualize the density of the points. Scatter plots (indeed, all plots that show individual markers) can suffer from overplotting, which means that the graph does not indicate how many observations are at a specific (x, y) location.

Read More

Analytics | Data Visualization | Programming Tips

Rick Wicklin | September 16, 2019

The Hull moving average: Implement a custom time series smoother in SAS

A moving average is a statistical technique that is used to smooth a time series. My colleague, Cindy Wang, wrote an article about the Hull moving average (HMA), which is a time series smoother that is sometimes used as a technical indicator by stock market traders. Cindy showed how to

Read More

Data Visualization | Learn SAS

Rick Wicklin | September 11, 2019

Axis tables versus rotated text: How to display a wide table in a small graph

I often use axis tables in PROC SGPLOT in SAS to add a table of text to a graph so that the table values are aligned with the data. But axis tables are not the only way to display tabular data in a graph. You can also use the TEXT

Read More

Data Visualization | Learn SAS

Rick Wicklin | September 9, 2019

Anchor points and rotated text in PROC SGPLOT

The TEXT statement in PROC SGPLOT supports the ROTATE= option to rotate the specified text. It is worth knowing how the ROTATE= option interacts with the POSITION= option, which determines the anchor point at which the text is positioned. Briefly, the text is positioned FIRST, then the rotation occurs. The

Read More

Analytics | Data Visualization | Programming Tips

Rick Wicklin | September 5, 2019

Use cosine similarity to make recommendations

When you order an item online, the website often recommends other items based on your purchase. In fact, these kinds of "recommendation engines" contributed to the early success of companies like Amazon and Netflix. SAS uses a recommender engine to suggest articles on the SAS Support Communities. Although recommender engines

Read More

Analytics | Data Visualization | Programming Tips

Rick Wicklin | September 3, 2019

Cosine similarity of vectors

An important application of the dot product (inner product) of two vectors is to determine the angle between the vectors. If u and v are two vectors, then cos(θ) = (u ⋅ v) / (|u| |v|). You could apply the inverse cosine function if you wanted to find θ in

Read More
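
As a quick numerical illustration of the formula in the teaser above, here is a minimal sketch in Python (not SAS, which the article itself uses). The function name `cosine_similarity` and the example vectors are assumptions chosen purely for illustration.

```python
import math

def cosine_similarity(u, v):
    """Return cos(theta) = (u . v) / (|u| |v|) for two nonzero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical example vectors: u lies on the x-axis, v on the diagonal.
u = [1.0, 0.0]
v = [1.0, 1.0]
cos_theta = cosine_similarity(u, v)

# Clamp to [-1, 1] before acos to guard against floating-point round-off.
theta_degrees = math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))
print(cos_theta, theta_degrees)
```

For these two vectors the angle should come out at about 45 degrees, which matches the geometry of a horizontal vector and a diagonal one.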

Data Visualization | Programming Tips

Rick Wicklin | August 28, 2019

Annotate features of a schematic box plot in SGPLOT

A SAS programmer asked an intriguing question on the SAS Support Communities: Can you use SAS to create a graph that shows how the elements in a box-and-whiskers plot relate to the data? The SAS documentation has several examples that explain how to read a box plot. One of the

Read More

Programming Tips

Rick Wicklin | August 26, 2019

Conditionally append observations to a SAS data set

Most SAS programmers know how to use PROC APPEND or the SET statement in the DATA step to unconditionally append new observations to an existing data set. However, sometimes you need to scan the data to determine whether or not to append observations. In this situation, many SAS programmers choose one

Read More

Programming Tips

Rick Wicklin | August 21, 2019

Two tips for optimizing a function that has a restricted domain

An important application of nonlinear optimization is finding parameters of a model that fit data. For some models, the parameters are constrained by the data. A canonical example is the maximum likelihood estimation of a so-called "threshold parameter" for the three-parameter lognormal distribution. For this distribution, the objective function is

Read More

Learn SAS | Programming Tips

Rick Wicklin | August 19, 2019

Timing performance in SAS/IML: Built-in functions versus Base SAS functions

One of my friends likes to remind me that "there is no such thing as a free lunch," which he abbreviates as "TINSTAAFL" (or TANSTAAFL). The TINSTAAFL principle applies to computer programming because you often end up paying a cost (in performance) when you call a convenience function that simplifies

Read More

Learn SAS | Programming Tips

Rick Wicklin | August 14, 2019

Short-circuit evaluation and logical ligatures in SAS

Many programmers are familiar with "short-circuit" evaluation in an IF-THEN statement. Short circuit means that a program does not evaluate the remainder of a logical expression if the value of the expression is already logically determined. The SAS DATA step supports short-circuiting for simple logical expressions in IF-THEN statements and

Read More
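
The general idea of short-circuit evaluation can be sketched in a few lines of Python. This only illustrates the concept; it says nothing about how the SAS DATA step behaves, which is the subject of the article. The helper `noisy` is hypothetical and exists only to record which operands actually get evaluated.

```python
evaluated = []

def noisy(label, value):
    """Record that this operand was evaluated, then return its value."""
    evaluated.append(label)
    return value

# 'and' stops at the first false operand, so B is never evaluated.
result_and = noisy("A", False) and noisy("B", True)

# 'or' stops at the first true operand, so D is never evaluated.
result_or = noisy("C", True) or noisy("D", False)

# A common use: guard an operation that would otherwise fail.
x = 0
safe = (x != 0) and (1 / x > 1)  # the division is skipped because x != 0 is False

print(evaluated, result_and, result_or, safe)
```

Only the operands that were needed to decide each expression appear in `evaluated`, and the guarded division never runs.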

Analytics

Rick Wicklin | August 12, 2019

The math you learned in school: Yes, it’s useful!

What is this math good for, anyway? –Every student, everywhere

I am a professional applied mathematician, yet many of the mathematical and statistical techniques that I use every day are not from advanced university courses but are based on simple ideas taught in high school or even in grade school.

Read More

Learn SAS | Programming Tips

Rick Wicklin | August 7, 2019

The essential guide to binning in SAS

Do you want to bin a numeric variable into a small number of discrete groups? This article compiles a dozen resources and examples related to binning a continuous variable. The examples show both equal-width binning and quantile binning. In addition to standard one-dimensional techniques, this article also discusses various techniques

Read More

Learn SAS | Machine Learning | Programming Tips

Rick Wicklin | August 5, 2019

How to use PROC HPBIN to bin numerical variables

Binning transforms a continuous numerical variable into a discrete variable with a small number of values. When you bin univariate data, you define cut points that define discrete groups. I've previously shown how to use PROC FORMAT in SAS to bin numerical variables and give each group a meaningful name

Read More
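
To make the idea of cut points concrete, here is a minimal sketch in Python. It is not PROC HPBIN or PROC FORMAT, and the function name `bin_values`, the sample data, and the cut points are all assumptions made for illustration.

```python
from bisect import bisect_right

def bin_values(values, cut_points):
    """Map each value to a bin index using sorted cut points.

    Bin 0 holds values below the first cut point, bin k holds values in
    [cut_points[k-1], cut_points[k]), and the last bin holds values at or
    above the final cut point.
    """
    cuts = sorted(cut_points)
    return [bisect_right(cuts, x) for x in values]

# Hypothetical data: ages binned into three groups by two cut points.
ages = [3, 17, 18, 42, 65, 90]
cut_points = [18, 65]
bins = bin_values(ages, cut_points)
print(bins)
```

Two cut points produce three bins, and a value that equals a cut point falls into the bin that starts at that cut point.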

Learn SAS | Programming Tips

Rick Wicklin | July 31, 2019

Use numeric values for column headers when printing a matrix

Sometimes a little thing can make a big difference. I am enjoying a new enhancement of SAS/IML 15.1, which enables you to use a numeric vector as the column header or row header when you print a SAS/IML matrix. Prior to SAS/IML 15.1, you had to use the CHAR or

Read More

Learn SAS | Programming Tips

Rick Wicklin | July 29, 2019

Vectorize the computation of the Mandelbrot set in a matrix language

When my colleague, Robert Allison, blogged about visualizing the Mandelbrot set, I was reminded of a story from the 1980s, which was the height of the fractal craze. A research group in computational mathematics had been awarded a multimillion-dollar grant to purchase a supercomputer. When the supercomputer arrived and got

Read More

Learn SAS | Programming Tips

Rick Wicklin | July 24, 2019

Implement the Gumbel distribution in SAS

SAS supports more than 25 common probability distributions for the PDF, CDF, QUANTILE, and RAND functions. Of course, there are infinitely many distributions, so not every possible distribution is supported. If you need a less-common distribution, I've shown how to extend the functionality of Base SAS (by using PROC FCMP)

Read More