The EFFECT statement is supported by more than a dozen SAS/STAT regression procedures. Among other things, it enables you to generate spline effects that you can use to fit nonlinear relationships in data. Recently there was a discussion on the SAS Support Communities about how to interpret the parameter estimates

# Author

I recently wrote about how to use PROC TTEST in SAS/STAT software to compute the geometric mean and related statistics. This prompted a SAS programmer to ask a related question. Suppose you have dozens (or hundreds) of variables and you want to compute the geometric mean of each. What is

In a previous article, I mentioned that the VLINE statement in PROC SGPLOT is an easy way to graph the mean response at a set of discrete time points. I mentioned that you can choose three options for the length of the "error bars": the standard deviation of the data,

It is always great to read an old paper or blog post and think, "This task is so much easier in SAS 9.4!" I had that thought recently when I stumbled on a 2007 paper by Wei Cheng titled "Graphical Representation of Mean Measurement over Time." A substantial portion of

I frequently see questions on SAS discussion forums about how to compute the geometric mean and related quantities in SAS. Unfortunately, the answers to these questions are sometimes confusing or even wrong. In addition, some published papers and web sites that claim to show how to calculate the geometric mean

There are several different kinds of means. They all try to find an average value from among a set of numbers. Although the most popular mean is the arithmetic mean, the geometric mean can be useful for problems in statistics, finance, and biology. A common application of the geometric mean

One of the strengths of the SAS/IML language is its flexibility. Recently, a SAS programmer asked how to generalize a program in a previous article. The original program solved one optimization problem. The reader said that she wants to solve this type of problem 300 times, each time using a

Although I do not typically blog about undocumented SAS options, I'll make an exception this time. For many years, I have known that the CONTENTS and COMPARE procedures support the BRIEF and SHORT options, but I always forget which option goes with which procedure. For the record, here are the

In a scatter plot that displays many points, it can be important to visualize the density of the points. Scatter plots (indeed, all plots that show individual markers) can suffer from overplotting, which means that the graph does not indicate how many observations are at a specific (x, y) location.

A moving average is a statistical technique that is used to smooth a time series. My colleague, Cindy Wang, wrote an article about the Hull moving average (HMA), which is a time series smoother that is sometimes used as a technical indicator by stock market traders. Cindy showed how to

I often use axis tables in PROC SGPLOT in SAS to add a table of text to a graph so that the table values are aligned with the data. But axis tables are not the only way to display tabular data in a graph. You can also use the TEXT

The TEXT statement in PROC SGPLOT supports the ROTATE= option to rotate the specified text. It is worth knowing how the ROTATE= option interacts with the POSITION= option, which determines the anchor point at which the text is positioned. Briefly, the text is positioned FIRST, then the rotation occurs. The

When you order an item online, the website often recommends other items based on your purchase. In fact, these kinds of "recommendation engines" contributed to the early success of companies like Amazon and Netflix. SAS uses a recommender engine to suggest articles on the SAS Support Communities. Although recommender engines

An important application of the dot product (inner product) of two vectors is to determine the angle between the vectors. If u and v are two vectors, then cos(θ) = (u ⋅ v) / (|u| |v|) You could apply the inverse cosine function if you wanted to find θ in

A SAS programmer asked an intriguing question on the SAS Support Communities: Can you use SAS to create a graph that shows how the elements in a box-and-whiskers plot relate to the data? The SAS documentation has several examples that explain how to read a box plot. One of the

Most SAS programmers know how to use PROC APPEND or the SET statement in DATA step to unconditionally append new observations to an existing data set. However, sometimes you need to scan the data to determine whether or not to append observations. In this situation, many SAS programmers choose one

An important application of nonlinear optimization is finding parameters of a model that fit data. For some models, the parameters are constrained by the data. A canonical example is the maximum likelihood estimation of a so-called "threshold parameter" for the three-parameter lognormal distribution. For this distribution, the objective function is

One of my friends likes to remind me that "there is no such thing as a free lunch," which he abbreviates by "TINSTAAFL" (or TANSTAAFL). The TINSTAAFL principle applies to computer programming because you often end up paying a cost (in performance) when you call a convenience function that simplifies

Many programmers are familiar with "short-circuit" evaluation in an IF-THEN statement. Short circuit means that a program does not evaluate the remainder of a logical expression if the value of the expression is already logically determined. The SAS DATA step supports short-circuiting for simple logical expressions in IF-THEN statements and

What is this math good for, anyway? –Every student, everywhere I am a professional applied mathematician, yet many of the mathematical and statistical techniques that I use every day are not from advanced university courses but are based on simple ideas taught in high school or even in grade school.

Do you want to bin a numeric variable into a small number of discrete groups? This article compiles a dozen resources and examples related to binning a continuous variable. The examples show both equal-width binning and quantile binning. In addition to standard one-dimensional techniques, this article also discusses various techniques

Binning transforms a continuous numerical variable into a discrete variable with a small number of values. When you bin univariate data, you define cut point that define discrete groups. I've previously shown how to use PROC FORMAT in SAS to bin numerical variables and give each group a meaningful name

Sometimes a little thing can make a big difference. I am enjoying a new enhancement of SAS/IML 15.1, which enables you to use a numeric vector as the column header or row header when you print a SAS/IML matrix. Prior to SAS/IML 15.1, you had to use the CHAR or

When my colleague, Robert Allison, blogged about visualizing the Mandelbrot set, I was reminded of a story from the 1980s, which was the height of the fractal craze. A research group in computational mathematics had been awarded a multimillion-dollar grant to purchase a supercomputer. When the supercomputer arrived and got

SAS supports more than 25 common probability distributions for the PDF, CDF, QUANTILE, and RAND functions. Of course, there are infinitely many distributions, so not every possible distribution is supported. If you need a less-common distribution, I've shown how to extend the functionality of Base SAS (by using PROC FCMP)

Is 4 an extreme value for the standard normal distribution? In high school, students learn the famous 68-95-99.7 rule, which is a way to remember that 99.7 percent of random observation from a normal distribution are within three standard deviations from the mean. For the standard normal distribution, the probability

In the SAS/IML language, a matrix contains data of one type: numeric or character. If you want to create a SAS data set that contains mixed-type data (numeric and character), SAS/IML 15.1 provides support to write multiple matrices to a data set by using a single statement. Specifically, the CREATE

Heat maps have many uses. You can use a heat map to visualize correlation matrices, to visualize longitudinal data ("lasagna plots"), and to visualize counts in any two-dimensional table. As of SAS 9.4m3, you can create heat maps in SAS by using the HEATMAP and HEATMAPPARM statements in PROC SGPLOT.

I recently showed how to create an annotation data set that will overlay cell counts or percentages on a mosaic plot. A mosaic plot is a visual representation of a cross-tabulation of observed frequencies for two categorical variables. The mosaic plot with cell counts is shown to the right. The

The mosaic plot is a graphical visualization of a frequency table. In previous articles, I showed how to create a mosaic plot in SAS by using PROC FREQ and how to define a template in the Graph Template Language (GTL) by using the MOSAICPARM statement. This article shows how to