This article shows how to perform an optimization in SAS when the parameters are restricted by nonlinear constraints. In particular, it solves an optimization problem where the parameters are constrained to lie in the annular region between two circles. The end of the article shows the path of partial solutions

# Author

My colleague, Robert Allison, recently published an interesting visualization of the relationship between chess ratings and age. His post was inspired by the article "Age vs Elo — Your battle against time," which was published on the chess.com website. ("Elo" is one of the rating systems in chess.) Robert Allison's

The SGPLOT procedure in SAS makes it easy to create graphs that overlay various groups in the data. Many statements support the GROUP= option, which specifies that the graph should overlay group information. For example, you can create side-by-side bar charts and box plots, and you can overlay multiple scatter

This article shows how to score (evaluate) a quantile regression model on new data. SAS supports several procedures for quantile regression, including the QUANTREG, QUANTSELECT, and HPQUANTSELECT procedures. The first two procedures do not support any of the modern methods for scoring regression models, so you must use the "missing

When you use a regression procedure in SAS that supports variable selection (GLMSELECT or QUANTSELECT), did you know that the procedures automatically produce a macro variable that contains the names of the selected variables? This article provides examples and details. A previous article provides an overview of the 'SELECT' procedures

A programmer recently asked a question on a SAS discussion forum about design matrices for categorical variables. He had generated a design matrix by using PROC GLMMOD and wanted to use the design columns in a subsequent procedure. However, the columns were named COL1, COL2, COL3,..., so he couldn't tell

Back in SAS 9.3M2 (SAS/STAT 12.1), PROC FREQ introduced mosaic plots to visualize the joint frequencies in a contingency table. By default, the cells in a mosaic plot are colored according to levels of one of the categorical variables in the analysis. However, in 2013 I showed how you can

Since the late 1990s, SAS has supplied macros for basic bootstrap and jackknife analyses. This article provides an example that shows how to use the %BOOT and %BOOTCI macros. The %BOOT macro generates a bootstrap distribution and computes basic statistics about the bootstrap distribution, including estimates of bias, standard error,

This article shows how to implement balanced bootstrap sampling in SAS. The basic bootstrap samples with replacement from the original data (N observations) to obtain B new samples. This is called "uniform" resampling because each observation has a uniform probability of 1/N of being selected at each step of the

My colleague Robert Allison recently blogged about using the diameter of Texas as a unit of measurement. The largest distance across Texas is about 801 miles, so Robert wanted to find the set of all points such that the distance from the point to Texas is less than or equal

In a previous article, I showed how to find the intersection (if it exists) between two line segments in the plane. There are some fun problems in probability theory that involve intersections of line segments. One is "What is the probability that two randomly chosen chords of a circle intersect?"

Back in high school, you probably learned to find the intersection of two lines in the plane. The intersection requires solving a system of two linear equations. There are three cases: (1) the lines intersect in a unique point, (2) the lines are parallel and do not intersect, or (3)

SAS enables you to evaluate a regression model at any location within the range of the data. However, sometimes you might be interested in how the predicted response is increasing or decreasing at specified locations. You can use finite differences to compute the slope (first derivative) of a regression model.

Which president of the United States is ranked the greatest by presidential historians? This article visualizes the results of the 2018 Presidential Greatness Survey, which was created and administered by B. Rottinghaus and J. Vaughn. They analyzed 166 responses from experts in political science who ranked the 44 US presidents

This article describes how to obtain an initial guess for nonlinear regression models, especially nonlinear mixed models. The technique is to first fit a simpler fixed-effects model by replacing the random effects with their expected values. The parameter estimates for the fixed-effects model are often good initial guesses for the

When you fit nonlinear fixed-effect or mixed models, it is difficult to guess the model parameters that fit the data. Yet, most nonlinear regression procedures (such as PROC NLIN and PROC NLMIXED in SAS) require that you provide a good guess! If your guess is not good, the fitting algorithm,

A previous article provides an example of using the BOOTSTRAP statement in PROC TTEST to compute bootstrap estimates of statistics in a two-sample t test. The BOOTSTRAP statement is new in SAS/STAT 14.3 (SAS 9.4M5). However, you can perform the same bootstrap analysis in earlier releases of SAS by using

Bootstrap resampling is a powerful way to estimate the standard error for a statistic without making any parametric assumptions about its sampling distribution. The bootstrap method is often implemented by using a sequence of calls to resample from the data, compute a statistic on each sample, and analyze the bootstrap

My 2018 SAS Global Forum paper was about "how to use the random-number generators (RNGs) in SAS." You can read the paper for details, but I recently recorded a short video that summarizes the main ideas in the paper. In particular, the video gives an overview of the new RNGs

If you use PROC SGPLOT to create ODS graphics, "ATTRS" are everywhere. ATTRS is an abbreviation of "attributes." Most options that change the attributes of a graphical element end with the ATTRS suffix. For example, the MARKERATTRS option modifies attributes of markers, the LINEATTRS option modifies attributes of lines, and

In SAS, the reserved keyword _NULL_ specifies a SAS data set that has no observations and no variables. When you specify _NULL_ as the name of an output data set, the output is not written. The _NULL_ data set is often used when you want to execute DATA step code

I recently recorded a short video about the new syntax for specifying and manipulating lists in SAS/IML 14.3. This is a video of my Super Demo at SAS Global Forum 2018. The new syntax supports dynamic arrays, associative arrays ("named lists"), and hierarchical data structures such as lists of lists.

The SURVEYSELECT procedure in SAS 9.4M5 supports the OUTRANDOM option, which causes the selected items in a simple random sample to be randomly permuted after they are selected. This article describes several statistical tasks that benefit from this option, including simulating card games, randomly permuting observations in a DATA step,

Years ago, I wrote an article about how to create a Top 10 table and bar chart. The program can be trivially modified to create a "Top N" table and plot, such as Top 5, Top 20, or even Top 100. Not long after the article was written, the developer

A previous article showed how to use a calibration plot to visualize the goodness-of-fit for a logistic regression model. It is common to overlay a scatter plot of the binary response on a predicted probability plot (below, left) and on a calibration plot (below, right): The SAS program that creates

The SAS language provides syntax that enables you to quickly specify a list of variables. SAS statements that accept variable lists include the KEEP and DROP statements, the ARRAY statement, and the OF operator for comma-separated arguments to some functions. You can also use variable lists on the VAR statements

This article shows how to construct a butterfly plot in SAS. A butterfly plot (also called a butterfly chart) is a comparative bar chart or histogram that displays the distribution of a variable for two subpopulations. A butterfly plot for the cholesterol readings of 5,057 patients in a medical study

In a recent blog post, Chris Hemedinger used a scatter plot to show the result of 100 coin tosses. Chris arranged the 100 results in a 10 x 10 grid, where the first 10 results were shown on the first row, the second 10 were shown on the second row, and so

In my article about how to construct calibration plots for logistic regression models in SAS, I mentioned that there are several popular variations of the calibration plot. The previous article showed how to construct a loess-based calibration curve. Austin and Steyerberg (2013) recommend the loess-based curve on the basis of

A logistic regression model is a way to predict the probability of a binary response based on values of explanatory variables. It is important to be able to assess the accuracy of a predictive model. This article shows how to construct a calibration plot in SAS. A calibration plot is