Blogs

Tag: Statistical Programming

Programming Tips

Rick Wicklin | September 26, 2018

Radial basis functions and Gaussian kernels in SAS

A radial basis function is a scalar function that depends on the distance to some point, called the center point, c. One popular radial basis function is the Gaussian kernel φ(x; c) = exp(-||x - c||² / (2σ²)), which uses the squared distance from a vector x to the

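
The Gaussian kernel formula in the excerpt above can be sketched in a few lines. This is an illustrative Python version, not the SAS code from the article; the function name `gaussian_kernel` and the default `sigma=1.0` are assumptions made for the example.

```python
import math


def gaussian_kernel(x, c, sigma=1.0):
    """Evaluate phi(x; c) = exp(-||x - c||^2 / (2 sigma^2)).

    x and c are equal-length sequences of numbers; sigma is the
    bandwidth (hypothetical default of 1.0).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if len(x) != len(c):
        raise ValueError("x and c must have the same length")
    # Squared Euclidean distance from x to the center c.
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

The kernel equals 1 at the center and decays toward 0 as x moves away from c; a larger sigma makes the decay slower.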

Analytics | Data Visualization

Rick Wicklin | September 19, 2018

Shuffling smackdown: Overhand shuffle versus riffle shuffle

Every day I’m shufflin'. Shufflin', shufflin'. -- "Party Rock Anthem," LMFAO

The most popular way to mix a deck of cards is the riffle shuffle, which separates the deck into two pieces and interleaves the cards from each piece. Besides being popular with card players, the riffle shuffle is


Analytics

Rick Wicklin | September 12, 2018

Two interfaces for typing text by using a TV remote control

Have you ever tried to type a movie title by using a TV remote control? Both Netflix and Amazon Video provide an interface (a virtual keyboard) that enables you to use the four arrow keys of a standard remote control to type letters. The letters are arranged in a regular


Programming Tips

Visualization of L1 distance matrix for items arranged on a 6 x 6 grid

Rick Wicklin | September 10, 2018

Distances on rectangular grids

Given a rectangular grid with unit spacing, what is the expected distance between two random vertices, where distance is measured in the L1 metric? (Here "random" means "uniformly at random.") I recently needed this answer for some small grids, such as the one to the right, which is a 7 x 6

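
For small grids, the question in the excerpt above can be answered by brute force. The following Python sketch is not the article's SAS code. It assumes that the two vertices are drawn independently (so they may coincide) and that the arguments count vertices along each side; the function name `expected_l1_distance` is hypothetical.

```python
from itertools import product


def expected_l1_distance(n_rows, n_cols):
    """Average L1 distance between two vertices chosen independently
    and uniformly at random on an n_rows x n_cols grid of vertices
    with unit spacing."""
    vertices = list(product(range(n_rows), range(n_cols)))
    # Sum the L1 distance over every ordered pair of vertices,
    # including pairs in which both vertices are the same.
    total = sum(
        abs(r1 - r2) + abs(c1 - c2)
        for (r1, c1), (r2, c2) in product(vertices, repeat=2)
    )
    return total / len(vertices) ** 2
```

The enumeration visits every ordered pair, so the cost grows with the square of the number of vertices. That is acceptable for small grids but not for large ones.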

Programming Tips

Rick Wicklin | September 4, 2018

Store vectors of different lengths in a matrix

In the SAS/IML language, you can only concatenate vectors that have conforming dimensions. For example, to horizontally concatenate two vectors X and Y, the symbols X and Y must have the same number of rows. If not, the statement Z = X || Y will produce an error: ERROR: Matrices


Analytics

Rick Wicklin | August 29, 2018

Kernel regression in SAS

A SAS programmer recently asked me how to compute a kernel regression in SAS. He had read my blog posts "What is loess regression" and "Loess regression in SAS/IML" and was trying to implement a kernel regression in SAS/IML as part of a larger analysis. This article explains how to


Learn SAS | Programming Tips

Rick Wicklin | August 22, 2018

Standardized regression coefficients

A SAS programmer recently asked how to interpret the "standardized regression coefficients" as computed by the STB option on the MODEL statement in PROC REG and other SAS regression procedures. The SAS documentation for the STB option states, "a standardized regression coefficient is computed by dividing a parameter estimate by


Programming Tips

Rick Wicklin | August 20, 2018

Calculators killed the standard statistical table

Video killed the radio star.... We can't rewind, we've gone too far. -- The Buggles (1979)

"You kids have it easy," my father used to tell me. "When I was a kid, I didn't have all the conveniences you have today." He's right, and I could say the same


Data Visualization | Programming Tips

Rick Wicklin | August 8, 2018

Plot curves for levels of two categorical variables in SAS

The SGPLOT procedure in SAS makes it easy to create graphs that overlay various groups in the data. Many statements support the GROUP= option, which specifies that the graph should overlay group information. For example, you can create side-by-side bar charts and box plots, and you can overlay multiple scatter


Analytics | Programming Tips

Rick Wicklin | August 6, 2018

How to score and graph a quantile regression model in SAS

This article shows how to score (evaluate) a quantile regression model on new data. SAS supports several procedures for quantile regression, including the QUANTREG, QUANTSELECT, and HPQUANTSELECT procedures. The first two procedures do not support any of the modern methods for scoring regression models, so you must use the "missing


Analytics | Learn SAS

Rick Wicklin | August 1, 2018

Which variables are in the final selected model?

When you use a regression procedure in SAS that supports variable selection (GLMSELECT or QUANTSELECT), did you know that the procedures automatically produce a macro variable that contains the names of the selected variables? This article provides examples and details. A previous article provides an overview of the 'SELECT' procedures


Analytics | Programming Tips

Rick Wicklin | June 27, 2018

Reduced models: A way to choose initial parameters for a mixed model

This article describes how to obtain an initial guess for nonlinear regression models, especially nonlinear mixed models. The technique is to first fit a simpler fixed-effects model by replacing the random effects with their expected values. The parameter estimates for the fixed-effects model are often good initial guesses for the


Analytics

Rick Wicklin | June 25, 2018

Use a grid search to find initial parameter values for regression models in SAS

When you fit nonlinear fixed-effect or mixed models, it is difficult to guess the model parameters that fit the data. Yet, most nonlinear regression procedures (such as PROC NLIN and PROC NLMIXED in SAS) require that you provide a good guess! If your guess is not good, the fitting algorithm,


Analytics | Programming Tips

Rick Wicklin | June 20, 2018

The bootstrap method in SAS: A t test example

A previous article provides an example of using the BOOTSTRAP statement in PROC TTEST to compute bootstrap estimates of statistics in a two-sample t test. The BOOTSTRAP statement is new in SAS/STAT 14.3 (SAS 9.4M5). However, you can perform the same bootstrap analysis in earlier releases of SAS by using


Learn SAS | Programming Tips

Rick Wicklin | June 8, 2018

Video: A new syntax for lists in SAS/IML

I recently recorded a short video about the new syntax for specifying and manipulating lists in SAS/IML 14.3. This is a video of my Super Demo at SAS Global Forum 2018. The new syntax supports dynamic arrays, associative arrays ("named lists"), and hierarchical data structures such as lists of lists.


Programming Tips

Rick Wicklin | April 16, 2018

Random permutations without duplicates

A colleague and I recently discussed how to generate random permutations without encountering duplicates. Given a set of n items, there are n! permutations. My colleague wants to generate k unique permutations at random from among the total of n!. Said differently, he wants to sample without replacement from the

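
One simple way to draw k unique permutations is rejection: generate random permutations and keep only those not seen before. The Python sketch below illustrates that idea under stated assumptions and is not necessarily the approach taken in the article. The function name `unique_random_permutations` and the optional `rng` argument are hypothetical, and the items are assumed to be distinct and hashable.

```python
import math
import random


def unique_random_permutations(items, k, rng=None):
    """Return k distinct random permutations of items (as tuples)."""
    rng = rng or random.Random()
    items = list(items)
    if len(set(items)) != len(items):
        raise ValueError("items must be distinct")
    if k > math.factorial(len(items)):
        raise ValueError("k exceeds n!, the number of permutations")
    seen = set()
    result = []
    while len(result) < k:
        # random.sample with the full length returns a random permutation.
        perm = tuple(rng.sample(items, len(items)))
        if perm not in seen:  # reject permutations that were already drawn
            seen.add(perm)
            result.append(perm)
    return result
```

Rejection is efficient when k is small relative to n!, because repeats are then rare. As k approaches n!, most draws are rejected and a different strategy is preferable.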

Programming Tips

Rick Wicklin | April 4, 2018

Distance correlation

Correlation is a statistic that measures how closely two variables are related to each other. The most popular definition of correlation is the Pearson product-moment correlation, which is a measurement of the linear relationship between two variables. Many textbooks stress the linear nature of the Pearson correlation and emphasize that


Analytics | Data Visualization | Programming Tips

Euclidean and L1 distances between observations and a target value for standardized data

Rick Wicklin | March 28, 2018

Find the distances between observations and a target value

Suppose you want to find observations in multivariate data that are closest to a numerical target value. For example, for the students in the Sashelp.Class data set, you might want to find the students whose (Age, Height, Weight) values are closest to the triplet (13, 62, 100). The way to


Learn SAS | Programming Tips

Rick Wicklin | March 19, 2018

Compute with combinations: Maximize a function over combinations of variables

About once a month I see a question on the SAS Support Communities that involves what I like to call "computations with combinations." A typical question asks how to find k values (from a set of p values) that maximize or minimize some function, such as "I have 5 variables,


Analytics | Programming Tips

Rick Wicklin | March 7, 2018

Fit a distribution from quantiles

Data analysts often fit a probability distribution to data. When you have access to the data, a common technique is to use maximum likelihood estimation (MLE) to compute the parameters of a distribution that are "most likely" to have produced the observed data. However, how can you fit a distribution


Advanced Analytics

Sample from mixture distribution showing sample median

Rick Wicklin | February 21, 2018

A Monte Carlo algorithm to estimate a median

This article describes and implements a fast algorithm that estimates a median for very large samples. The traditional median estimate sorts a sample of size N and returns the middle value (when N is odd). The algorithm in this article uses Monte Carlo techniques to estimate the median much faster.


Analytics | Programming Tips

Quantiles are the solutions to the equation CDF(x)-p=0, where p is a probability

Rick Wicklin | February 19, 2018

Compute the quantiles of any distribution

Your statistical software probably provides a function that computes quantiles of common probability distributions such as the normal, exponential, and beta distributions. Because there are infinitely many probability distributions, you might encounter a distribution for which a built-in quantile function is not implemented. No problem! This article shows how to

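
The figure caption for this post notes that quantiles are the solutions to the equation CDF(x) - p = 0. A minimal Python sketch of that idea uses bisection as the root finder, which may differ from the method the article uses. It assumes a continuous, increasing CDF and a bracketing interval [lo, hi] supplied by the caller; the names `quantile` and `exponential_cdf` are hypothetical.

```python
import math


def quantile(cdf, p, lo, hi, tol=1e-10, max_iter=200):
    """Solve cdf(x) - p = 0 for x by bisection on the interval [lo, hi]."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if cdf(lo) > p or cdf(hi) < p:
        raise ValueError("[lo, hi] does not bracket the quantile")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid  # the root lies at or to the left of mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def exponential_cdf(x):
    """CDF of the standard exponential distribution (example only)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


median = quantile(exponential_cdf, 0.5, 0.0, 50.0)
```

For the standard exponential distribution the median is ln(2), so `median` should agree with `math.log(2)` to within the tolerance.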

Programming Tips

Rick Wicklin | January 24, 2018

Use lists to pass parameters to SAS/IML functions

A popular way to use lists in the SAS/IML language is to pack together several related matrices into a single data structure that can be passed to a function. Imagine that you have written an algorithm that requires a dozen different parameters. Historically, you would have to pass those parameters


Programming Tips

Rick Wicklin | January 22, 2018

Create lists by using a natural syntax in SAS/IML

SAS/IML 14.3 (SAS 9.4M5) introduced a new syntax for creating lists and for assigning and extracting items in a list. Lists (introduced in SAS/IML 14.2) are data structures that are convenient for holding heterogeneous data. A single list can hold character matrices, numeric matrices, scalar values, and other lists, as


Learn SAS | Programming Tips

Rick Wicklin | January 10, 2018

10 posts from 2017 that deserve a second look

Last week I wrote about the 10 most popular articles from The DO Loop in 2017. My most popular articles tend to be about elementary statistics or SAS programming tips. Less popular are the articles about advanced statistical and programming techniques. However, these technical articles fill an important niche. Not


Programming Tips

Histogram of data overlaid with a beta density curve, fitted by maximum likelihood estimation

Rick Wicklin | November 27, 2017

The method of moments: A smart way to choose initial parameters for MLE

When you run an optimization, it is often not clear how to provide the optimization algorithm with an initial guess for the parameters. A good guess converges quickly to the optimal solution whereas a bad guess might diverge or require many iterations to converge. Many people use a default value


Programming Tips

Beta-binomial cumulative distribution

Rick Wicklin | November 22, 2017

Compute the CDF and quantiles of discrete distributions

A statistical programmer read my article about the beta-binomial distribution and wanted to know how to compute the cumulative distribution (CDF) and the quantile function for this distribution. In general, if you know the PDF for a discrete distribution, you can also compute the CDF and quantile functions. This article

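
The general idea in the excerpt above, that the CDF and quantile function of a discrete distribution follow from its PDF, can be sketched as a running sum and a search. The Python example below is not the article's SAS code. It uses a binomial distribution as a stand-in (the article concerns the beta-binomial), and all function and variable names are hypothetical.

```python
import math
from itertools import accumulate


def binomial_pmf(k, n, prob):
    """Probability of k successes in n trials (stand-in example PDF)."""
    return math.comb(n, k) * prob ** k * (1 - prob) ** (n - k)


def discrete_cdf(pmf_values):
    """CDF at each support point: the running sum of the PDF values."""
    return list(accumulate(pmf_values))


def discrete_quantile(support, cdf_values, p):
    """Smallest support point x for which CDF(x) >= p."""
    for x, cum_prob in zip(support, cdf_values):
        if cum_prob >= p:
            return x
    return support[-1]  # guards against round-off when p is near 1


support = list(range(5))  # possible counts 0..4 for n = 4 trials
pmf_values = [binomial_pmf(k, 4, 0.5) for k in support]
cdf_values = discrete_cdf(pmf_values)
median = discrete_quantile(support, cdf_values, 0.5)
```

The quantile is defined here as the smallest support value whose cumulative probability reaches p, which is a common convention for discrete distributions.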

Learn SAS | Programming Tips

Rick Wicklin | November 15, 2017

Catch run-time errors in SAS/IML programs

Did you know that a SAS/IML function can recover from a run-time error? You can specify how to handle run-time errors by using a programming technique that is similar to the modern "try-catch" technique, although the SAS/IML technique is an older implementation.

Preventing errors versus handling errors

In general, SAS/IML


Programming Tips

The PAUSE statement as a debugging tool in SAS/IML Studio

Rick Wicklin | November 13, 2017

A tip for debugging SAS/IML modules: The PAUSE statement

Debugging is the bane of every programmer. SAS supports a DATA step debugger, but that debugger can't be used for debugging SAS/IML programs. In lieu of a formal debugger, many SAS/IML programmers resort to inserting multiple PRINT statements into a function definition. However, there is an easier way to query


Analytics | Learn SAS

Principal component regression in SAS: Loadings plot

Rick Wicklin | October 23, 2017

Principal component regression in SAS

A common question on discussion forums is how to compute a principal component regression in SAS. One reason people give for wanting to run a principal component regression is that the explanatory variables in the model are highly correlated with each other, a condition known as multicollinearity. Although principal component
