Rick Wicklin
Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

Analytics | Data Visualization

Rick Wicklin | December 13, 2021

A principal component analysis of color palettes

In a previous article, I visualized seven Christmas-themed palettes of colors, as shown to the right. You can see that the palettes include many red, green, and golden colors. Clearly, the colors in the Christmas palettes are not a random sample from the space of RGB colors. Rather, they represent

Data Visualization | Learn SAS

Rick Wicklin | December 8, 2021

Visualize palettes of colors in SAS

In data visualization, colors can represent the values of a variable in a choropleth map, a heatmap, or a scatter plot. But how do you visualize a palette of colors from the RGB or hexadecimal values of the colors? One way is to use the HEATMAPDISC subroutine in SAS/IML, which

Analytics | Programming Tips

Rick Wicklin | December 6, 2021

The expected number of points on a convex hull

While a colleague and I were discussing how to compute convex hulls in SAS, we wondered how the size of the convex hull compares to the size of the sample. For most distributions of points, I claimed that the size of the convex hull is much less than the size of the

Analytics | Learn SAS

Rick Wicklin | December 1, 2021

Beware of repeated values in loess models

Did you know that the loess regression algorithm is not well-defined when you have repeated values among the explanatory variables, and you request a very small smoothing parameter? This is because loess regression at the point x0 is based on using the k nearest neighbors to x0. If x0 has

Learn SAS | Programming Tips

Rick Wicklin | November 29, 2021

Caslibs and librefs in SAS Viya

When SAS 9 programmers transition to SAS Viya, there are inevitably questions about how new concepts in Cloud Analytic Services (CAS) relate to similar concepts in SAS. This article discusses the question, "What is the difference between a libref and a caslib?" Both are used to access data, but they

Learn SAS

Rick Wicklin | November 22, 2021

What is a CAS-enabled procedure?

I attended a seminar last week whose purpose was to inform SAS 9 programmers about SAS Viya. I could tell from the programmers' questions that some of them were confused about three basic topics: What are the computing environments in Viya, and how should a programmer think about them? What procedures

Data Visualization | Programming Tips

Rick Wicklin | November 17, 2021

The order of vertices on a convex polygon

In a previous article, I showed how to use the CVEXHULL function in SAS/IML to compute the convex hull of a finite set of planar points. The convex hull is a convex polygon, which is defined by its vertices. To visualize the polygon, you need to know the vertices in sequential

Data Visualization | Learn SAS | Programming Tips

Rick Wicklin | November 15, 2021

Two-dimensional convex hulls in SAS

Given a cloud of points in the plane, it can be useful to identify the convex hull of the points. The convex hull is the smallest convex set that contains the observations. For a finite set of points, it is a convex polygon that has some of the points as

Data Visualization | Learn SAS

Rick Wicklin | November 10, 2021

Create a frequency polygon in SAS

I was recently asked how to create a frequency polygon in SAS. A frequency polygon is an alternative to a histogram that shows similar information about the distribution of univariate data. It is the piecewise linear curve formed by connecting the midpoints of the tops of the bins. The graph

Analytics | Data Visualization

Rick Wicklin | November 8, 2021

The normal approximation and random samples of the binomial distribution

Recall that the binomial distribution is the distribution of the number of successes in a set of independent Bernoulli trials, each having the same probability of success. Most introductory statistics textbooks discuss the approximation of the binomial distribution by the normal distribution. The graph to the right shows that the

Data Visualization | Learn SAS

Rick Wicklin | November 3, 2021

Add reference lines to a bar chart in SAS

A SAS programmer asked whether it is possible to add reference lines to the categorical axis of a bar chart. The answer is yes. You can use the VBAR statement, but I prefer to use the VBARBASIC (or VBARPARM) statement, which enables you to overlay a wide variety of graphs

Analytics | Learn SAS

Rick Wicklin | November 1, 2021

Fit a mixture of Weibull distributions in SAS

A previous article discusses how to use SAS regression procedures to fit a two-parameter Weibull distribution in SAS. The article shows how to convert the regression output into the more familiar scale and shape parameters for the Weibull probability distribution, which are fit by using PROC UNIVARIATE. Although PROC UNIVARIATE

Analytics | Learn SAS

Rick Wicklin | October 27, 2021

Interpret estimates for a Weibull regression model in SAS

It can be frustrating when the same probability distribution has two different parameterizations, but such is the life of a statistical programmer. I previously wrote an article about the gamma distribution, which has two common parameterizations: one that uses a scale parameter (β) and another that uses a rate parameter

Analytics | Learn SAS

Rick Wicklin | October 20, 2021

An introduction to genetic algorithms in SAS

A genetic algorithm (GA) is a heuristic optimization technique. The method tries to mimic natural selection and evolution by starting with a population of random candidates. Candidates are evaluated for "fitness" by plugging them into the objective function. The characteristics of the better candidates are combined to create a new

Analytics | Programming Tips

Rick Wicklin | October 18, 2021

Crossover and mutation: An introduction to two operations in genetic algorithms

This article uses an example to introduce genetic algorithms (GAs) for optimization. It discusses two operators (mutation and crossover) that are important in implementing a genetic algorithm, and the choices that you must make when you implement these operations. Some programmers love using genetic algorithms. Genetic algorithms are heuristic

Analytics | Programming Tips

Rick Wicklin | October 13, 2021

Penalties versus constraints in optimization problems

Sometimes we can learn as much from our mistakes as we do from our successes. Recently, I needed to solve an optimization problem for which the solution vector was a binary vector subject to a constraint. I was in a hurry. Without thinking much about what I was doing, I

Learn SAS

Rick Wicklin | October 11, 2021

The knapsack problem: Binary integer programming in SAS/IML

Many optimization problems in statistics and machine learning involve continuous parameters. For example, maximum likelihood estimation involves optimizing a log-likelihood function over a continuous domain, possibly with constraints. Recently, however, I had to solve an optimization problem for which the solution vector was a 0/1 binary variable. To solve the

Learn SAS | Programming Tips

Rick Wicklin | October 6, 2021

Logical negation of vectors

In a matrix-vector language such as SAS/IML, it is useful to always remember that the fundamental objects are matrices and that all operations are designed to work on matrices. (And vectors, which are matrices that have only one row or one column.) By using matrix operations, you can often eliminate

Analytics | Learn SAS

Rick Wicklin | October 4, 2021

Choose samples with specified statistical properties

A reader asked whether it is possible to find a bootstrap sample that has some desirable properties. I am using the term "bootstrap sample" to refer to the result of randomly resampling with replacement from a data set. Specifically, he wanted to find a bootstrap sample that has a specific

Data Visualization | Learn SAS

Rick Wicklin | September 27, 2021

Why you should visualize distributions instead of report means

Graphing data is almost always more informative than displaying a table of summary statistics. In a recent article about "dynamite plots," I briefly mentioned that graphs such as box plots and strip plots are better at showing data than graphs that merely show the mean and standard deviation. This article

Analytics | Learn SAS | Programming Tips

Rick Wicklin | September 22, 2021

Use simulations to evaluate the accuracy of asymptotic results

The field of probability and statistics is full of asymptotic results. The Law of Large Numbers and the Central Limit Theorem are two famous examples. An asymptotic result can be both a blessing and a curse. For example, consider a result that says that the distribution of some statistic converges

Learn SAS | Programming Tips

Rick Wicklin | September 20, 2021

Add an item to a sublist

The SAS/IML language supports lists, which are containers that store other objects, such as matrices and other lists. A primary use of lists is to pack objects of various types into a single symbol that can be passed to and from modules. A useful feature of using lists is that

Analytics | Learn SAS | Programming Tips

Rick Wicklin | September 15, 2021

The partition problem: An optimization approach

I previously wrote about one way to solve the partition problem in SAS. In the partition problem, you divide (or partition) a set of N items into two groups of size k and N-k such that the sum of the items' weights is the same in each group. For example,

Analytics | Learn SAS | Programming Tips

Rick Wicklin | September 13, 2021

The partition problem

The partition problem has many variations, but recently I encountered it as an interactive puzzle on a computer. (Try a similar game yourself!) The player is presented with an old-fashioned pan-balance scale and a set of objects of different weights. The challenge is to divide (or partition) the objects into

Analytics | Data Visualization | Learn SAS

Rick Wicklin | September 9, 2021

Simulate proportions for groups

A statistical programmer asked how to simulate event-trials data for groups. The subjects in each group have a different probability of experiencing the event. This article describes one way to simulate this scenario. The simulation is similar to simulating from a mixture distribution. This article also shows three different ways

Data Visualization

Rick Wicklin | September 7, 2021

Remaking a panel of dynamite plots

A colleague spent a lot of time creating a panel of graphs to summarize some data. She did not use SAS software to create the graph, but I used SAS to create a simplified version of her graph, which is shown to the right. (The colors are from her graph.)

Analytics | Programming Tips

Rick Wicklin | September 1, 2021

On the number of bootstrap samples

The number of possible bootstrap samples for a sample of size N is big. Really big. Recall that the bootstrap method is a powerful way to analyze the variation in a statistic. To implement the standard bootstrap method, you generate B random bootstrap samples. A bootstrap sample is a sample

Analytics | Programming Tips

Rick Wicklin | August 30, 2021

Bootstrap correlation coefficients in SAS

You can use the bootstrap method to estimate confidence intervals. Unlike formulas, which assume that the data are drawn from a specified distribution (usually the normal distribution), the bootstrap method does not assume a distribution for the data. There are many articles about how to use SAS to bootstrap statistics

Learn SAS | Programming Tips

Rick Wicklin | August 25, 2021

Convert a symmetric matrix from wide to long form

For graphing multivariate data, it is important to be able to convert the data between "wide form" (a separate column for each variable) and "long form" (which contains an indicator variable that assigns a group to each observation). If the data are numeric, the wide data can be represented as
