Blogs

Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

Rick Wicklin | October 6, 2014

Convert hexadecimal colors to RGB

In response to my recent post about how to use the PALETTE function in SAS/IML to generate color ramps, a reader wrote the following: The PALETTE function returns an array of hexadecimal values such as CXF03B20. For those of us who think about colors as RGB values, is there an

Rick Wicklin | October 3, 2014

Which double letters appear most frequently in English text?

Double, double toil and trouble; Fire burn, and caldron bubble. – Macbeth, Act IV, Scene I

For the cryptanalyst or recreational puzzle solver, "double double" does not lead to toil or trouble. Just the opposite: The occurrence of a double-letter bigram in an enciphered word puzzle is quite fortunate. Certain double

Rick Wicklin | October 1, 2014

How to choose colors for maps and heat maps

Have you ever looked at a statistical graph that uses bright garish colors and thought, "Why in the world did that guy choose those awful colors?" Don't be "that guy"! Your choice of colors for a graph can make a huge difference in how well your visualization is perceived by

Rick Wicklin | September 29, 2014

Create discrete heat maps in SAS/IML

In a previous article I introduced the HEATMAPCONT subroutine in SAS/IML 13.1, which makes it easy to visualize matrices by using heat maps with continuous color ramps. This article introduces a companion subroutine. The HEATMAPDISC subroutine, which also requires SAS/IML 13.1, is designed to visualize matrices that have a small

Rick Wicklin | September 26, 2014

The frequency of bigrams in an English corpus

In last week's article about the distribution of letters in an English corpus, I presented research results by Peter Norvig who used Google's digitized library and tabulated the frequency of each letter. Norvig also tabulated the frequency of bigrams, which are pairs of letters that appear consecutively within a word.

Rick Wicklin | September 24, 2014

Designing a quantile bin plot

While at JSM 2014 in Boston, a statistician asked me whether it was possible to create a "customized bin plot" in SAS. When I asked for more information, she told me that she has a large data set. She wants to visualize the data, but a scatter plot is not

Rick Wicklin | September 22, 2014

Skew this

The skewness of a distribution indicates whether a distribution is symmetric or not. A distribution that is symmetric about its mean has zero skewness. In contrast, if the right tail of a unimodal distribution has more mass than the left tail, then the distribution is said to be "right skewed"

Rick Wicklin | September 19, 2014

The frequency of letters in an English corpus

It's time for another blog post about ciphers. As I indicated in my previous blog post about substitution ciphers, the classical substitution cipher is no longer used to encrypt ultra-secret messages because the enciphered text is prone to a type of statistical attack known as frequency analysis. At the root

Rick Wicklin | September 17, 2014

Read from one data set and write to another with SAS/IML

Many people know that the SAS/IML language enables you to read data from and write results to multiple SAS data sets. When you open a new data set, it is a good programming practice to close the previous data set. But did you know that you can have two data

Rick Wicklin | September 15, 2014

Handling run-time errors in user-defined modules

I received the following email from a SAS/IML programmer: "I am getting an error in a PROC IML module that I wrote. The SAS Log says 'NOTE: Paused in module NAME'. When I submit other commands, PROC IML doesn't seem to understand them. How can I continue the program?" The

Rick Wicklin | September 10, 2014

An exploratory technique for visualizing the distributions of 100 variables

In a previous blog post I showed how to order a set of variables by a statistic. After reshaping data, you can create a graph that contains box plots for many variables. Ordering the variables by some statistic (mean, median, variance,...) helps to differentiate and distinguish the variables. You can

Rick Wicklin | September 8, 2014

Order variables by values of a statistic

When I create a graph of data that contains a categorical variable, I rarely want to display the categories in alphabetical order. For example, the box plot to the left is a plot of 10 standardized variables where the variables are ordered by their median value. The ordering makes it

Rick Wicklin | September 4, 2014

Ciphers, keys, and cryptoquotes

Today is my fourth blog-iversary: the anniversary of my first blog post in 2010. To celebrate, I am going to write a series of fun posts based on The Code Book by Simon Singh, a fascinating account of the history of cryptography from ancient times until the present. While reading

Rick Wicklin | September 2, 2014

How to create a hexagonal bin plot in SAS

While I was working on my recent blog post about two-dimensional binning, a colleague asked whether I would be discussing "the new hexagonal binning method that was added to the SURVEYREG procedure in SAS/STAT 13.2." I was intrigued: I was not aware that hexagonal binning had been added to a

Rick Wicklin | August 27, 2014

Counting observations in two-dimensional bins

Last Monday I discussed how to choose the bin width and location for a histogram in SAS. The height of each histogram bar shows the number of observations in each bin. Although my recent article didn't mention it, you can also use the IML procedure to count the number of

Learn SAS

Rick Wicklin | August 25, 2014

Choosing bins for histograms in SAS

When you create a histogram with statistical software, the software uses the data (including the sample size) to automatically choose the width and location of the histogram bins. The resulting histogram is an attempt to balance statistical considerations, such as estimating the underlying density, and "human considerations," such as choosing

Rick Wicklin | August 22, 2014

Analyzing activity-tracker data: How many steps per day do YOU take?

My wife got one of those electronic activity trackers a few months ago and has been diligently walking every day since then. At the end of the day she sometimes reads off how many steps she walked, as measured by her activity tracker. I am always impressed at how many

Rick Wicklin | August 20, 2014

Creating heat maps in SAS/IML

In a previous blog post, I showed how to use the graph template language (GTL) in SAS to create heat maps with a continuous color ramp. SAS/IML 13.1 includes the HEATMAPCONT subroutine, which makes it easy to create heat maps with continuous color ramps from SAS/IML matrices. Typical usage includes

Rick Wicklin | August 18, 2014

Creating a basic heat map in SAS

Heat maps have many uses. In a previous article, I showed how to use heat maps with a discrete color ramp to visualize matrices that have a small number of unique values, such as certain covariance matrices and sparse matrices. You can also use heat maps with a continuous color

Rick Wicklin | August 13, 2014

Guiding numerical integration: The PEAK= option in the SAS/IML QUAD subroutine

One of the things I enjoy about blogging is that I often learn something new. Last week I wrote about how to optimize a function that is defined in terms of an integral. While developing the program in the article, I made some mistakes that generated SAS/IML error messages. By

Learn SAS

Rick Wicklin | August 11, 2014

Tips for learning the SAS/IML language

A SAS customer wrote, "I have access to PROC IML through SAS OnDemand for Academics. What is the best way for me to learn to program in the SAS/IML language? How do I get started with PROC IML?" That is an excellent question, and I'm happy to offer some suggestions.

Rick Wicklin | August 6, 2014

Define an objective function that evaluates an integral in SAS

The SAS/IML language is used for many kinds of computations, but three important numerical tasks are integration, optimization, and root finding. Recently a SAS customer asked for help with a problem that involved all three tasks. The customer had an objective function that was defined in terms of an integral.

Rick Wicklin | August 5, 2014

Stigler's seven pillars of statistical wisdom

Wisdom has built her house; She has hewn out her seven pillars. – Proverbs 9:1

At the 2014 Joint Statistical Meetings in Boston, Stephen Stigler gave the ASA President's Invited Address. In forty short minutes, Stigler laid out his response to the age-old question "What is statistics?" His answer was

Rick Wicklin | August 4, 2014

Reversing the limits of integration in SAS

In SAS software, you can use the QUAD subroutine in the SAS/IML language to evaluate definite integrals on an interval [a, b]. The integral is properly defined only for a < b, but mathematicians define the following convention, which enables you to make sense of reversing the limits of integration:

Learn SAS

Rick Wicklin | July 30, 2014

Overview of new features in SAS/IML 12.3

Unless you diligently read the "What's New" chapter for each release of SAS software, it is easy to miss new features that appear in the language. People who have been writing SAS/IML programs for decades are sometimes surprised when I tell them about a useful new function or programming feature.

Advanced Analytics

Rick Wicklin | July 28, 2014

Lexicographic combinations in SAS

In a previous blog post, I described how to generate combinations in SAS by using the ALLCOMB function in SAS/IML software. The ALLCOMB function in Base SAS is the equivalent function for DATA step programmers. Recall that a combination is a unique arrangement of k elements chosen from a set

Rick Wicklin | July 23, 2014

Computing prediction ellipses from a covariance matrix

In a previous blog post, I showed how to overlay a prediction ellipse on a scatter plot in SAS by using the ELLIPSE statement in PROC SGPLOT. The ELLIPSE statement draws the ellipse by using a standard technique that assumes the sample is bivariate normal. Today's article describes the technique

Rick Wicklin | July 21, 2014

Add a prediction ellipse to a scatter plot in SAS

It is common in statistical graphics to overlay a prediction ellipse on a scatter plot. This article describes two easy ways to overlay prediction ellipses on a scatter plot by using SAS software. It also describes how to overlay multiple prediction ellipses for subpopulations. What is a prediction ellipse? A

Rick Wicklin | July 18, 2014

How to create and detect an empty matrix

An empty matrix is a matrix that has zero rows and zero columns. At first "empty matrix" sounds like an oxymoron, but when programming in a matrix language such as SAS/IML, empty matrices arise surprisingly often. Sometimes empty matrices occur because of a typographical error in your program. If you

Advanced Analytics | Learn SAS | Programming Tips

Rick Wicklin | July 16, 2014

How to share SAS/IML programs with the world

Have you written a SAS/IML program that you think is particularly clever? Are you the proud author of SAS/IML functions that extend the functionality of SAS software? You've worked hard to develop, debug, and test your program, so why not share it with others? There is now a location for
