About this blog
Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of PROC IML and SAS/IML Studio. This blog focuses on statistical programming. It discusses statistical and computational algorithms, statistical graphics, simulation, efficiency, and data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
Follow @RickWicklin on Twitter.
Tags9.3 9.4 9.22 12.1 12.3 13.1 13.2 14.1 Bootstrap and Resampling Ciphers Conferences Data Analysis Efficiency File Exchange Getting Started GTL Heat maps History IMLPlus Just for Fun Math Matrix Computations Missing Data Numerical Analysis Optimization Packages pi day R Reading and Writing Data SAS/IML Studio SAS Global Forum SAS Programming Simulation Spatial Data sports analytics Statistical Graphics Statistical Programming Statistical Thinking Strings Time series Tips and Techniques vectorization Video
Subscribe to this blog
Last week I showed how to use the simple bootstrap to randomly resample from the data to create B bootstrap samples, each containing N observations. The simple bootstrap is equivalent to sampling from the empirical cumulative distribution function (ECDF) of the data. An alternative bootstrap technique is called the smooth […]Post a Comment
A common question is "how do I compute a bootstrap confidence interval in SAS?" As a reminder, the bootstrap method consists of the following steps: Compute the statistic of interest for the original data Resample B times from the data to form B bootstrap samples. How you resample depends on […]Post a Comment
Many simulation and resampling tasks use one of four sampling methods. When you draw a random sample from a population, you can sample with or without replacement. At the same time, all individuals in the population might have equal probability of being selected, or some individuals might be more likely […]Post a Comment
How do you sample with replacement in SAS when the probability of choosing each observation varies? I was asked this question recently. The programmer thought he could use PROC SURVEYSELECT to generate the samples, but he wasn't sure which sampling technique he should use to sample with unequal probability. This […]Post a Comment
My colleagues at the SAS & R blog recently posted an example of how to program a permutation test in SAS and R. Their SAS implementation used Base SAS and was "relatively cumbersome" (their words) when compared with the R code. In today's post I implement the permutation test in […]Post a Comment
Bootstrap methods and permutation tests are popular and powerful nonparametric methods for testing hypotheses and approximating the sampling distribution of a statistic. I have described a SAS/IML implementation of a bootstrap permutation test for matched pairs of data (an alternative to a matched-pair t test) in my paper "Modern Data […]Post a Comment
Last week I showed three ways to sample with replacement in SAS. You can use the SAMPLE function in SAS/IML 12.1 to sample from a finite set or you can use the DATA step or PROC SURVEYSELECT to extract a random sample from a SAS data set. Sampling without replacement […]Post a Comment
Randomly choosing a subset of elements is a fundamental operation in statistics and probability. Simple random sampling with replacement is used in bootstrap methods (where the technique is called resampling), permutation tests and simulation. Last week I showed how to use the SAMPLE function in SAS/IML software to sample with […]Post a Comment
With each release of SAS/IML software, the language provides simple ways to carry out tasks that previously required more effort. In 2010 I blogged about a SAS/IML module that appeared in my book Statistical Programming with SAS/IML Software, which was written by using the SAS/IML 9.2. The blog post showed […]Post a Comment
A challenge for statistical programmers is getting data into the right form for analysis. For graphing or analyzing data, sometimes the "wide format" (each subject is represented by one row and many variables) is required, but other times the "long format" (observations for each subject span multiple rows) is more […]Post a Comment