The DO Loop

Prediction regions for a classification problem with two outcomes. Graph created in SAS.

Rick Wicklin July 17, 2017 0

3 ways to visualize prediction regions for classification problems

An important problem in machine learning is the "classification problem." In this supervised learning problem, you build a statistical model that predicts a set of categorical outcomes (responses) based on a set of input features (explanatory variables). You do this by training the model on data for which the outcomes

Advanced Analytics | Programming Tips

Rick Wicklin July 12, 2017 36

The bias-corrected and accelerated (BCa) bootstrap interval

I recently showed how to compute a bootstrap percentile confidence interval in SAS. The percentile interval is a simple "first-order" interval that is formed from quantiles of the bootstrap distribution. However, it has two limitations. First, it does not use the estimate for the original data; it is based only

Programming Tips

Rick Wicklin July 10, 2017 10

Bootstrap estimates in SAS/IML

I previously wrote about how to compute a bootstrap confidence interval in Base SAS. As a reminder, the bootstrap method consists of the following steps: (1) Compute the statistic of interest for the original data. (2) Resample B times from the data to form B bootstrap samples. B is usually a large

Analytics | Programming Tips

Rick Wicklin July 5, 2017 7

Test for the equality of two proportions in SAS

A SAS customer asked how to use SAS to conduct a Z test for the equality of two proportions. He was directed to the SAS Usage Note "Testing the equality of two or more proportions from independent samples." The note says to "specify the CHISQ option in the TABLES statement
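
For readers who want to see the arithmetic behind a two-proportion Z test, here is a minimal sketch in Python rather than SAS. It assumes the pooled-variance form of the test and a two-sided alternative. The function name and the counts (60 of 100 versus 40 of 100) are made up for illustration and do not come from the Usage Note.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided Z test for H0: p1 = p2 (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled estimate of the common proportion under the null hypothesis
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Illustrative counts only: 60 successes of 100 versus 40 of 100
z, p = two_proportion_z_test(60, 100, 40, 100)
print(f"z = {z:.4f}, z^2 = {z * z:.4f}, two-sided p = {p:.4f}")
```

For a 2 x 2 table, the square of this pooled Z statistic equals the Pearson chi-square statistic, which is why a chi-square test and the Z test lead to the same two-sided conclusion.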

Programming Tips

t test for difference between group means in SAS

Rick Wicklin July 3, 2017 6

Summary statistics and t tests in SAS

Students in introductory statistics courses often use summary statistics (such as sample size, mean, and standard deviation) to test hypotheses and to compute confidence intervals. Did you know that you can provide summary statistics (rather than raw data) to PROC TTEST in SAS and obtain hypothesis tests and confidence intervals?
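
To show why summary statistics are enough, here is a minimal sketch in Python (not PROC TTEST syntax) of a pooled two-sample t statistic computed from sample sizes, means, and standard deviations alone. The function name and the numbers are hypothetical, and the sketch assumes equal variances in the two groups.

```python
import math

def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance two-sample t statistic from summary statistics (hypothetical helper)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between the two sample means
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Illustrative values only: two groups of 10 with means 5 and 3, both with SD 2
t, df = pooled_t_from_summary(10, 5.0, 2.0, 10, 3.0, 2.0)
print(f"t = {t:.4f} with {df} degrees of freedom")
```

The raw observations never enter the calculation. The statistic depends only on the three summary values for each group.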

Programming Tips

Rick Wicklin June 28, 2017 0

The average bootstrap sample omits 36.8% of the data

Suppose you roll six identical six-sided dice. Chances are that you will see at least one repeated number. The probability that you will see six unique numbers is very small: only 6! / 6^6 ≈ 0.015. This example can be generalized. If you draw a random sample with replacement from
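
The dice calculation and the 36.8% figure in the title can be checked with a few lines of Python. This is a minimal sketch with hypothetical function names. It relies on the standard fact that, in n draws with replacement from n items, each item is missed with probability (1 - 1/n)^n, which approaches 1/e as n grows.

```python
import math

def prob_all_unique(n):
    """Probability that n draws with replacement from n items are all distinct: n! / n^n."""
    return math.factorial(n) / n**n

def expected_fraction_omitted(n):
    """Expected fraction of n items that never appear in n draws with replacement."""
    return (1 - 1 / n) ** n

print(f"P(six unique faces) = {prob_all_unique(6):.4f}")
for n in (6, 100, 10_000):
    print(f"n = {n}: expected fraction omitted = {expected_fraction_omitted(n):.4f}")
print(f"limit 1/e = {math.exp(-1):.4f}")
```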
