Today we celebrate The Do Loop!


Rick Wicklin, author of The Do Loop

By almost any measure, Rick Wicklin is the most prolific and most popular blogger at SAS. Author of The Do Loop, Rick has been writing and publishing posts at SAS since 2010, and today he publishes his 1,000th post on The Do Loop!

To celebrate, I want to highlight some of the posts from the history of The Do Loop.

Let's start by looking at Rick's very first post. I think you'll appreciate the programmer introduction.

Hello, World!

When programmers begin learning a new computer language, the first program they write is often one that prints the text “Hello, World!” Successfully writing a Hello World program assures the programmer that the software is successfully installed and that all necessary features are working: parsers, compilers, linkers, and so on. In fact, long before viral videos ruled the internet, many of us circulated a viral email that describes the evolution of a programmer in terms of Hello World programs.

 

Next let's look at the post that continues to be the most popular post on The Do Loop every year. Naturally, it concerns looping and how to loop in SAS.

Loops in SAS

Looping is essential to statistical programming. Whether you need to iterate over parameters in an algorithm or indices in an array, a loop is often one of the first programming constructs that a beginning programmer learns. Today is the first anniversary of this blog, which is named The DO Loop, so it seems appropriate to blog about DO loops in SAS. I'll describe looping in the SAS DATA step and compare it with looping in the SAS/IML language.

 

What about the most commented-on post? You might want to set aside some time to read all 170+ comments on this one.

Log transformations: How to handle negative data values?

The log transformation is one of the most useful transformations in data analysis. It is used as a transformation to normality and as a variance stabilizing transformation. A log transformation is often used as part of exploratory data analysis in order to visualize (and later model) data that ranges over several orders of magnitude. Common examples include data on income, revenue, populations of cities, sizes of things, weights of things, and so forth. (Remember, however, that you do not have to transform variables! Some people mistakenly believe that linear regression requires normally distributed variables. It does not!)

 

My next few categories are getting a bit subjective, but I think this one is possibly the most popular nerdy math post, and it tells a great story with great visuals.

The spiral of splatter

"Daddy, help! Help me! Come quick!" I heard my daughter's screams from the upstairs bathroom and bounded up the stairs two at a time. Was she hurt? Bleeding? Was the toilet overflowing? When I arrived in the doorway, she pointed at the wall and at the floor. The wall was splattered with black nail polish. On the floor lay a broken bottle in an expanding pool of black ooze. "It slipped," she sobbed.

 

The best guest post on The Do Loop was written by Rick's son for a science project about dryer balls.

Do dryer balls reduce drying time?

Hi! My name is David Wicklin. This blog post describes a research project and statistical graphs that I created for the 2013 ASA Poster Competition. My poster began one day when my sister saw a small box on a store shelf that contained two spiked plastic balls. The balls were about the size of tennis balls. The box said that you should put these "dryer balls" in your dryer with a load of wet laundry. The box claimed that the balls would "reduce drying time by up to 25%." I was skeptical. Could putting plastic balls in the dryer really reduce drying time?

 

What about the post with the most media attention? Of course, it involves M&Ms, but did you know it was written up in a Quartz article, "A statistician got curious about M&M colors and went on an endearingly geeky quest for answers"?

The distribution of colors for plain M&M candies

Many introductory courses in probability and statistics encourage students to collect and analyze real data. A popular experiment in categorical data analysis is to give students a bag of M&M® candies and ask them to estimate the proportion of colors in the population from the sample data. In some classes, the students are also asked to perform a chi-square analysis to test whether the colors are uniformly distributed or whether the colors match a hypothetical set of proportions.
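
As a rough illustration of the chi-square analysis described above, here is a minimal Python sketch that tests whether the colors in one bag are uniformly distributed. The counts are made up for this example and are not data from Rick's post, and the names `observed` and `chi_square_statistic` are hypothetical. The critical value 11.07 is the standard 5% cutoff for a chi-square distribution with 5 degrees of freedom.

```python
# Hypothetical color counts from one bag of plain M&Ms (invented for illustration).
observed = {
    "blue": 32,
    "orange": 26,
    "green": 22,
    "yellow": 14,
    "red": 13,
    "brown": 13,
}


def chi_square_statistic(counts, expected_proportions):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    total = sum(counts.values())
    statistic = 0.0
    for color, count in counts.items():
        expected = total * expected_proportions[color]
        statistic += (count - expected) ** 2 / expected
    return statistic


# Null hypothesis: every color is equally likely.
uniform = {color: 1 / len(observed) for color in observed}
statistic = chi_square_statistic(observed, uniform)

# 5% critical value for chi-square with (6 colors - 1) = 5 degrees of freedom.
CRITICAL_VALUE_5_DF = 11.07
reject_uniform = statistic > CRITICAL_VALUE_5_DF

print(f"chi-square statistic: {statistic:.2f}")
print(f"reject uniform colors at the 5% level: {reject_uniform}")
```

To test against a hypothetical set of proportions instead, pass a different dictionary of proportions (summing to 1) as the second argument.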

 

Next is Rick's post with the most referral traffic from Wikipedia. Many of its visitors arrive from the Wikipedia entry on the same topic.

What is Mahalanobis distance?

I previously described how to use Mahalanobis distance to find outliers in multivariate data. This article takes a closer look at Mahalanobis distance. A subsequent article will describe how you can compute Mahalanobis distance.

 

And finally, which post has the best headline is debatable, but I like this one.

Skew this

The skewness of a distribution indicates whether a distribution is symmetric or not. A distribution that is symmetric about its mean has zero skewness. In contrast, if the right tail of a unimodal distribution has more mass than the left tail, then the distribution is said to be "right skewed" or to have positive skewness. Similarly, negatively skewed distributions have long (or heavy) left tails.
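
To make the sign convention concrete, here is a small Python sketch that computes the moment coefficient of skewness (the third central moment divided by the 1.5 power of the second) for three tiny made-up samples. This is only an illustration, not the code from Rick's post. The function name and the data are hypothetical, and statistical packages often apply a small-sample adjustment, so their reported values may differ.

```python
from statistics import fmean


def sample_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (no small-sample adjustment)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Made-up samples chosen only to show the sign of the statistic.
symmetric = [1, 2, 3, 4, 5]                   # symmetric about its mean
right_skewed = [1, 1, 2, 2, 3, 10]            # long right tail
left_skewed = [-x for x in right_skewed]      # mirror image: long left tail

print(sample_skewness(symmetric))     # zero for symmetric data
print(sample_skewness(right_skewed))  # positive
print(sample_skewness(left_skewed))   # negative
```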

What are your favorite posts from The Do Loop? Help Rick celebrate by commenting here or rushing over to his 1,000th post and telling him there how much you appreciate his content.


About Author

Alison Bolen

Editor of Blogs and Social Content

Alison Bolen is an editor at SAS, where she writes and edits content about analytics and emerging topics. Since starting at SAS in 1999, Alison has edited print publications, Web sites, e-newsletters, customer success stories and blogs. She has a bachelor’s degree in magazine journalism from Ohio University and a master’s degree in technical writing from North Carolina State University.

2 Comments

  1. Michelle Homes

    What a great highlight blog, Alison - thank you!!! Oh, I have to say the Spiral of Splatter was one of my favourite and most memorable DO Loop blog posts.
    Other favourites... there are so many.... Each Monday and Wednesday evening is always a treat as I wait to see what interesting blog post Rick publishes early in his morning. I really enjoy the variety, depth, humor and well-written posts and the opportunity to learn and share from reading them. Thanks Rick for your wonderful contributions to the SAS community, and I look forward to continuing to share and admire in awe for years to come!

    FYI, sharing this blog post and my sentiments on LinkedIn too - https://www.linkedin.com/feed/update/urn:li:activity:6495849769288724480

    See you again at SASGF soon!
