"Easy button" for ESTIMATE statements

My previous blog demonstrated the most difficult type of ESTIMATE statement to write: one for a two-way (or higher) ANOVA with interactions. The "easy button" for ESTIMATE statements is a simpler model.

Models with only main effects and no interactions make writing ESTIMATE statements straightforward. Consider first a one-way ANOVA. A study was conducted at the University of Melbourne[1] exploring the pain thresholds of blondes and brunettes. Subjects were divided into four categories according to hair color: light blond, dark blond, light brunette, and dark brunette. Each person in the experiment was given a pain threshold score based on performance in a pain sensitivity test (the higher the score, the higher the person's pain tolerance). The variables in the data set are the outcome variable Pain and the four-level predictor variable HairColor.

The code for the ANOVA using PROC GLM would be:

proc glm data=pain;
  class HairColor;
  model Pain=HairColor;
run;

The ANOVA table indicates that the mean pain scores are not the same for all hair colors (F = 6.79, p = 0.0041).

Source       DF    Type III SS    Mean Square    F Value    Pr > F
HairColor     3     1360.72631     453.575439       6.79    0.0041

While the boxplots (default output for PROC GLM in SAS 9.3) allow you to visualize the differences, you need an ESTIMATE (or CONTRAST) statement to test whether a specific comparison among the groups is significant.

Perhaps you would like to compare average pain scores of blondes and brunettes. You can obtain the coefficients easily by examining the Class Level Information table. First, notice that the levels of HairColor are sorted alphanumerically—DarkBlond, DarkBrunette, LightBlond, and LightBrunette. This will be important in correctly placing the coefficients.

Class Level Information
Class Levels Values
HairColor 4 DarkBlond DarkBrunette LightBlond LightBrunette

You would average the pain scores for dark and light blondes by applying coefficients of 0.5 to those levels (1st and 3rd), then compare that to the average pain score of dark and light brunettes by applying coefficients of -0.5 to those levels (2nd and 4th). The coefficients are applied left to right to the four levels of HairColor, as shown in the Values column, resulting in: 0.5, -0.5, 0.5, and -0.5.
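
Written out, the comparison is a weighted sum of the four group means. The notation below is mine rather than the original analysis output: \bar{y} with a subscript stands for the sample mean Pain score of that HairColor level, and \hat{L} for the estimated difference. In a one-way model like this one, the estimate works out to exactly this combination of group means.

```latex
% Assumed notation: \bar{y}_{level} is the sample mean of Pain for that level.
% Terms are listed in the sorted level order used by the CLASS statement.
\hat{L}
  = 0.5\,\bar{y}_{\text{DarkBlond}}
  - 0.5\,\bar{y}_{\text{DarkBrunette}}
  + 0.5\,\bar{y}_{\text{LightBlond}}
  - 0.5\,\bar{y}_{\text{LightBrunette}}
  = \frac{\bar{y}_{\text{DarkBlond}} + \bar{y}_{\text{LightBlond}}}{2}
  - \frac{\bar{y}_{\text{DarkBrunette}} + \bar{y}_{\text{LightBrunette}}}{2}
```

Note that the four coefficients sum to zero, which is what makes this a comparison between groups rather than an estimate of an overall level.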

The syntax would then be:

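proc glm data=pain;
  class HairColor;
  model Pain = HairColor;
  /* Coefficients follow the sorted level order:
     DarkBlond DarkBrunette LightBlond LightBrunette */
  estimate 'Compare Blondes to Brunettes'
            HairColor 0.5 -0.5 0.5 -0.5;
run;
quit;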

The results that follow indicate that blondes have higher pain thresholds than brunettes by 15.25 points, on average, and that the difference is statistically significant (t = 4.05, p = 0.0011).

Parameter                       Estimate    Standard Error    t Value    Pr > |t|
Compare Blondes to Brunettes   15.250000         3.7672492       4.05      0.0011
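
The CONTRAST statement mentioned earlier accepts the same coefficients in the same level order. The sketch below assumes the same pain data set used throughout this post. CONTRAST reports an F test rather than an estimate and standard error; for a single-row contrast like this one, the F value is the square of the t value from ESTIMATE, so the p-value should agree.

```sas
/* Sketch: same comparison as the ESTIMATE statement, tested with CONTRAST. */
/* Assumes the pain data set from this post already exists.                 */
proc glm data=pain;
  class HairColor;
  model Pain = HairColor;
  /* Level order: DarkBlond DarkBrunette LightBlond LightBrunette */
  contrast 'Compare Blondes to Brunettes'
            HairColor 0.5 -0.5 0.5 -0.5;
run;
quit;
```

Use ESTIMATE when you want the size of the difference, and CONTRAST when a test alone is enough.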

While this ESTIMATE statement comes from a one-way ANOVA, the approach will be the same for ANOVA models with more than one factor, as long as no interactions are present.
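
As a rough sketch of what that looks like with two factors, suppose the study had also recorded a second classification variable. Both the Gender variable and the pain2 data set name below are hypothetical; neither appears in the hair color study. Because the model has main effects only, the HairColor coefficients are written exactly as before, and the second factor needs no coefficients in the statement.

```sas
/* Hypothetical example: pain2 and Gender are NOT part of the original data. */
proc glm data=pain2;
  class HairColor Gender;
  model Pain = HairColor Gender;   /* main effects only, no interaction */
  /* Same coefficients, same sorted order of the HairColor levels */
  estimate 'Compare Blondes to Brunettes'
            HairColor 0.5 -0.5 0.5 -0.5;
run;
quit;
```

In this setup the estimate is the blonde versus brunette difference adjusted for the second factor, so it would generally not equal the raw difference in group means.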

In the next blog, we'll look at the EASIEST of all ESTIMATE statements—continuous variables not involved in interactions or higher order terms. Until then, enjoy your new "easy button"! (To learn more, take our Statistics 2: ANOVA and Regression training course.)


[1] From the OzDasl website 

About Author

Chris Daman

Sr Analytical Training Consultant

Chris Daman is a statistical training specialist and course developer in the Education Division at SAS. She has more than 20 years of teaching experience—both nationally and internationally—in the fields of programming, statistics, and mathematics. Before joining SAS in 2005, she taught classes at N.C. State University and IBM, worked in the pharmaceutical and financial industries, and was a survey statistician at an international research organization. She currently teaches advanced statistics courses covering mixed models, generalized linear mixed models, hierarchical linear models, and design of probability surveys; in addition, she teaches design of experiments and analysis of complex data, such as longitudinal data, multilevel data, or data from complex surveys. She also teaches data mining classes, including applied analytics and advanced decision trees. She has a bachelor's degree in mathematics from the University of North Carolina at Greensboro and a master's degree in statistics from N.C. State University. Chris's favorite part of teaching is the interaction with the students. To keep them involved with the material and each other, she often uses a variety of teaching techniques (such as analogies, optical illusions, stories, object lessons, and group interactions) rather than the standard instructor-to-student lecture format. As a result, students give high ratings to her classes and typically include comments such as "I enjoyed Chris's teaching style very much. She did an excellent job of engaging the class and fostering interactions between all the students and herself" or "I love Chris's sense of humor. It definitely helps you get through complicated material". In her spare time, Chris enjoys dancing, reading, spending time with her family, and traveling.
