PROC GLM and other statistical modeling procedures have their own versions of such an item with their ESTIMATE (and CONTRAST) statements. They allow you to assess whether one scenario is better than another based on your data, and provide a way to make informed decisions.
For example, consider a study to explore what affects memory. One theory is that material is recalled as a function of how much it is processed when first encountered. Fifty younger subjects and 50 older subjects (between 55 and 65 years old) were randomly assigned to one of five learning groups. Four of the groups were given a task to perform on a list of words: Counting (read the words and count the number of letters in each); Rhyming (read each word and think of a rhyming word); Adjective (think of modifiers for each word); and Imagery (form vivid images of each word). None of these four groups was told that they would need to recall the words later. The fifth group, Intentional, was told to memorize the words for later recall. After the subjects had gone through the list of 27 items three times, they were asked to write down all the words they could remember.
Of interest are the number of words recalled (Words) and the categorical predictor variables Age (Younger/Older) and Process (Adjective, Counting, Imagery, Intentional, and Rhyming). One interesting hypothesis is that forming vivid images of each word is more effective for later recall than intentionally memorizing each word; this can be formally tested with an ESTIMATE (or CONTRAST) statement in PROC GLM.
The results of the study can be visualized in the following series of paneled boxplots:
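A plot of this kind could be produced along the following lines. This is only a sketch: it assumes the data set is named recall and holds the variables words, age, and process (the names used in the PROC GLM call in this post), and the panel layout is a guess rather than a reproduction of the original figure.

```sas
/* Sketch only: one panel per age group, one box per learning process. */
/* Assumes data set RECALL with variables WORDS, AGE, and PROCESS.     */
proc sgpanel data=recall;
   panelby age / columns=2;          /* Older and Younger side by side  */
   vbox words / category=process;    /* distribution of words recalled  */
run;
```

Boxes whose relative positions change from one panel to the other are the visual hint of an Age by Process interaction.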
The number of words recalled appears to differ by Process and by Age. When analyzing this data set, it seems reasonable to include an interaction of Process by Age, since the pattern of boxplots across Process appears to differ by Age:
proc glm data=recall;
   class age process;
   model words=age|process;
run;
quit;
The ANOVA table that follows indicates a significant interaction between Process and Age, making an ESTIMATE statement more challenging to write.
|Source||DF||Type III SS||Mean Square||F Value||Pr > F|
- To obtain the coefficients for the contrast, set up a two-way table as shown below: use the first variable on the CLASS statement as the ROW variable (Age) and the second variable on the CLASS statement as the COLUMN variable (Process). Sort the levels of the variables alphanumerically, as shown.
- Next fill in the body of the chart with coefficients for the comparison of interest: compare the IMAGERY method (averaged over Older and Younger) to the INTENTIONAL method (also averaged over Younger and Older).
- Label the last column and last row as 'Marginal'. Then fill in the blank cells within the body of the chart with zeros.
- Lastly, sum across the rows and down the columns to obtain the marginal coefficients.
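
The steps above can also be sketched as a small DATA step, which is handy for checking the arithmetic. The data set name coef and the variable name Marginal are made up for illustration; the rows and columns follow the sorted order of the CLASS levels.

```sas
/* Body of the two-way chart: rows = Age, columns = Process (both sorted). */
/* The data set name COEF is hypothetical.                                  */
data coef;
   input Age $ Adjective Counting Imagery Intentional Rhyming;
   /* Row marginal: the coefficient for this level of Age */
   Marginal = sum(Adjective, Counting, Imagery, Intentional, Rhyming);
   datalines;
Older    0  0  0.5  -0.5  0
Younger  0  0  0.5  -0.5  0
;
run;

proc print data=coef noobs;
run;

/* Column marginals: the coefficients for the Process main effect */
proc means data=coef sum maxdec=1;
   var Adjective Counting Imagery Intentional Rhyming Marginal;
run;
```

The row marginals are 0 for both age groups, and the column sums are 0, 0, 1, -1, and 0, which are the Age and Process coefficients used in the ESTIMATE statement.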
The marginal coefficients provide the coefficients for the main effects: the row marginal coefficients are for the variable Age and the column marginal coefficients are for the variable Process. (Note that the marginal coefficients sum to 0 in both directions.) The coefficients in the body of the chart provide the coefficients for the Age*Process interaction.
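To see why the body of the chart holds 0.5 and -0.5 while the Process margin holds 1 and -1, it can help to write the comparison out. The notation here is an assumption for illustration and is not taken from the SAS output: \mu_{ij} is the cell mean for Age level i and Process level j, with the usual effects decomposition \mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}.

```latex
\begin{aligned}
\psi &= \tfrac{1}{2}\left(\mu_{\text{Older,Imagery}} + \mu_{\text{Younger,Imagery}}\right)
      - \tfrac{1}{2}\left(\mu_{\text{Older,Intentional}} + \mu_{\text{Younger,Intentional}}\right) \\
     &= \left(\beta_{\text{Imagery}} - \beta_{\text{Intentional}}\right)
      + \tfrac{1}{2}\left[(\alpha\beta)_{\text{Older,Imagery}} + (\alpha\beta)_{\text{Younger,Imagery}}\right]
      - \tfrac{1}{2}\left[(\alpha\beta)_{\text{Older,Intentional}} + (\alpha\beta)_{\text{Younger,Intentional}}\right]
\end{aligned}
```

The intercept and the Age effects cancel, which is why both Age coefficients are zero, and the null hypothesis being tested is \psi = 0.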
The ESTIMATE (or CONTRAST) statement follows the MODEL statement in your call to PROC GLM, with the syntax shown below. The coefficients for the interaction term are obtained by reading within the body of the table, excluding the marginal row and column: first across row 1 (Older) from left to right, then across row 2 (Younger) from left to right:
estimate 'Compare Imagery to Intentional Memorizing (both averaged over age groups)' Age 0 0 Process 0 0 1 -1 0 Age*Process 0 0 0.5 -0.5 0 0 0 0.5 -0.5 0;
The high p-value indicates that not enough evidence is present to reject the null hypothesis that vivid imagery and intentional memorization result in the same number of words recalled. Although the two methods do not differ significantly in the number of words recalled, vivid imagery may result in more interesting study sessions!
|Parameter||Estimate||Standard Error||t Value||Pr > |t||
|Compare Imagery to Intentional Memorizing (both averaged over age)||-0.15000000||0.89585465||-0.17||0.8674|
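
For reference, a complete call might look like the sketch below. It assumes the same recall data set and model as above; the shortened label text is made up for illustration. Age is left out of the statements because both of its coefficients are zero, the CONTRAST statement repeats the same coefficients, and the LSMEANS statement is added only as a cross-check.

```sas
/* Sketch only: assumes data set RECALL with variables WORDS, AGE, PROCESS. */
proc glm data=recall;
   class age process;
   model words = age|process;

   /* Age omitted because both of its coefficients are zero */
   estimate 'Imagery vs Intentional (averaged over age)'
            process      0 0 1 -1 0
            age*process  0 0 0.5 -0.5 0   0 0 0.5 -0.5 0;

   /* Same coefficients: CONTRAST reports an F test rather than a t test */
   contrast 'Imagery vs Intentional (averaged over age)'
            process      0 0 1 -1 0
            age*process  0 0 0.5 -0.5 0   0 0 0.5 -0.5 0;

   /* Cross-check: least-squares means for Process, averaged over Age */
   lsmeans process / pdiff;
run;
quit;
```

Because the comparison has a single degree of freedom, the F value from CONTRAST should equal the square of the t value from ESTIMATE, and the difference between the Imagery and Intentional least-squares means should match the reported estimate.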
To learn more, take our Statistics 2: ANOVA and Regression training course.
Howell, D. C. (1999). Fundamental Statistics for the Behavioral Sciences, 4th Edition. Duxbury Press, Pacific Grove, California.
 Since the coefficients for Age are both zero, Age may be omitted from the ESTIMATE statement.