Last week I talked about how I volunteered to serve as a judge for a middle-school science fair. As I expected, I enjoyed the experience quite a bit, and I hope the students got something positive from me as well. I evaluated several really impressive projects at the 7th-grade level. The "first place" project was presented by a girl who built a homemade calorimeter (and used some math) to measure the amount of energy in different types of horse feed. It turns out that horse feed is not labeled like human food; manufacturers do supply nutrition information, but not the number of calories. Different horses need widely different caloric intakes, depending on their size and activity level. (Yes, I learned a lot.)
As I mentioned in my previous post, my 6th-grade daughter presented a project to answer this question: which will react more when you add Mentos mint candy to it: Diet Coke or Diet Pepsi? Her hypothesis was that Diet Coke should have the higher reaction, because its higher density allows the soda to fill the microscopic dimples on the surface of the candy more efficiently and thus produce a greater force, owing to the surface tension. (Yes, she has inherited my gift for Making Things Up. I'm sure she really had no idea, but she was rooting for Diet Coke.)
According to the results of her tests, her hypothesis was supported (although who knows for what reason?). The project rubric required her to use Microsoft Excel to present the data and charts, but fortunately I'm not restricted in that way when reporting to you here.
I recoded her results into SAS, and then used PROC TTEST and ODS GRAPHICS to produce these charts to compare the results between Diet Pepsi and Diet Coke. This first chart shows the results for 3 trials, with 3 pairs of 2-liter bottles tested in each trial (a total of 18 bottles, or 9 for each brand).
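For readers who want to try this themselves, here is a minimal sketch of that kind of comparison. It assumes the `results` data set created by the program at the end of this post has already been run; the `plots(only)=summary` request is an illustrative choice and not necessarily the exact option behind the chart shown here.

```sas
/* Assumes the RESULTS data set from the program at the end of this post */
ods graphics on;

/* Two-sample comparison of fountain height between the two brands */
proc ttest data=results plots(only)=summary;
  class brand;   /* grouping variable: Diet Coke vs. Diet Pepsi */
  var height;    /* response: height in cm */
run;

ods graphics off;
```

With only 9 bottles per brand, the test has little power, so the output is best read as a rough comparison.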
The first two trials were conducted on the same day, when the ingredients were "fresh" from the store. Trial 3 was conducted weeks later, after the materials had sat in our garage, exposed to heat and cold. The trial 3 numbers were much lower for both brands, and so they skew the overall results. Here's another chart with trial 3 excluded, and you can see a more dramatic difference between the two brands.
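One way to check the trial 3 effect and rerun the comparison without it is sketched below. Again, this assumes the `results` data set from the program at the end of this post; the choice of statistics and plot options is illustrative, not a record of the exact steps used for the chart.

```sas
/* Assumes the RESULTS data set from the program at the end of this post */
ods graphics on;

/* Mean height per brand within each trial, to see how trial 3 compares */
proc means data=results n mean std maxdec=1;
  class trial brand;
  var height;
run;

/* Repeat the brand comparison with trial 3 left out */
proc ttest data=results plots(only)=summary;
  where trial ne 3;   /* keep only the two same-day trials */
  class brand;
  var height;
run;

ods graphics off;
```

Using a `where` statement filters the rows for this step only, so the original data set stays intact for other analyses.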
How would you analyze these data? I don't have formal training as a statistician, so it's probable that I've missed something important. At the end of this post, there is a SAS program that contains the numbers. If you have some good ideas for a different way to look at this, post back in the comments.
Update 26Jan2011, 8pm: I found a data entry error in my numbers! I corrected the data and refreshed the results.
data results;
  length brand $ 12 trial 8 height 8;
  label height="Height (cm)";
  infile datalines dsd;
  input brand trial height;
  datalines;
Diet Coke,1,160
Diet Coke,1,145
Diet Coke,1,183
Diet Coke,2,152
Diet Coke,2,168
Diet Coke,2,229
Diet Coke,3,91
Diet Coke,3,76
Diet Coke,3,84
Diet Pepsi,1,114
Diet Pepsi,1,152
Diet Pepsi,1,150
Diet Pepsi,2,175
Diet Pepsi,2,107
Diet Pepsi,2,137
Diet Pepsi,3,89
Diet Pepsi,3,74
Diet Pepsi,3,61
;
run;