Transitioning value-added and growth models to new assessments

This summer’s education conferences have been dominated by sessions discussing the “next generation,” Common Core-aligned assessments in English and mathematics. As 44 states plan for the transition from their state tests to the new PARCC and Smarter Balanced Assessment Consortium (SBAC) assessments, SAS has received repeated questions from our partners across 20 states about how this will affect growth and value-added models.

From our perspective, transitioning value-added models across assessments is not a new challenge. We have worked with many states and districts that have undergone testing changes in the past two decades, and I would like to use this venue to share some of the strategies and considerations SAS employs during these transitions.

The PARCC and SBAC assessments will first be piloted and then fully implemented in 2014-15. With certain growth models, calculating growth estimates will not be possible until at least the end of the 2015-16 school year, when two years of data from the new assessments are available. Even then, with such limited inputs, the reliability and precision of the resulting growth measures will suffer until at least three prior test scores are available. This need not be the case with the SAS models.

Because EVAAS consists of multiple models, I will briefly explain how SAS would approach this challenge differently with two of our statistical models in order to seamlessly provide growth measures in the first year of any new assessment.

1. EVAAS’s multivariate response model (MRM) is typically used for tests given in consecutive grades, such as the End-of-Grade tests often given in math and reading for grades 3-8. This approach converts scale scores to a Normal Curve Equivalent (NCE) distribution, which is an equal-interval scale similar to a percentile ranking. The model can compare student progress within a given year or against a base year; which to use is a policy decision rather than a statistical one. When testing regimes change, however, SAS recommends the within-year approach, in which NCE scores computed separately for each year create a comparable scale across the old and new assessments. More specifically, in 2013-14, scale scores from the prior state assessment would be converted into NCE scores based on the 2013-14 distribution. The process would then be repeated in 2014-15 with the new PARCC/SBAC scale scores, based on the 2014-15 distribution. The growth expectation in this case would represent maintaining the same relative position in the statewide student achievement distribution from year to year. So, by definition, the statewide average growth measure would be about 0 in each subject and grade for that year.

This flexible approach allows educators to receive value-added estimates in the first year of the new assessment while incorporating the previous testing history on each student in order to ensure reliable and precise results. In this transition year, about half of the teachers/schools/districts would be above the growth expectation and about half would be below. However, in the years following the change in tests, educators and policymakers have the option of switching to a base year approach where potentially all teachers/schools/districts could be above the growth expectation. In this case, SAS recommends tethering the base year to year two of the new assessment to stabilize the student achievement measures and allow teachers to get used to the new content being taught.
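
To make the within-year NCE idea concrete, here is a minimal sketch in Python. It is not the EVAAS MRM, which draws on each student's full testing history; it only illustrates how converting each year's scores against that year's own distribution puts two differently scaled tests on a common footing. The function name `to_nce`, the mid-rank percentile rule, and all of the scores are assumptions made up for illustration. The constant 21.06 is the conventional NCE scaling factor.

```python
from bisect import bisect_left, bisect_right
from statistics import NormalDist, mean

NCE_SD = 21.06  # conventional NCE scaling: NCEs of 1, 50, 99 line up with those percentiles


def to_nce(scale_scores):
    """Convert one year's scale scores to NCEs using that year's own distribution."""
    n = len(scale_scores)
    ordered = sorted(scale_scores)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    nces = []
    for score in scale_scores:
        below = bisect_left(ordered, score)        # students scoring strictly lower
        tied = bisect_right(ordered, score) - below  # students with this exact score
        pct = (below + 0.5 * tied) / n             # mid-rank percentile, always in (0, 1)
        nces.append(50 + NCE_SD * std_normal.inv_cdf(pct))
    return nces


# Hypothetical scores for the same five students, in the same order.
old_test_2013_14 = [310, 325, 340, 355, 370]       # prior state assessment scale
new_test_2014_15 = [2400, 2450, 2430, 2500, 2480]  # new assessment, different scale

nce_old = to_nce(old_test_2013_14)
nce_new = to_nce(new_test_2014_15)

# A gain of 0 means the student held the same relative position across years.
gains = [new - old for old, new in zip(nce_old, nce_new)]
average_gain = mean(gains)
```

In this toy data the first student holds the lowest position in both years, so that student's gain is 0, while students who moved up or down in the ranking get positive or negative gains. Because both years are normed against themselves, the average gain across all students comes out at essentially 0, which is the within-year growth expectation described above.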

2. In contrast, EVAAS’s Univariate Response Model (URM) is often used for non-consecutive tests, such as the End-of-Course tests in Algebra, Geometry, Biology, etc. These tests can be given across multiple grade levels (usually 7-12), and gains are not measured from one grade to the next. What is most important with the URM is that the new test has some relationship to prior tests. For example, scores on a new Algebra 1 exam should be correlated with previous math scores in 7th and 8th grades, and perhaps with scores in other subjects such as reading and science. Once we identify the relationships between the old and new tests, we can use that information to predict each student’s performance based on their previous performance, even in the first year of a new assessment. The URM already uses a within-year approach to setting the growth expectation, and SAS advises that assessment transitions do not pose a challenge at all with this model.
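
The prediction idea can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the URM itself: it assumes a single predictor (the average of each student's two prior math scores) and an ordinary least-squares line, whereas a production model would use many prior scores across subjects. The helper `fit_line` and all of the data are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: (grade 7, grade 8) math scores on the old test for six students...
prior_scores = [(41.0, 45.0), (52.0, 50.0), (60.0, 63.0),
                (48.0, 47.0), (70.0, 74.0), (35.0, 38.0)]
# ...and their first-year scores on a new Algebra 1 exam with a different, made-up scale.
new_scores = [512.0, 530.0, 561.0, 520.0, 590.0, 498.0]

# Simplifying assumption: summarize each student's history by one number.
prior_avg = [mean(pair) for pair in prior_scores]

# Learn the old-to-new relationship from this year's students only (within-year).
intercept, slope = fit_line(prior_avg, new_scores)

# Predicted new-test score for each student, and how far each student landed from it.
predicted = [intercept + slope * x for x in prior_avg]
residuals = [actual - pred for actual, pred in zip(new_scores, predicted)]
```

Because the line is fitted to the same cohort it predicts, the differences between actual and predicted scores average out to zero across all students. That mirrors the within-year growth expectation: a group of students scoring above their predictions is doing better than similar students elsewhere in the same year, with no need for the old and new tests to share a scale.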

The bottom line is that all states will want to consult closely with an experienced growth/value-added provider to discuss the pros and cons of various approaches to this assessment transition in order to make the decision that will be most fair and advantageous given diverse state policy goals. Certain groups like SAS have been working with these technical challenges for a long time and are happy to share their experiences as states seek to understand how to smoothly transition to the next generation assessments in 2014-15. Please feel free to share what other strategies you are seeing states or districts use to prepare for this important assessment transition.
