SGP is an R package for analyzing student growth percentile (SGP) data. It provides a set of lower-level functions that perform the calculations for SGP analyses and higher-level wrappers that chain those calculations into more elaborate operations. The package is accompanied by four exemplar data sets, sgpData, sgpData_LONG, sgptData_LONG, and sgpData_INSTRUCTOR_NUMBER, that provide examples of how to set up your own data for SGP analyses. sgpData is WIDE formatted, sgpData_LONG and sgptData_LONG are LONG formatted, and sgpData_INSTRUCTOR_NUMBER contains a student-instructor lookup table for producing teacher-level summaries.
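As a quick orientation, the sketch below inspects the exemplar data sets and runs one lower-level calculation. It assumes the SGP and SGPdata packages are installed; the year and subject labels passed to sgp.labels are illustrative placeholders and should match the values in your own data.

```r
# Minimal sketch (assumes the SGP and SGPdata packages are installed;
# the year/subject labels below are illustrative placeholders).
library(SGP)
library(SGPdata)

head(sgpData)       # WIDE format: one row per student, score columns per year
head(sgpData_LONG)  # LONG format: one row per student/year/content area

# Lower-level call: grade-4 reading SGPs conditioned on grade-3 scores.
results <- studentGrowthPercentiles(
  panel.data        = sgpData,
  sgp.labels        = list(my.year = 2024, my.subject = "Reading"),
  grade.progression = c(3, 4))
```

For LONG formatted data, the higher-level wrapper abcSGP() bundles the preparation, analysis, combination, summarization, and visualization steps into a single call.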
SGPs are a popular measure of achievement progress that can be used to evaluate teachers and schools (Betebenner, 2009). They rank students' current achievement relative to the performance of other students with similar prior achievement. Ranking by SGP is widely seen as fairer and more relevant to teacher evaluation than examining unadjusted test scores. However, a growing body of research has demonstrated that SGPs estimated from standardized test scores are noisy measures of the latent achievement attributes they are intended to capture (Akram, Erickson, & Meyer, 2013; Lockwood & Castellano, 2015; McCaffrey, Castellano, & Lockwood, 2015; Monroe & Cai, 2015; Shang, Van Iwaarden, & Betebenner, 2015).
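To make the ranking idea concrete, the sketch below estimates conditional quantiles of the current score given the prior score on simulated data and reads off each student's percentile. This is a simplified stand-in for the B-spline quantile-regression machinery the SGP package actually uses, not its implementation; all data and model settings are invented for illustration.

```r
# Simplified illustration of the SGP idea on simulated data: estimate
# conditional quantiles of the current score given the prior score,
# then locate each student's current score among those quantiles.
library(quantreg)
library(splines)

set.seed(1)
prior   <- rnorm(1000, mean = 500, sd = 50)
current <- 100 + 0.8 * prior + rnorm(1000, sd = 40)

taus <- seq(0.01, 0.99, by = 0.01)
fits <- rq(current ~ bs(prior, df = 5), tau = taus)
qhat <- predict(fits, newdata = data.frame(prior = prior))  # n x 99 matrix

# SGP = highest percentile whose fitted quantile lies at or below the
# student's observed current score (0 if below the 1st percentile).
sgp <- apply(qhat <= current, 1, function(hit) max(which(hit), 0))
summary(sgp)
```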
The large error variance in SGP estimates makes them difficult to interpret at the individual student level. Furthermore, the relationships between true SGPs and student characteristics introduce an additional source of error when SGPs are aggregated to the teacher level. The large spread of the distributions in Figure 2 suggests that a nontrivial portion of the teacher-level variation may be due to these relationships.
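A small simulation illustrates how test measurement error alone produces this noise. The reliability value, score model, and crude decile-based SGP below are assumptions chosen for illustration, not the estimators used in the studies cited above.

```r
# Sketch: with test reliability around 0.85, SGPs computed from observed
# scores correlate imperfectly with SGPs computed from the latent
# achievement they are meant to reflect. All quantities are simulated.
set.seed(2)
n <- 5000
theta_prior   <- rnorm(n)                           # latent prior achievement
theta_current <- 0.7 * theta_prior + rnorm(n, sd = sqrt(1 - 0.7^2))
rel    <- 0.85                                      # assumed test reliability
err_sd <- sqrt((1 - rel) / rel)                     # error SD, unit-variance scale
x_prior   <- theta_prior   + rnorm(n, sd = err_sd)  # observed scores
x_current <- theta_current + rnorm(n, sd = err_sd)

sgp_of <- function(prior, current) {
  # crude SGP: percentile rank of the current score within deciles of prior
  bins <- cut(prior, quantile(prior, seq(0, 1, 0.1)), include.lowest = TRUE)
  ave(current, bins, FUN = function(y) 99 * (rank(y) - 0.5) / length(y))
}
cor(sgp_of(theta_prior, theta_current), sgp_of(x_prior, x_current))
```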
This is particularly problematic for teacher-level aggregates, which are used to assess teacher effectiveness. The errors in estimated SGPs are not purely random: they reflect systematic differences across teachers in the types of students they teach and in how student performance is measured. This is an additional source of error that should be controlled for in value-added models, which regress student test scores on teacher fixed effects, prior test scores, and student covariates.
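For contrast, a bare-bones version of such a value-added specification looks like the following. The simulated data and variable names are illustrative only; real applications involve many more covariates and careful handling of student-teacher linkages.

```r
# Bare-bones value-added regression: current score on teacher fixed
# effects, prior score, and a student covariate. Simulated data;
# variable names are illustrative.
set.seed(3)
n_teachers <- 20; n_per_teacher <- 30
dat <- data.frame(
  teacher_id  = factor(rep(seq_len(n_teachers), each = n_per_teacher)),
  score_prior = rnorm(n_teachers * n_per_teacher, 500, 50),
  frl         = rbinom(n_teachers * n_per_teacher, 1, 0.4))  # e.g. FRL status
true_effect <- rnorm(n_teachers, sd = 5)
dat$score_current <- 120 + 0.8 * dat$score_prior - 3 * dat$frl +
  true_effect[as.integer(dat$teacher_id)] + rnorm(nrow(dat), sd = 30)

# '0 +' drops the intercept so each teacher gets their own fixed effect.
vam_fit <- lm(score_current ~ 0 + teacher_id + score_prior + frl, data = dat)
head(coef(vam_fit))  # leading coefficients are the estimated teacher effects
```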
Our analyses show that the large error variance in SGP estimates can be reduced by conditioning on more information about the students. Figure 1(a) plots the root mean squared error (RMSE) of conditional estimators based on different amounts of additional information: the upper curve shows RMSEs for estimators conditioned on the math scores only, the middle curve for estimators conditioned on the ELA scores only, and the bottom curve for estimators conditioned on both the math and ELA scores. The RMSEs decrease as the amount of additional information increases, but the error remains large even at the reliability levels associated with typical standardized tests. The resulting high uncertainty around estimated teacher effects renders aggregated SGPs useless as indicators of teacher effectiveness. In contrast, the error in value-added models is much more manageable.
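The qualitative pattern in Figure 1(a) can be reproduced with a stylized simulation: conditioning a percentile estimator on two noisy prior scores instead of one lowers its RMSE against the latent-score SGP. The residual-rank estimator below is a linear approximation, not the paper's conditional estimator, and by construction the two single-score estimators here are symmetric, unlike the math-only and ELA-only curves in the figure.

```r
# Stylized version of the Figure 1(a) comparison. Two noisy prior
# measures (labeled "math" and "ELA") of the same latent prior; all
# settings are assumptions chosen for illustration.
set.seed(4)
n <- 20000
rel <- 0.85
err_sd <- sqrt((1 - rel) / rel)
prior   <- rnorm(n)
current <- 0.7 * prior + rnorm(n, sd = sqrt(1 - 0.7^2))
obs_math <- prior + rnorm(n, sd = err_sd)
obs_ela  <- prior + rnorm(n, sd = err_sd)
obs_cur  <- current + rnorm(n, sd = err_sd)

cond_pct <- function(y, X) {
  # percentile rank of y after linearly adjusting for the columns of X
  99 * (rank(resid(lm(y ~ X))) - 0.5) / length(y)
}
true_sgp <- cond_pct(current, cbind(prior))
rmse <- function(est) sqrt(mean((est - true_sgp)^2))
c(math_only = rmse(cond_pct(obs_cur, cbind(obs_math))),
  ela_only  = rmse(cond_pct(obs_cur, cbind(obs_ela))),
  both      = rmse(cond_pct(obs_cur, cbind(obs_math, obs_ela))))
```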