Partly tongue-in-cheek, partly serious criticism of teacher evaluation schemes using value-added algorithms, this guest blog by Aaron Pallas, Professor of Sociology and Education at Teachers College, Columbia, gets at the increasingly bizarre testing frenzy that has seized the nation’s school districts. The post appeared September 26, 2011 at A Sociological Eye on Education.
I’m beginning to think that the District of Columbia isn’t that serious about evaluating its teachers.
Sure, D.C. has its vaunted IMPACT evaluation system that combines value-added measures of teachers’ contributions to their students’ mastery of reading and mathematics with observations of teachers’ practices inside and outside the classroom. And DCPS has used IMPACT to reward some teachers while firing others presumed to be ineffective and incorrigible, replacing them with newer, shinier and cheaper teachers.
But the system seems to be whiffing on some obvious opportunities to extend the evaluation of teachers to other school subjects. If those subjects are important, don’t they deserve the same consideration? Without value-added measures of teachers’ contributions to students’ outcomes in areas other than reading or math, how will we know which teachers are successful, which ones need some remedial help, and which ones should be fired?
I’m referring specifically to health and sex education, and what elementary, middle and high-school students know about how to avoid a range of risky behaviors, such as teen pregnancy, sexually transmitted diseases, and drug and alcohol abuse. Washington Post reporter Bill Turque reported recently that a 50-item standardized test of what students know about human sexuality, contraception and drug use will be administered to students in grades 5, 8 and 10 as part of the Spring 2012 administration of the D.C. Comprehensive Assessment System (DC CAS).
DC’s Office of the State Superintendent of Education later clarified to Turque that students and teachers will not receive individual scores — at least for now — which prompted Turque to write on his D.C. Schools Insider blog, “So, just to review: No data for parents. No accountability for teachers. Why is this a meaningful tool?”
Turque is looking at the forest. I, on the other hand, am concerned about the trees. There’s a standardized test, kids should get scores, and teachers should be held accountable for those scores. Teenagers don’t teach themselves, do they?
IMPACT already has a technology for linking students’ test scores to their teachers, and for using complex statistical models to account for the fact that the students in one teacher’s class might differ from those in another teacher’s classroom. DCPS tells us that these models take into account factors outside a teacher’s control, isolating a particular teacher’s contributions to her students’ achievement. Sure, there wouldn’t be a “pre-test” score for health knowledge for a few years, but DCPS could use student performance on the reading and math DC CAS tests as a pre-test, because students who do well on one standardized test tend to do well on others. The experts say that the models don’t have to be precise; even imperfect measures of students’ lives outside of school and prior achievement levels will improve the ability to rank teachers according to their relative effectiveness.
But to be honest, I don’t think this approach goes far enough. As is true with so many aspects of education, there’s book knowledge, and there’s the practical knowledge that is applied in daily life on the job, in the community, in the voting booth, and elsewhere. If we’re trying to teach kids to avoid risky behaviors, why not measure them directly, rather than giving a multiple-choice test on the birds and the bees? Surely DCPS could gather direct information on the incidence of teen pregnancy, STDs, and drug and alcohol abuse among its students. If particular teachers are supposed to be teaching this content, DCPS could construct value-added models of a teacher’s contribution to his or her students’ rate of risky behaviors, and be confident that the models isolate the teacher’s contribution to these rates from those pesky family and community factors. What are they waiting for?
After all, the evaluation of teaching about dental dams should have some teeth to it.