Evaluating Teachers Using Student Test Scores: IMPACT (Part 2)

Since 2009, a new system of teacher evaluation called IMPACT has been in practice in Washington, D.C. The “Teaching and Learning Framework” for IMPACT lays out a crisp definition of “good” teaching in what D.C. teachers call the “nine commandments”:

1. Lead well-organized, objective-driven lessons.

2. Explain content clearly.

3. Engage students at all learning levels in rigorous work.

4. Provide students with multiple ways to engage with content.

5. Check for student understanding.

6. Respond to student misunderstandings.

7. Develop higher-level understanding through effective questioning.

8. Maximize instructional time.

9. Build a supportive, learning-focused classroom community.

IMPACT uses multiple measures to judge the quality of teaching. Fifty percent of an annual evaluation is based on student test scores; 35 percent on judgments of instructional expertise drawn from five classroom observations by the principal and “master educators”; and 15 percent on other measures. Using these multiple measures, IMPACT has awarded 600 teachers (out of 4,000) bonuses ranging from $3,000 to $25,000 and fired nearly 300 teachers judged “ineffective” in its initial years of full operation. For teachers with insufficient student test data, different performance measures were used.
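The 50/35/15 weighting above can be sketched as a simple weighted average. This is a minimal illustration, not the official IMPACT rubric: the weights come from the post, but the assumption that each component is scored on a 1.0–4.0 scale, and the rating cutoffs below, are hypothetical.

```python
# Hypothetical sketch of how IMPACT's weighted components might combine.
# Weights (50/35/15) are from the post; the 1.0-4.0 component scale and
# the rating cutoffs are illustrative assumptions only.

def impact_score(test_score, observation_score, other_score):
    """Combine three component scores (each assumed on a 1.0-4.0 scale)."""
    return (0.50 * test_score +
            0.35 * observation_score +
            0.15 * other_score)

def rating(score):
    """Map a composite score onto a four-point rating (cutoffs invented)."""
    if score >= 3.5:
        return "highly effective"
    if score >= 2.5:
        return "effective"
    if score >= 1.75:
        return "minimally effective"
    return "ineffective"

# Example: strong observations and test scores yield an "effective" rating.
composite = impact_score(3.2, 3.0, 3.5)
print(composite, rating(composite))
```

Under this sketch, a teacher strong in the classroom but weak on test scores can still fall into a lower band, since test scores carry half the weight.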

Here is a description of one classroom being observed by a “master educator.”

“A case in point is the lively classroom of Andrea Stephens (not her real name), a first-grade teacher at a racially mixed elementary school in Northeast D.C.

Master educator [Cynthia] Robinson-Rivers is conducting an informal observation as Stephens teaches a lesson about capital letters, punctuation marks, and the short “a.” Stephens is kind, firm, and engaging, and she wins points for gestures like asking a reluctant pupil if she could “get one of his smiles,” making him feel valued. But she is apparently not engaging enough. Several students are not paying attention; one is a mugger and a performer, and he can’t sit still. After several attempts to quiet him, Stephens gently pulls him up next to her, holding his hand while she addresses the rest of the class.

The general atmosphere suggests to Robinson-Rivers a need for better management. “The children weren’t completely out of control,” Robinson-Rivers says. “But if they aren’t facing you it can suggest a lack of interest.”

The session reveals other perceived shortcomings, despite Robinson-Rivers’ respect for Stephens as “a warm, thoughtful practitioner.” It was too teacher directed, Robinson-Rivers says; it failed to make the objectives fully clear, and it didn’t make the most of limited instructional time. “If the pacing is too slow, you can lose valuable time from the lesson,” Robinson-Rivers says. “If in a 20-minute morning meeting the kids participate in a variety of engaging activities, it’s much easier to maintain their interest and enthusiasm.” Stephens also falls short on Teach 5 [one of the “commandments”]—checking to see whether students actually understood her. “There was no way to know whether the shy girl or the boy who spoke little English understood or not,” Robinson-Rivers says. Instead of having all the pupils answer in unison, she suggests that Stephens cold-call on individual students, or have all the boys or all the girls answer in some non-verbal way. “It’s hard because teachers do think they are checking for understanding. But it’s actually an easy one for professional development; you could just say there are three easy things you can do.” Stephens, whose overall score for the year was in the “effective” range, is open to evaluation and receptive to feedback—she even asked for an extra observation—and in this regard, master educators say she is fairly typical.”

Like any evaluation system in which observers apply subjective criteria, IMPACT does sort teachers on a four-point scale from “effective” to “ineffective” according to its nine commandments of “good” teaching. That the use of VAM and other indicators rewards and punishes teachers is also clear. Questions about teacher turnover (has IMPACT reduced attrition among “effective” teachers?) and student performance (has IMPACT improved test scores?) have gone unanswered thus far. Allegations of teachers and administrators erasing student answers and tampering with tests further cloud the picture of IMPACT’s influence on teaching practice and student achievement.

Among teachers and principals, the degree to which IMPACT has influenced daily teaching is disputed. According to some teachers, there are colleagues who pull out special lessons when principals and “master educators” appear for 30-minute unannounced visits. Other teachers tremble and panic when an evaluator walks into their classroom, and the lesson becomes a shambles. And there are many teachers who relish the feedback they get from post-observation conferences and assert that they have made changes and their lessons have improved. While the workload that teacher observations demand of principals has become unmanageable for some, others have welcomed the role of instructional leader. Some principals, however, find that IMPACT has dismantled their established ways of supervising teaching in their schools.

What has occurred in Washington, D.C. with new curriculum standards, new tests, and IMPACT mirrors what has occurred in urban districts across the country that have put testing and accountability structures into place. Since the early 1990s, and especially after the passage of No Child Left Behind, most urban schools across the nation have narrowed their curriculum, leaving less time for non-tested subjects; intensified teacher-centered instruction, with more lessons devoted to preparing students for tests; and, in general, reduced instructional time for reaching all of the curricular standards they are expected to meet over the school year. New ways of evaluating teachers that anchor judgments of effectiveness, in part, on student test scores certainly reinforce accountability, but to what degree teaching practices across elementary and secondary school classrooms have improved as a consequence of IMPACT or similar programs is yet undetermined. And whether those improved teaching practices have led to gains in student academic achievement, well, no one knows.




14 responses to “Evaluating Teachers Using Student Test Scores: IMPACT (Part 2)”

  1. While all of the “nine commandments” are understandably included in the makeup of good teaching, to rigidly assess another by them leads to at least one wrong conclusion by Ms. Robinson-Rivers. She concludes that Ms. Stephens is not “engaging enough” because one first-grade boy, who is noticeably more active than the other students, requires direct attention for a time. In the school in which I work there is at least one student like that in each class. Ms. Stephens, in my opinion, did exactly the right thing in that moment. She settled him down and then was able to go on with the lesson. Perhaps this little boy has ADHD, or has learned at home that to get noticed he needs to act in exaggerated ways, or has learning problems not yet identified. IMPACT looks like something developed by an obsessive-compulsive (Michelle Rhee), and as such has to be used in too rigid a manner. I suspect too that the master educators are probably fairly obsessive and compulsive themselves. While organization is certainly needed in teaching, obsessiveness leads to rote ways of learning, which is needed only at times. Thankfully, Ms. Robinson-Rivers was able to see the overall quality of this teacher, but other “master educators” may not.

    • larrycuban

      Thanks for the careful read of the “master educator” observation of a lesson and running it through your experiences.

  2. I write an education column for our local “Patch” online newspaper. By coincidence I just finished a piece on the way I am evaluated:

    It’s a story that apparently isn’t true, but ought to be. Czarina Catherine of Russia announced to her court that she would like to take a trip into the countryside to see how the serfs who lived in her realm were faring.
    Since the Russian leader was considerate enough to give fair warning of her intentions her consort, Grigori Potemkin, took advantage of the time lapse to plot out a carefully designed route for the Czarina’s trip. They even went so far as to construct a model village complete with happy serfs for Catherine to observe.
    We do the same things at our school.
    At our high school my teaching is observed by my supervisor, an assistant principal, every other year. Generally this consists of two or three visits to my classroom. These forays are planned ahead of time. I meet with the AP and we agree on a date and a class period to serve as a sample of my craft. I lay out what I plan to do.
    I know that the kind of rating I receive will depend primarily on which of my classes will be observed.
    If the AP agrees to see me teach an upper track (honors) class I can relax. There will be no need to make special preparations. The students will be attentive and cooperative. What I do will show off my skills.
    If the AP suggests a lower track (CP) class it’s a whole different ballgame. I will need to plan carefully.
    In every CP class there are from two to six unruly kids who hate school and, often, hate me. Another fifteen or twenty are bored but not openly rebellious. The deeply disaffected kids have been referred to the office many times. I’ve called their parents to report misdeeds. I’ve denied them access to the bathroom because they can’t be trusted in the hallways. I’ve lost my temper with them.
    Will they take out their resentments towards me by acting up in front of my supervisor?
    I try to plan out the week’s lessons so that, on the appointed day, we are doing something that shows the kind of teacher I am. But, of course, I also try to add something that will keep the unhappy kids quiet and on task.
    I had one girl this year who, for whatever reason, displayed a high-spirited intelligence on the day of my inspection. All by herself she made the day’s lesson a deeply satisfying success.
    If the class is in a good mood, if they feel some goodwill towards me, if the room isn’t too cold, if the class is not immediately after lunch (when students are at their most fractious), then I generally get a good result.
    But if my students feel resentful towards me or towards the school, or if they find the subject dull, then I’m in trouble. Either way it’s not a true representation of my teaching.
    With so few observations every minute is magnified in importance. In two years I spend about 1,000 hours in the classroom. To assess this I am watched for about 150 minutes.
    Over the years I have tried to get my supervisor to drop in unannounced to my classroom on a regular basis so that she could weigh my pros and cons based on a whole series of valid observations. With luck I could also come away with some helpful hints from someone who is, presumably, a skilled instructor.
    In eight years I have had fewer than three actual administrator visits that weren’t preplanned, no visit lasting more than five minutes. I’ve asked for more, I’ve pleaded for more, but the hectic pace of the school day doesn’t allow for such.
    And thus I am reduced to constructing my bi-annual Potemkin Villages.

    • larrycuban

      Thank you for describing how you are evaluated and what you do on those occasions when you are. Given your experiences, would you want the more intense and focused IMPACT process described in the post? If so, why? If not, why not?

      • I hope you don’t mind this somewhat-flippant response.
        Here’s how I feel about the specifics of VAM:

        1. Lead well-organized, objective-driven lessons.
        a. I’m a bit of an anti-objective radical. If I set the objectives I’d be happy to be judged by how I carry them out, but my objectives are set in Sacramento and generally distract me from what I think I ought to be teaching.
        2. Explain content clearly.
        a. I try not to do this. My explanations generally get in the way of my students’ learning.
        3. Engage students at all learning levels in rigorous work.
        a. There’s a Frank Smith quote I don’t have at hand but something to the effect that learning is intrinsically enjoyable and definitely not work.
        4. Provide students with multiple ways to engage with content.
        a. Probably a reasonable goal. If I were observed often enough they’d be able to assess this, but I suspect they wouldn’t do enough observations to see it.
        5. Check for student understanding.
        a. I’ve been thinking a lot lately about this. Is there a Valhalla out there of understanding (generally called ‘deeper understanding’) that I’m venturing towards? Am I supposed to tell them why George killed Lennie? The more I think about this understanding thing the more I think they are judging a chimera that isn’t really there.
        6. Respond to student misunderstandings.
        a. There are times when I promote student misunderstandings. The bewildered look on their faces is precious to me. It’s useful for me to know when they are confused, I’ll admit, but I’d like the freedom to occasionally leave them in a state of confusion.
        7. Develop higher-level understanding through effective questioning.
        a. If you mean by this, student questions, then I’m all for it. If they are asking good questions I’m probably doing something right.
        8. Maximize instructional time.
        a. I despise this concept. Whenever I have a student teacher one of the first things I tell them is to forget about time efficiency. Efficiency is for factories.
        9. Build a supportive, learning-focused classroom community.
        a. This one I like, though I suspect no two people would define ‘supportive’ in the same way.

      • larrycuban

        Thanks for giving your views on each of IMPACT’s “Nine Commandments.” What about the visits of “master educators” and conferences with them, and, of course, what about using students’ test scores?

      • heverlyj

        I like the idea of master educators helping me. I had one such this year and she did a great job. I don’t really understand how the test score system would work. You wrote last week: “VAM predicts how well a student would do based on student’s attendance, past performance on tests, and other characteristics.”
        Does that mean that they use this year’s attendance to predict how my student should do on the tests? If he’s absent many times they reduce the expectation?
        I teach ninth grade so I’m assuming I’d get judged by the California Standards tests we give in April? I’m really not sure how I think about this.

      • larrycuban

        If you have any teacher friends in Los Angeles Unified they can tell you how the annual test scores are used to judge their effectiveness. In IMPACT, “master educators” are observers whose written comments get factored into the overall judgment of teacher effectiveness.

  3. Since I am a teacher-in-training, and since my state is an early adopter of the Teacher Proficiency Assessment (Stanford / Pearson), I have spent the last year filming many, if not all, of my classroom interactions to show progress against things like the 9 commandments.

    [We know that video has some uses in instruction, so why not use video as a tool for evaluation? I have been impressed and learned from videos I have seen at The Teaching Channel, and the Teach Like a Champion / Doug LeMov videos, to name a few.]

    Proposal: film a large proportion of a teacher’s interactions, which are more authentic and natural, without the supervisor/administration in the room, and then allow for some review of the good and the bad.

    Is that an invasion of privacy? I wouldn’t think so, just as the district/employer who provides the e-mail and internet access has a right to screen for appropriate use. But that’s too adversarial in tone. Would I relish the opportunity to go over my “best plays” with a professional learning community? Absolutely!

    • larrycuban

      Thanks, John, for your views on filming your classes and your obvious eagerness for working in a professional learning community. I hope that comes to pass.

  4. David B. Cohen

    One quick thought – it seems to me a careful and thoughtful evaluator ought to ask questions about observations before jumping to conclusions. Seeing how a teacher interacts with a student or class in any given moment or day may be a function of what happened an hour ago, a day ago, or over the course of the year. Maybe there was some inquiry and conversation between evaluator and teacher that is not reflected here. I hope it’s expected.

    • larrycuban

      There was conferencing that did take place, David. IMPACT requires that of observations done by “master educators.” In this instance, the teacher had prior experiences with the observer.

