Human Tutoring vs. Computer Tutoring: Who Wins? (Mike Goldstein)

Mike Goldstein founded MATCH Charter Public High School in Boston. He is involved in many ventures including the one below. The post appeared October 7, 2011. For another analysis of Houston’s “no excuses” turnaround of failing schools and the tutoring component, see Matt Di Carlo’s post.

[I]n Summer 2010 we sent a team to Houston. Their job was to design and launch a massive cohort of full-time math tutors. This was 1 part of a 5-pronged effort (called Apollo 20) to do “turnarounds” of 9 of their lowest-performing schools.

The news:

1. Roland Fryer, the economist who put this whole thing together … published a scholarly paper analyzing the effects of the Apollo effort.

2. Ericka Mellon of the Houston Chronicle wrote a news article … about that study.

Tutoring seems to work

Fryer’s research found that the tutoring – pairing one tutor with two students – was extremely effective, equating to between five and nine extra months in school….

One new wrinkle in this discussion: if computer-based tutoring could get anywhere CLOSE to the effect of human tutoring, of course we’d propose computer tutoring! Much, much cheaper. And computer tutoring is so durned hot right now.

Houston created a natural experiment. Only 6th and 9th graders got MATCH-style human tutoring, and that’s where scores rose. Kids in every other grade got computer tutoring, along with a longer day and different teachers, yet test scores didn’t move that much.

Here is the description of the 2 approaches:

For all sixth and ninth grade students, one period Monday through Thursday was devoted to receiving two-on-one tutoring in math. The total number of hours a student was tutored was approximately 189 hours for ninth graders and 215 hours for sixth graders. All sixth and ninth grade students received a class period of math tutoring every day, regardless of their previous math performance. The tutorials were a part of the regular class schedule for students, and students attended these tutorials in separate classrooms laid out intentionally to support the tutorial program.

This model was strongly recommended by the MATCH School, which has been successfully implementing a similar tutoring model since 2004. The justification for the model was twofold: first, all students could benefit from high-dosage tutoring, either to remediate deficiencies in students’ math skills or to provide acceleration for students already performing at or above grade level; second, including all students in a grade in the tutorial program was thought to remove the negative stigma often attached to tutoring programs that are exclusively used for remediation.

We hired 250 tutors – 230 were from the greater Houston area, 3 moved from other parts of Texas, and 17 moved from outside of Texas. Tutors were paid $20,000 with the possibility of earning an average bonus of $3,500 based on tutor attendance and student performance….

Tutor candidates were recruited from lists of Teach for America and MATCH applicants; additionally, the position was posted on college and university job boards at over 200 institutions across the country. We partnered with a core team of MATCH alumni who helped screen, hire, and train tutors based on the “No Excuses” philosophy, and develop a curriculum tightly aligned with Texas state standards.

In non-tutored grades – seven, eight, ten, eleven, and twelve – students who were below grade level received a “double dose” of math or reading in the subject in which they were the furthest behind.

This provided an extra 189 hours of math/reading instruction for high school students and 215 hours for middle school students who were below grade level. The curriculum for the extra math class was based on the Carnegie Math program. The Carnegie Math curriculum uses personalized math software featuring differentiated instruction based on previous student performance. The program incorporates continual assessment that is visible to both students and teachers.

The curriculum for the extra reading class utilized the READ 180 program. The READ 180 model relies on a very specific classroom instructional model: 20 minutes of whole-group instruction, an hour of small-group rotations among three stations (instructional software, small-group instruction, and modeled/independent reading) for 20 minutes each, and 10 minutes of whole-group wrap-up. The program provides specific supports for special education students and English Language Learners. The books used by students in the modeled/independent reading station are leveled readers that allow students to read age-appropriate subject matter at their tested Lexile level. As with Carnegie Math, students are frequently assessed to determine their … level in order to adapt instruction to fit individual needs.

Computers are great for helping people learn what they want to learn. They’re not particularly good at getting someone to learn something they do not want to learn. For that, you need very skilled people (teachers and tutors) who can build relationships, use those relationships to generate order and effort from kids, and then turn that effort into learning. A computer needs to start on “third base”: it can take effort that’s already there and flip it into learning, but it can’t generate the effort in the first place.

I think Steve Jobs had it right [in doubting technology]….

Taken together:

We show that the average impact of these changes on student achievement is 0.276 standard deviations in math and 0.059 standard deviations in reading, which is strikingly similar to reported impacts of attending the Harlem Children’s Zone and Knowledge is Power Program schools – two strict “No Excuses” adherents.

So that’s big.

Then we separate the “whole effect” into “high-dosage tutoring” and “everything else” — and high-dosage tutoring is creating most of the gains, and the other stuff is kind of “bringing down the average.”

For example, Grade 6 math has an effect size of +0.484 standard deviations, versus 0.119 in Grades 7 and 8. So the tutored gains are roughly four times as large. It’s double a typical KIPP effect.

Grade 9 math has an effect size of (a whopping) 0.739 SDs, versus 0.165 in Grades 10 and 11. Same thing.
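For the record, those tutored-vs.-non-tutored ratios can be checked with a quick back-of-the-envelope calculation; here’s a minimal sketch (the grouping labels are mine, the effect sizes are the ones quoted above):

```python
# Effect sizes in standard deviations (math), as reported in the post:
# (tutored grade, comparison grades without tutoring)
effects = {
    "Grade 6 vs. Grades 7-8": (0.484, 0.119),
    "Grade 9 vs. Grades 10-11": (0.739, 0.165),
}

for label, (tutored, non_tutored) in effects.items():
    ratio = tutored / non_tutored
    print(f"{label}: tutored gains {ratio:.1f}x as large")
```

Both ratios come out to roughly four to four and a half times the non-tutored gains.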

That’s not to say we can’t leverage computers to help deliver much better tutoring. My view is simply that we need a lot more experimentation on adult/computer tutoring duos, where the computer extends the reach/capacity of a skilled human. …


Filed under how teachers teach, technology use

11 responses to “Human Tutoring vs. Computer Tutoring: Who Wins? (Mike Goldstein)”

  1. Evan

    Speaking of computers extending the reach/capacity of a skilled human, what do you think about the Khan Academy model/proposal?

    • larrycuban

      I watched the TED lecture with Salman Khan; thanks for sending the URL. I think that elementary and secondary math teachers who like the idea of assigning the video as “homework,” with the in-class lesson then devoted to going over, individually and in small and whole groups, the problems that students raise and need help on from the video, should do exactly that. I believe some teachers are doing that, and I applaud the initiative. The worst thing that could occur with his idea of flipping homework would be for principals, superintendents, or school boards to mandate it as a teaching practice.

  2. Bob Calder

    I have some experience observing programs used for teaching reading in 9th grade over a semester baseline. The program I saw disguised techniques used to teach the illiterate students embedded in the school population; the site staff were unaware of it.

    Computers appear to be great for bringing a cohort of non-readers from zero to sixty. As you said, the part of the class that doesn’t *want* to read is not going to engage. However, the definition of *want* is slippery. I have seen this effect in a 9th grade class evenly split between Haitian earthquake victims and level-one readers. The newly arrived Creole speakers were eager and accomplished much. The recalcitrant US-born learners had been “distilled” by the filtering system of schooling by 9th grade. There is a very high probability that severe learning disabilities were represented in that group, disabilities that are not going to be remediated by anything the school system can afford to deliver. Moreover, the school system is unwilling to allocate funds to diagnose their problems, which may include stress damage (very high cortisol levels) of the kind many of us experience much later in life and for which *we* receive medical treatment. So the portion of the population that has these problems is invisible to research. These kids don’t belong in an experiment. Education research values “reality” in a dysfunctional fashion, and I am loath to engage in the equivalent of tossing bleeding accident victims into a population of disease victims in a drug study.

    I also question studies evaluating the effectiveness of human versus AI – let’s call it what it is to prevent magical thinking about computers. Can AI mimic the process a human uses? It’s really a sort of Turing test that evaluates the meaning of information in a channel. Can it even be done in an uncontrolled environment? I’m skeptical.

    Do we know where the AI delivery system breaks down? Let’s say the AI ends up working like Khan Academy. What are the learning gains going to look like when you ask higher-level questions that require applying principles and reasoning, things that aren’t part of the usual state testing regimen?

    A coherent vision for the path research should take is lacking in Washington, and much of what is being done in the name of research at the state level is utterly dangerous. The culture of education fails to censure, which is a failure to apply scientific principles. Sometimes politics conflicts with reality, and that’s where people who value process must insist that reality be given consideration.

  3. The adult/computer duo has been around for a while. I’m 21, and I remember going to Kaplan’s Score! computer/aid tutoring 6-7 years ago; honestly, I retained a lot more from a good teacher than from computer software. MATCH Schools sound like a great idea; I’m all for creating jobs for educators.

    Also, I hear a lot of college students complain about hybrid classes (partially internet-based curricula) and fully online classes because they require a lot of attention to due dates and participation. It’s a little different from adult/computer tutoring, especially in grade school and below, but it’s cheap, as the article mentioned, and experimental (professors and students are adapting).

  4. Couldn’t agree with you more, Larry: “That’s not to say we can’t leverage computers to help deliver much better tutoring.”

    Whoever has suggested it is either/or is missing the point.

  5. At the elementary level (4th grade), I have seen first-hand the impact that my after-school “homework club” has on student achievement. Not quite a 1-on-2 ratio (more like 1-on-8), but the impact of 30 minutes of attention focused on exactly what that student needs can be powerful.

    I am very aware that a large part of the impact of this tutoring approach is based upon the student-teacher relationship; learning is important to these kids because it is important to me, and they like me (or at least don’t dislike me).

    Not going to get that from a computer.

    Don’t think I don’t support technology usage (I’m a district tech trainer and run our building’s computer lab). But it is a tool, and only that.

    By the way, Idaho is trying something interesting that has the teachers up in arms. They are mandating that high school students take a class or two online in order to graduate. Of course it is a money-saving measure, but I can’t argue with the fact that kids need to be exposed to this; my own son must take online courses at his college this quarter.

    Betsy Weigle

  6. John Thompson

    The point of Fryer’s study was to evaluate whether it was possible to scale up No Excuses. The one finding that could be scalable, it seems to me, is that the high-quality tutoring worked, but Fryer worries about its $2,500-per-student cost. He doesn’t seem to consider dropping the failed parts to pay for the successful one. I guess that’s part of No Excuses fidelity to its entire model.

    But Mike, maybe you can fill us in on attrition. The Apollo 20 experiment started the year with students who were 86.6% economically disadvantaged, but the results were based on the sample of students who stuck it out to test time, and they were 61% economically disadvantaged. Fryer reported 8,600 “Observations,” presumably for two tests. Does that mean that Apollo started with about 7,300 students and ended with about 4,300?
    Mike, if you don’t know, maybe you can ask Fryer, and ask him also why he didn’t give us those numbers.

  7. Pingback: Human tutors beat computers in Houston — Joanne Jacobs

  8. Pingback: How to help failing, frustrated students — Joanne Jacobs
