“Meredith Broussard (@merbroussard) is a data journalism professor at New York University and the author of “Artificial Unintelligence: How Computers Misunderstand the World.” She is working on a book about race and technology.” This op-ed piece appeared in the New York Times on September 9, 2020.
Isabel Castañeda’s first words were in Spanish. She spends every summer with relatives in Mexico. She speaks Spanish with her family at home. When her school, Westminster High in Colorado, closed for the pandemic in March, her Spanish literature class had just finished analyzing an entire novel in translation, Albert Camus’s “The Plague.” She got a 5 out of 5 on her Advanced Placement Spanish exam last year, following two straight years of A+ grades in Spanish class.
And yet, she failed her International Baccalaureate Spanish exam this year.
When she got her final results, Ms. Castañeda was shocked. “Everybody believed that I was going to score very high,” she told me. “Then, the scores came back and I didn’t even score a passing grade. I scored well below passing.”
How did this happen? An algorithm assigned a grade to Ms. Castañeda and 160,000 other students. The International Baccalaureate — a global program that awards a prestigious diploma to students in addition to the one they receive from their high schools — canceled its usual in-person final exams because of the pandemic. Instead, it used an algorithm to “predict” students’ grades, based on an array of student information, including teacher-estimated grades and past performance by students in each school.
Ms. Castañeda was not alone in receiving a surprising failing grade: tens of thousands of International Baccalaureate students protested their computer-assigned grades online and in person. High-achieving, low-income students were hit particularly hard: many took the exams expecting to earn college credit with their scores and save thousands of dollars on tuition.
Nor was the International Baccalaureate the only organization to use a computer program to assign students grades amid the pandemic. The United Kingdom’s in-person A-level exams, which help determine which universities students go to, were also canceled and replaced with grades-by-algorithm. Students who were denied university entrance because of these imaginary grades took to the streets, chanting anti-algorithm slogans. Only after an uproar did the government change course, though many students were left in limbo without university admission.
The lesson from these debacles is clear: Algorithms should not be used to assign student grades. And we should think much more critically about algorithmic decision-making overall, especially in education. The pandemic makes it tempting to imagine that social institutions like school can be replaced by technological solutions. They can’t.
The bureaucrats who decided to use a computer to assign grades are guilty of a bias I call technochauvinism: the idea that technological solutions are superior. It’s usually accompanied by equally bogus notions like, “Computers make neutral decisions” or, “Computers are objective because their decisions are based on math.”
Computers are excellent at doing math, but education is not math — it’s a social system. And algorithmic systems repeatedly fail at making social decisions. Algorithms can’t monitor or detect hate speech, they can’t replace social workers in public assistance programs, they can’t predict crime, they can’t determine which job applicants are more suited than others, they can’t do effective facial recognition, and they can’t grade essays or replace teachers.
In the case of the International Baccalaureate program, grades could have been assigned based on the sample materials that students had already submitted by the time schools shut down. Instead, the organization decided to use an algorithm, which probably seemed like it would be cheaper and easier.
The process worked like this: Data scientists took student information and fed it into a computer. The computer then constructed a model that outputted individual student grades, which International Baccalaureate claimed the students would have gotten if they had taken the standardized tests that didn’t happen. It’s a legitimate data science method, similar to the methods that predict which Netflix series you’ll want to watch next or which deodorant you’re likely to order from Amazon.
The problem is, data science stinks at making predictions that are ethical or fair. In education, racial and class bias is baked into the system — and an algorithm will only amplify those biases.
Crude generalizations work for Netflix predictions because the stakes are low. If the Netflix algorithm suggests a show and I don’t like it, I ignore it and move on with my day. In education, the stakes are much higher. A transcript follows you for years; when I was 25 and well out of college, I applied for a job that asked for my SAT scores.
In Ms. Castañeda’s case, her failing grade was most likely due in part to the fact that historical performance data for her school was one of the inputs to the algorithm. The computer assumed that the students at Westminster, who are mostly low-income students of color, would continue to do poorly.
“Everyone I know got downgraded one or two levels,” Ms. Castañeda told me. “It’s not fair that our scores were brought down because of our school’s history. It’s unfair to punish students for where they live.”
Another input to the algorithm was teacher prediction of the students’ grades. Teachers tend to have lower expectations for Black and Brown students compared to white students; this bias is well known in the education community and ignored in the data science community. Thus a very human bias prevailed in the computational system.
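To see how these two inputs can combine to penalize a student for her school's past, consider a deliberately simplified sketch. It is not the International Baccalaureate's actual model, whose details are not given here; the function `predict_grade`, the `school_weight` parameter, and all the numbers are hypothetical, chosen only to illustrate the mechanism on a 1-to-7 grade scale.

```python
def predict_grade(teacher_estimate, school_history_mean, school_weight=0.4):
    """Toy model: blend a teacher's estimate with the school's past average.

    Hypothetical illustration only. The weighting scheme and the 1-7 scale
    are assumptions made for this sketch, not the real grading algorithm.
    """
    blended = (1 - school_weight) * teacher_estimate + school_weight * school_history_mean
    # Clamp to the assumed 1-7 grade scale after rounding to a whole grade.
    return max(1, min(7, round(blended)))


# Two hypothetical students whose teachers predict the same grade (6),
# attending schools with different historical averages.
same_estimate = 6
grade_at_high_scoring_school = predict_grade(same_estimate, school_history_mean=6.0)
grade_at_low_scoring_school = predict_grade(same_estimate, school_history_mean=3.0)

print(grade_at_high_scoring_school, grade_at_low_scoring_school)
```

In this sketch the two students differ only in where they go to school, yet the one at the historically lower-scoring school ends up with a lower grade. If the teacher's estimate is itself biased downward, the school-history term compounds that bias rather than correcting it.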
International Baccalaureate and Ofqual, the agency that administers Britain’s A-level exams, have reluctantly realized that algorithmic grades were a mistake. Since the outcry over algorithm-assigned grades, both organizations have been sued. Many students, including Ms. Castañeda, ended up receiving new, higher imaginary grades.
Roger Taylor, chair of the Ofqual board, apologized in front of a House of Commons educational oversight committee this week. “We are sorry for what happened this summer,” he said. “With hindsight it appears unlikely that we could ever have delivered this policy successfully.”
As we stare down the fall semester online, there are going to be infinitely many technochauvinist calls to transform online education and use algorithmic tools that promise to evaluate individual student learning. Resist these calls.
2 responses to “When Algorithms Give Real Students Imaginary Grades (Meredith Broussard)”
It has created chaos here in the UK…. Instead of trusting the professional judgement of teachers (with a system of peer moderation), it undermined the integrity of the whole assessment/qualification system and destroyed the credibility of the exam regulator Ofqual and Education Ministers. The one positive to emerge from the confusion and stress it has created for learners and schools and colleges is the realisation that an assessment process predicated on norm referencing is no longer fit for purpose and should be reviewed as soon as….
Thanks for the comment about UK, Bob.