Algorithms, Accountability, and Professional Judgment (Part 3)

So much of the public admiration for Big Data and algorithms avoids answering basic questions: Why are some facts counted and others ignored? Who decides which factors get included in an algorithm? What does an algorithm whose prediction might lead to someone getting fired actually look like? Without a model, a theory in mind, every table, every chart, every datum gets counted, threatens privacy, and, yes, becomes overwhelming. A framework for quantifying data and making algorithmic decisions based on those data is essential. Too often, however, such frameworks are kept secret or, sadly, missing in action.

Here is the point I want to make. Big Data are important; algorithmic formulas are important. They matter. Yet unless data gatherers and analyzers use frameworks that make sense of the data and ask questions about the what and why of phenomena, all the quantifying and all the regression equations and analysis can send researchers, policymakers, and practitioners down dead ends. Big Data become worthless and algorithms lead to bad decisions.

Few champions of Big Data have pointed to its failures. All the finely crafted algorithms available to hedge fund CEOs, investment bankers, and Federal Reserve officials before 2008, for example, were of no help in predicting the popping of the housing bubble, the near-death of the financial sector, the spike in unemployment, or the very slow recovery after the financial crisis erupted.

So Big Data, as important as they are in determining which genes trigger certain cancers, shaping strategies for marketing products, and identifying possible terrorists, hardly become a solution in themselves to curing diseases, stemming losses in advertising revenue, or preventing terrorist actions. Frameworks for understanding data, asking the right questions, constant scrutiny, if not outright questioning, of the algorithms themselves, and professional judgment are necessities in making decisions once data are collected.

In the private sector, the business model of decision-making (i.e., profit-making and returns on investment) drives how data are interpreted, which questions get asked, and what organizational changes are made. It works most of the time, but when it fails, it fails big. That business model has migrated to public schools.

In the past half-century, the dominant model for local, state, and federal decision-making in schools has become anchored in student performance on standardized tests. It is the “business model” grafted onto schools. If students score above the average, the model says that both teachers and students are doing their jobs well. If test scores fall below average, then changes have to be made in schools.

State and federal accountability regulations and significant penalties (e.g., No Child Left Behind) have been put into place that set this model of test-score-driven schooling in concrete. Algorithms that distribute benefits and penalties to individual students, teachers, and schools are the steel rods embedded in that concrete, strengthening the entire structure and leaving little room for teachers, principals, and superintendents to use their professional judgment.
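
For readers who have never seen one of these formulas, here is a minimal sketch of the kind of benefit-and-penalty rule embedded in that concrete, written in Python purely for illustration. The grade thresholds, school names, and scores below are invented; no district’s actual rule is this simple.

    # Hypothetical accountability rule: schools are sorted into reward and
    # sanction categories by nothing more than an average test score.
    def grade_school(avg_score, district_avg):
        """Assign a letter grade from a single average test score."""
        gap = avg_score - district_avg
        if gap >= 10:
            return "A"   # rewards, public praise
        elif gap >= 0:
            return "B"   # left alone
        elif gap >= -10:
            return "C"   # improvement plan required
        else:
            return "F"   # candidate for restructuring or closure

    schools = {"Lincoln": 78.2, "Garfield": 61.5, "Roosevelt": 70.0}
    district_avg = sum(schools.values()) / len(schools)
    for name, score in schools.items():
        print(name, grade_school(score, district_avg))

Once a rule like this is written into regulation, the judgment call has already been made by whoever picked the thresholds, not by the educators who know the schools and students.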

Nonetheless, in fits and starts, the entire regulatory model of performance-driven schooling has slowly come under scrutiny from some policymakers, researchers, practitioners, and parents. Teachers, administrators, and parents have spoken out against too much standardized testing and against constricting what students learn. These protests point to fundamental reasons why criticism of Big Data and algorithmic decision-making in schools has taken hold and is slowly spreading.

First, unlike private sector companies, tax-supported schools are a public enterprise, accountable to voters. If high-stakes decisions driven by algorithms are made (e.g., grading a school “F” and closing it), those decisions need to be made in public, and the algorithm-driven rules on, say, evaluating teacher effectiveness (e.g., value-added measures in Los Angeles and Washington, D.C.) need to be transparent, easily understandable to voters and parents, and open to public scrutiny.

Google, Facebook, and other companies keep their algorithms secret because they say revealing the formula they have created would give their competition valuable information that would hurt company profits. School districts, however, are public institutions and cannot keep algorithms buried in jargon-laden technical reports that are released months after consequential decisions on schools and teachers are made (see Measuring Value Added in DC 2011-2012).
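
What does such a formula actually look like? Here is a minimal sketch, again in Python and again purely illustrative, of the general idea behind a value-added estimate: predict each student’s current score from the prior year’s score, then average the leftover gains or losses by teacher. The numbers are invented, and the actual models used in Washington, D.C. and Los Angeles control for many more factors and are far more elaborate.

    # Simplified value-added-style estimate: fit one line through prior vs. current
    # scores, then average each teacher's residuals (actual minus predicted).
    import numpy as np

    def value_added(prior, current, teacher_ids):
        prior, current = np.asarray(prior, float), np.asarray(current, float)
        slope, intercept = np.polyfit(prior, current, 1)  # predict this year from last
        residuals = current - (slope * prior + intercept)
        ids = np.array(teacher_ids)
        return {t: residuals[ids == t].mean() for t in set(teacher_ids)}

    # Toy data: three teachers, three students each
    prior    = [60, 65, 70, 72, 80, 85, 55, 62, 90]
    current  = [63, 66, 75, 70, 82, 88, 60, 61, 93]
    teachers = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
    print(value_added(prior, current, teachers))

Even in this toy version, every choice, from what to control for to where to draw the cutoffs, is a judgment call that parents and teachers cannot evaluate if the formula stays buried in a technical report.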

Second, within a regulatory, test-driven structure, teacher and principal expertise about students, how much and how they learn, school organization, innovation, and district policies has been miniaturized and shrink-wrapped into making changes in lessons based on the test results delivered to individual schools.

Teacher and principal judgments about the academic and non-academic performance of students matter a great deal. Such data appear in parent-teacher conferences, in retention decisions when teachers meet with principals, and in the portfolios of observations about individual students that teachers compile over the course of a school year. Teachers and principals make algorithm-like decisions every day, but those decisions are seldom quantified and put into formulas. It is called professional judgment. Such data and thinking seldom, if ever, show up in official judgments about individual students, a class, or a school, and they are absent from the mathematical formulas that judge student, teacher, and school performance.

Yet there are instances when professional judgments about regulations and tests make news. Two high school faculties in Seattle recently refused to give the Measures of Academic Progress (MAP) test. New York principals have lobbied the state legislature against standardized testing.

Such rebellions, and there will be more, are desperate measures. They reveal how the professional expertise of those hired to teach and lead schools has been ignored and degraded. They also reveal the political difficulties facing professionals who decide to take on a regulatory, test-driven model that uses Big Data and algorithmic decision-making: protesters can easily be made to appear against being held accountable and interested only in preserving their jobs.

That is a must-climb political mountain, and it can be conquered. In questioning policymakers' use of standardized tests to determine student futures, grade schools, and judge teacher effectiveness, teachers and principals end up questioning the entire model of regulatory accountability and algorithmic decision-making borrowed from the private sector. It is about time.


16 responses to “Algorithms, Accountability, and Professional Judgment (Part 3)”

  1. Jeff Bowen

    Thoughtfully stated. I think one of the big problems with using standardized testing to allocate penalties and benefits is that it glorifies competition. Algorithms that create losers and winners are nowhere near as educationally effective as a growth model where everyone is valued as a learner and there is collective gain.

    • larrycuban

      I wish those who decide on the formulas would agree with you. The overall regulatory test-driven framework, alas, calls for sorting out winners and losers.

  2. This is wonderful, and needs to be more widely disseminated. Funnily enough, your post came just as I was finishing a blog entry on my own district’s benchmarking project: http://hyperbolicguitars.blogspot.com/2013/01/benchmarks-and-teacher-performance.html

    I think the interaction of local perspectives with those of people like you who have a higher-level view of these issues is really going to aid in conquering the political mountain you’ve described.

    • larrycuban

      I read your post on creating math benchmark tests and all of the questions that you raised when it comes to using the results of those tests to judge teacher performance. Thanks for sending it to me and commenting.

  3. Bob Calder

    First of all, I want to say that education is probably not a “Big Data” phenomenon. Big Data refers to the vastness of the expanding universe of data that is relatively open and available to integrate with other data to facilitate discovery. When the new Square Kilometre Array radio telescope goes online, it will generate as much data in a week as currently exists on the Internet. You are probably (?) thinking the problem is having so much noise that we cannot find the signal. But joining diverse collections is one way large projects discover new things. Yes, it’s a problem, but it’s a known one. The ed problem is that the collections are treated as homogeneous when they’re not.

    I’m not a statistician, but I believe the issue is the power of linear thinking and the mistrust of Bayesian thinking. The certainties of testing are likely uncertain. The difficulty lies in the fact that good work being done by the likes of Bruce Baker and Nate Silver is being ignored.

    • larrycuban

      Don’t know, Bob, how Big you mean in the phrase Big Data, but most districts have student data sets collected by the state and district that cover test scores, SES, attendance, and a host of other factors. For me, that is Big Data. How to make sense of such data sets depends on the questions asked, and those questions derive from the frameworks that decision-makers have in their heads. Not sure what you are saying in your last paragraph.

  4. Ian Rae

    Great summary of the issue. Do you have any examples of teacher-created student performance measurements that could be used as a common benchmark district-wide or state-wide?

    In IT, after years of similar (although not nearly as politically driven) initiatives to measure software developer performance, there was a revolt, which is now called agile development. It’s based on bottom-up control; the team, not management, decides on when & how work gets done.

    That being said, measurement is necessary. Otherwise organizations fall prey to the squeaky-wheel phenomenon, where resources are allocated by who shouts loudest, not by where they can do the most good.

  5. larrycuban

    Ian,
    For a few years in the 1990s, portfolio assessments of student work in writing and other products, with accompanying metrics, were big among progressive educators; I recall the push in New York State to have portfolios substitute for standardized tests, but it did not happen. Yes, particular standardized tests, criterion-referenced tests, and other tools are essential as long as they are used for the purposes intended by test designers.

  6. Ian Rae

    Thanks. Portfolio assessments sound interesting. Instead of the standardized test being a separate thing, the student’s school work is the test.
    Bob Calder has a point about Big Data, which generally means extensive, consistent, and accurate data that can be searched by algorithms for patterns. Measuring student performance seems fraught with issues, even with standardized tests. In Ontario we test each student once every three years! Education has a Small Data problem. It seems a very bad fit for algorithm-based decision making.

    • larrycuban

      As for algorithms being a bad fit for the data U.S. schools actually have, Ian, they are being crafted anyway to determine the value added by teachers to test scores in Washington, D.C., Los Angeles, and other school districts.

  7. This is great stuff; I’m glad I found it through a reader of my blog. I write about similar things from the perspective of a mathematician (see http://mathbabe.org/2013/01/14/should-the-u-s-news-world-reports-college-ranking-model-be-open-source/ and http://mathbabe.org/?s=value+added+mode). If you are in the New York area, I’d love to have coffee and discuss whether we could team up somehow, especially at what seems to be a critical moment for the New York public school teachers’ negotiation.

    Thanks,
    Cathy

  8. Pingback: Why Progressives Should Care About The Backlash On Standardized Testing | Change the Stakes
