Louis B. Mayer, a Russian immigrant who went into the film business, founded Metro-Goldwyn-Mayer (MGM) in 1924. He produced the blockbuster film of its day, Ben-Hur, and wanted to join high-class society. Friends told him that upper-class men played golf. In his gruff way, he began playing golf, but he never got the hang of the sport, failing to understand that the game is scored in strokes. He thought golf was a race and that finishing in a shorter time meant you were getting better. He hired two caddies, posting the first down the fairway to find the ball after he hit it. Then the second caddy would run ahead and put himself in position for Mayer’s second shot. This continued for eighteen holes. After the game was over, Mayer would look at his watch: “We made it in an hour and seven minutes! Three minutes better than yesterday!”
Judging success often depends upon whether those playing the game understand the basic logic of the game and whether the metric used to determine success fits that logic.
Mayer’s beliefs about golf and the metric he used were mistaken. It is equally tempting to use wrong measures to assess a school reform. Metrics that have little to do conceptually with the “game” of education, especially when complicated reforms such as mayoral control, charter schools, and small high schools are put into practice, may appeal to policy makers and even satisfy parents and taxpayers, but they are unconnected to the reform policy’s assumptions. Judging the record of such initiatives requires getting the basic idea of the reform right, that is, realizing that it’s golf strokes that matter, not the speed.
Consider small high schools. The basic policy logic of the reform, or theory of action, is to take large comprehensive high schools and create new structures of small learning communities and advisories that personalize schooling enough to transform teaching and engage students so that they perform well on tests and graduate ready for college or for jobs in a rapidly developing information-driven economy. Converting big schools into small learning communities, then, will transform teaching and learning, thereby producing desired policy outcomes.
The metrics most often used to determine the success of small high schools in big-city districts are test scores, achievement gaps between minorities and whites, dropout rates, high school graduation rates, and college admissions. Research studies of small high schools, however, are, at best, unclear as to whether these policies produce the desired outcomes across the different measures.
Fitting the logic of the policy to outcomes also depends on when the assessment occurs. Thus, judging a reform too early in its unfolding often yields inconclusive data—showing who’s ahead at the end of nine holes of golf rather than after eighteen.
Determining policy success, then, depends on timing and, of equal importance, on whether the assumptions contained in the reform’s logic have indeed been put into practice and on the fit between the metrics used and the expected outcomes.
Few big cities judge their reforms in this manner. Yet when test scores go up, mayors, school boards, and superintendents are quick to take credit for the gains, even though it is unclear which of the implemented reforms caused them. When test scores dip, reasons for the disappointing results spill forth and blame gets dispersed. Few, however, can go beyond guesses as to why scores declined.
So to avert both premature and mistaken judgments, I propose four questions to judge whether the logic of the policy (the game of golf) and metrics (strokes) are in sync.
1. Did reform structures and programs aimed at improving student achievement get implemented?
Without complete implementation, no determination of policy success or its validity as a theory of change can be made.
2. Have teaching practices changed in the intended direction? Without altering typical classroom routines consistent with policy assumptions and strategies, chances of students learning more, faster, and better are slim.
3. Did changed classroom practices account for what students learned? Without systematic inquiry into which classroom practices have changed, it is impossible to determine whether student learning, even if measured only by test scores, is due to what teachers did daily (e.g., deeper probing into subject-matter content, focusing on test preparation) or to some other factors.
4. Did what students learn achieve policy makers’ desired outcomes? Do higher test scores for students translate into a reduced achievement gap between minorities and whites, fewer students dropping out of high schools, and more graduating seniors going on to college?
Few policy makers ask these questions and then seek answers to them. As a result, they end up like Louis Mayer: playing golf by the wrong metrics.