The dominant standard used by most policymakers, media editors, and administrators to judge success is effectiveness: What is the evidence that the policy has produced the desired outcomes? Have you done what you said you were going to do and can you prove it? In a society where “bottom lines,” Dow Jones averages, Super Bowl victories, and vote-counts matter, quantifiable results determine effectiveness.
Since the Elementary and Secondary Education Act (1965), federal and state policymakers have relied on the effectiveness standard to examine what students have learned by using proxy measures such as test scores, high school graduation rates, college attendance, and other indicators. For example, in the late 1970s policymakers concluded that public schools had declined because Scholastic Aptitude Test (SAT) scores had plunged. Even though test-makers and researchers repeatedly stated that such claims were false, falling SAT scores fueled public support for states raising academic requirements in the 1980s and adding standardized tests to determine success. With the No Child Left Behind Act (2001-2016), test scores brought rewards and penalties. [i]
Yet test results in some instances proved unhelpful in measuring a reform’s success. Consider the mid-1960s evaluations of Title I of the Elementary and Secondary Education Act (ESEA). They revealed little improvement in low-income children’s academic performance, thereby jeopardizing Congressional renewal of the program. Such evidence gave critics hostile to federal initiatives reasons to brand President Lyndon Johnson’s War on Poverty programs as failures. [ii]
Nonetheless, the program’s political attractiveness to constituents and legislators overcame weak test scores. Each successive U.S. president and Congress, Republican or Democrat, has used that popularity as a basis for allocating funds to needy students in schools across the nation, including through No Child Left Behind (2001) and its successor, the Every Student Succeeds Act (2016). Thus, a reform’s political popularity often leads to its longevity (e.g., kindergarten, comprehensive high school, Platoon School).
Popularity, then, is a second standard that public officials use in evaluating success. The spread of an innovation and its hold on voters’ imaginations and wallets have meant that attractiveness to parents, communities, and legislators easily translates into long-term political support for reform. Without the political support of parents and teachers, few innovations and reforms would fly long distances.
Kindergarten and preschool, special education, bilingual education, testing for accountability, charter schools, and electronic technologies in schools are instances of innovations that spread rapidly and captured the attention of practitioners, parents, communities, and taxpayers. Few educators or public officials questioned large and sustained outlays of public funds for these popular reforms because they were perceived as resounding successes. And they have lasted for decades. Popularity-induced longevity becomes a proxy for effectiveness. [iii]
A third standard used to judge success is how well innovations mirror what designers of reforms intended. This fidelity standard assesses the fit between the initial design, the formal policy, the subsequent program, and its implementation.
Champions of the fidelity standard ask: How can anyone determine effectiveness if the reform departs from the design? If federal, state, or district policymakers, for example, adopt and fund a new reading program because it has proved to be effective elsewhere, teachers and principals must follow the blueprint as they put it into practice or else the desired outcomes will go unfulfilled (e.g., Success for All). When practitioners add, adapt, or even omit features of the original design, then those in favor of fidelity say that the policy and program cannot be determined effective because of these changes. Policy adaptability is the enemy of fidelity. [iv]
Where do these dominant standards of effectiveness, popularity, and fidelity come from? Policymakers derive the criteria of effectiveness and fidelity from viewing organizations as rational tools for achieving desired goals. Through top-down decisions, formal structures, clearly specified roles, and technical expertise, administrators and practitioners can get the job done.
Within organizations where rational decision-making and control are prized, policymakers ask: Have the prescribed procedures been followed (fidelity) and have the goals been achieved (effectiveness)? Hence, in judging reforms, those who carry out the changes must be faithful to the design before the standard of effectiveness in achieving goals is invoked.
But where do these beliefs embedded in these criteria come from? The growth of professional expertise in the private and public sectors, or what Donald Schön calls “technical rationality,” is grounded in the natural, physical, and social sciences and located in corporate training and professional education programs at universities. Rather than favoring practitioner expertise derived from schools and classrooms, public officials and researchers use this scientifically grounded knowledge to evaluate the degree to which reforms are effective. [v]
Contrary to the effectiveness and fidelity standards, popularity derives from the political nature of public institutions and the astute use of symbols (e.g., tests, pay-for-performance, computers) to convey values. Schools, for example, are totally dependent on the financial and political support of local communities and the state. Taxpayer support for, or opposition to, bond referenda or school board initiatives is often converted into political capital at election time. Whether an innovation spreads (e.g., charters) and captures public and practitioner attention becomes a strong basis for evaluating its success. [vi]
Seldom are these criteria debated publicly, much less questioned. Unexamined acceptance of effectiveness, fidelity, and popularity avoids asking whose standards will be used, how they are applied, and what alternative standards could be used to judge reform success and failure.
Although policymakers, researchers, and practitioners have vied for attention in judging the success of school reforms, policy elites, including civic and business leaders and their accompanying foundation- and corporate-supported donors, have dominated the game of judging reform success.
Sometimes called a “growth coalition,” these civic, business, and philanthropic leaders see districts and schools as goal-driven organizations with top officials exerting top-down authority through formal structures. They juggle highly prized values of equity, efficiency, excellence, and getting reelected or appointed. They are also especially sensitive to public expectations for school accountability and test scores. Hence, these policymaking elites favor standards of effectiveness, fidelity, and popularity, even when they conflict with one another. Because the world they inhabit is one of running organizations, their authority and access to the media give them the leverage to spread their views about what constitutes “success.” [vii]
So it is no surprise that the question of whose criteria are applied becomes harnessed to the question of how they are applied within K-12 organizations. For the most part, decisions flow downward. Elected leaders in coalition with top civic figures often take innovations directed at school improvement, package them, and deliver the reform (e.g., curriculum, instruction, school re-organization) to classrooms through official policies and procedures. While there are other ways for reforms to enter schools, such as from the bottom up through the local school community, teachers, and principals, the top-down political decision by federal, state, and district leaders to impose a reform on the organization has been the dominant pattern in the history of school reform. [viii]
The world that policy elites inhabit, however, is one driven by values and incentives that differ from the worlds that researchers and practitioners inhabit. Policymakers respond to signals and events that anticipate reelection and media coverage. They consider the standards of effectiveness, fidelity, and popularity rock-hard fixtures of their policy world. [ix]
Most practitioners, however, look to different standards. Although many teachers and principals have expressed initial support for high-performing public schools serving the poor and children of color, most practitioners have expressed strong skepticism about test scores as an accurate measure of either their effects on children or the importance of their work.
Such practitioners are just as interested in student outcomes as are policymakers, but the outcomes differ. They ask: What skills, content, and attitudes have students learned beyond what is tested? To what extent is the life lived in our classrooms and schools healthy, democratic, and caring? Can reform-driven programs, curricula, and technologies be bent to our purposes? Such questions, however, are seldom heard. Broader student outcomes and being able to adapt policies to fit the geography of their classrooms matter to practitioners.
Another set of standards comes from policy and practice-oriented researchers. Such researchers judge success by the quality of the theory, research design, methodologies, and usefulness of their findings to policy and student outcomes. These researchers’ standards have been selectively used by both policy elites and practitioners in making judgments about high- and low-performing schools. [x]
So multiple standards for judging school “success” are available. Practitioner- and researcher-derived standards have occasionally surfaced and received erratic attention from policy elites. But it is this strong alliance of policymakers, civic and business elites, and friends in the corporate, foundation, and media worlds that relies on standards of effectiveness, fidelity, and popularity. This coalition and its standards continue to dominate public debate, school reform agendas, and determinations of “success” and “failure.”
[i] Patrick McGuinn, No Child Left Behind and the Transformation of Federal Education Policy, 1965-2005 (Lawrence, KS: University Press of Kansas, 2006).
[ii] Harvey Kantor, “Education, Reform, and the State: ESEA and Federal Education Policy in the 1960s,” American Journal of Education, 1991, 100(1), pp. 47-83; Lorraine McDonnell, “No Child Left Behind and the Federal Role in Education: Evolution or Revolution?” Peabody Journal of Education, 2005, 80(2), pp. 19-38.
[iii] Michael Kirst and Gail Meister, “Turbulence in American Secondary Schools: What Reforms Last,” Curriculum Inquiry, 1985, 15(2), pp. 169-186; Larry Cuban, “Reforming Again, Again, and Again,” Educational Researcher, 1991, 19(1), pp. 3-13.
[iv] Janet Quinn, et al., Scaling Up the Success for All Model of School Reform, final report (Santa Monica, CA: RAND Corporation, 2015).
[v] Donald Schön, “From Technical Rationality to Reflection in Action,” in Roger Harrison, et al. (editors), Supporting Lifelong Learning: Perspectives on Learning, vol. 1, pp. 40-61.
[vi] David Labaree, “Public Goods, Private Goods: The American Struggle over Educational Goals,” American Educational Research Journal, 1997, 34(1), pp. 39-81; Amanda Datnow, “Power and Politics in the Adoption of School Reform Models,” Educational Evaluation and Policy Analysis, 2000, 22(4), pp. 357-374.
[vii] Sarah Reckhow, Follow the Money: How Foundation Dollars Change Public School Politics (New York: Oxford University Press, 2013); Frederick Hess and Jeff Henig (eds.), The New Education Philanthropy: Politics, Policy, and Reform (Cambridge, MA: Harvard Education Press, 2015).
[viii] Linda Darling-Hammond, “Instructional Policy into Practice: The Power of the Bottom over the Top,” Educational Evaluation and Policy Analysis, 1990, 12(3), pp. 339-347; Charles Payne, So Much Reform, So Little Change (Cambridge, MA: Harvard Education Press, 2008); Joyce Epstein, “Perspectives and Previews on Research and Policy for School, Family, and Community Partnerships,” in (New York: Routledge, 1996), pp. 209-246.
[ix] Anita Zerigon-Hakes, “Translating Research Findings into Large-Scale Public Programs and Policy,” The Future of Children, Long-Term Outcomes of Early Childhood Programs, 1995, 5(3), pp. 175-191; Richard Elmore and Milbrey McLaughlin, Steady Work (Santa Monica, CA: RAND Corporation, 1988).
[x] Thomas Reeve, “Can Educational Research Be Both Rigorous and Relevant,” Educational Designer, 2008, 1(4), at: http://www.educationaldesigner.org/ed/volume1/issue4/article13/
Burke Johnson and Anthony Onwuegbuzie, “Mixed Methods Research,” Educational Researcher, 2004, 33(7), pp. 14-26.