The dominant standard used by most policymakers, media editors, and administrators to judge success is effectiveness: Have you done what you said you were going to do and can you prove it? In a society where “bottom lines,” Dow Jones averages, sports statistics, and vote-counts matter, quantifiable results determine success.
Even before No Child Left Behind, policymakers relied on the effectiveness standard to examine what students have learned, using proxy measures such as student test scores, college attendance, and other indicators. For example, in the late 1970s policymakers concluded that public schools had declined because Scholastic Aptitude Test (SAT) scores had plunged. Even though test-makers and researchers repeatedly stated that such claims were false, falling SAT scores fueled public support for states raising academic requirements in the 1980s. What mattered most to decision-makers and the media were numbers that could be used to establish school rankings, thereby creating easily identifiable winners and losers.
Yet test results in some instances proved unhelpful in measuring a reform’s success. Consider the mid-1960s evaluations of Title I of the Elementary and Secondary Education Act (ESEA). They revealed little improvement in low-income children’s academic performance, thereby jeopardizing Congressional renewal of the program. Such evidence gave critics hostile to federal initiatives reason to brand President Lyndon Johnson’s War on Poverty programs as failures.
Low test scores, however, failed to diminish the program’s political attractiveness to constituents and legislators. Each successive president and Congress has used that popularity as a basis for allocating funds to needy students in schools across the nation, including through No Child Left Behind.
Popularity, then, is a second standard that public officials use in evaluating success. The spread of an innovation and its hold on the imagination of voters mean that fashionableness easily translates into political support for reform. The rapid diffusion of special education, bilingual education, accountability, and computers in schools since the 1980s offers instances of innovations that captured both policymakers’ and practitioners’ attention. Few educators or public officials questioned large outlays of public funds for these popular reforms because they were perceived, at least at first, as resounding successes.
A third standard used to judge success is how well an innovation mirrors what reformers intended. This fidelity standard assesses the fit among the initial design, the formal policy, the subsequent program, and its implementation.
Champions of the fidelity standard ask: How can anyone determine effectiveness if the reform departs from the blueprint? If federal, state, or district policymakers, for example, adopt and fund a new reading program because it has proved effective elsewhere, local implementers (e.g., teachers and principals) must follow the original program design as they put it into practice, or else the desired outcomes will not be achieved. When practitioners add, adapt, or even omit features of the original design, policymakers heeding this standard say that the policy and program cannot be judged effective because of these changes.
Where do these dominant standards of effectiveness, popularity, and fidelity come from? Policymakers derive the criteria of effectiveness and fidelity from viewing organizations as rational tools for achieving desired goals. Through top-down authority, formal structures, clearly specified roles, and technical expertise, administrators and practitioners can get the job done.
Within organizations where rational decision-making and control are prized, policymakers ask: Have the prescribed procedures been followed (fidelity) and have the goals been achieved (effectiveness)? Hence, in judging reforms, those who carry out the changes must be faithful to the design before the standard of effectiveness in achieving goals is invoked.
But how have these beliefs about rational organizations become embedded in these standards? The growth of professional expertise, or what Donald Schön calls “technical rationality,” has come to be anchored in university-credentialed knowledge, especially since World War II. This expertise, often grounded in the natural, physical, and social sciences, is located in professional training programs at universities. Rather than favoring practitioner expertise derived from schools and classrooms, public officials and researchers use this scientifically grounded knowledge to evaluate whether reforms have succeeded.
In contrast to the effectiveness and fidelity standards, popularity derives from the political nature of public institutions and the astute use of symbols (e.g., tests, pay-for-performance, computers) to convey values. Schools, for example, are wholly dependent on the financial and political support of local communities and the state. Taxpayer support for, or opposition to, bond referenda or school board initiatives is often converted into political capital at election time. Whether an innovation spreads (e.g., charters) and captures public and practitioner attention becomes a strong basis for evaluating its success.
Seldom are these standards debated publicly, much less questioned. Unexamined acceptance of effectiveness, fidelity, and popularity sidesteps the questions of whose standards will be used and what alternative standards might be used to judge reform success and failure. The next post takes up these questions.