The dominant standard used by most policymakers, media editors, and administrators to judge success is effectiveness: Have you done what you said you were going to do and can you prove it? In a society where “bottom lines,” Dow Jones averages, sports statistics, and vote-counts matter, quantifiable results determine success. No Child Left Behind and its focus on standardized test scores is effectiveness on steroids.
Yet even before No Child Left Behind, policymakers had relied on the effectiveness standard to examine what students have learned by using proxy measures such as state test scores, college attendance, and other indicators. For example, in the late 1970s policymakers concluded that public schools had declined because Scholastic Aptitude Test (SAT) scores had plunged. Even though test-makers and researchers repeatedly stated that such claims were false, falling SAT scores fueled public support for states raising academic requirements in the 1980s. What mattered most to decision-makers and media were numbers that could be used to establish school rankings, thereby creating easily identifiable winners and losers.
Note, however, that test results in some instances proved unhelpful in measuring a reform’s success. Consider the mid-1960s evaluations of Title I of the Elementary and Secondary Education Act (ESEA). They revealed little improvement in low-income children’s academic performance, thereby jeopardizing Congressional renewal of the program. Such evidence gave critics hostile to federal initiatives reasons to brand President Lyndon Johnson’s War on Poverty programs as failures.
Low test scores, however, failed to diminish the program’s political attractiveness to constituents and legislators. Each successive president and Congress has used that popularity as a basis for allocating funds to needy students in schools across the nation, including under No Child Left Behind.
Popularity, then, is a second standard that public officials use in evaluating success. The spread of an innovation and its hold on the imagination of voters have meant that fashionableness can translate into political support for reform. Special education, bilingual education, accountability, and computers in schools, all of which have spread rapidly since the 1980s, are instances of innovations that captured both policymakers’ and practitioners’ attention. Few educators or public officials questioned large outlays of public funds for these popular reforms because they were perceived, at least at first, as resounding successes.
A third standard used to judge success is assessing how well innovations mirrored what reformers intended. This fidelity standard assesses the fit between the initial design, the formal policy, the subsequent program, and its implementation.
Champions of the fidelity standard ask: How can anyone determine effectiveness if the reform departs from the blueprint? If federal, state, or district policymakers, for example, adopt and fund a new reading program because it has proved to be effective elsewhere, local implementers (e.g., teachers and principals) must follow the original program design as they put it into practice or else the desired outcomes will not be achieved. When practitioners add, adapt, or even omit features of the original design, then policymakers, heeding this standard, say that the policy and program cannot be determined effective because of these changes.
Where do these dominant standards of effectiveness, popularity, and fidelity come from? Policymakers derive the criteria of effectiveness and fidelity from viewing organizations as rational tools for achieving desired goals. Through top-down authority, formal structures, clearly specified roles, and technical expertise, administrators and practitioners can get the job done.
Within organizations where rational decision-making and control are prized, policymakers ask: Have the prescribed procedures been followed (fidelity) and have the goals been achieved (effectiveness)? Hence, in judging reforms, those who carry out the changes must be faithful to the design before the standard of effectiveness in achieving goals is invoked.
Popularity as a standard in judging success, of course, comes from the political domain. Schools depend upon taxpayers voting the funds to operate them. What voters determine is successful, regardless of missing or ambiguous evidence, gets renewed year after year.
The authority, and therefore the power, to put into place one or more of these criteria in the U.S. derives from the 50 states (see the Tenth Amendment to the U.S. Constitution). States establish local districts, which directly govern their schools; there are about 14,000 districts in the U.S. California has over 1,000 districts, Virginia has 227, and the state of Hawaii governs all of its schools as one district. States, then, set overall criteria for success. Most states choose effectiveness criteria with occasional bows to popularity and fidelity. Local districts run the schools and try to meet those criteria. Since 2002, however, federal legislation (yes, the No Child Left Behind Act) has set effectiveness criteria, namely test scores, for the states, which in turn demand that local districts adhere to that standard. The entire debate in the U.S. Congress to reauthorize NCLB has hinged upon who will have the authority to set the criteria for success: the federal government or the states.