Oracle CEO Larry Ellison earns $37,692 an hour. No, that is not a typo or misplaced comma. Ellison’s annual salary ran $78.4 million, much of it in stock option awards. His salary was based on the annual performance of the company’s stock. Oracle’s Board of Directors set the pay scale (Ellison owns one-fourth of the company’s shares) to spur better management to increase profits and shareholders’ dividends.
They pay Ellison to perform well on the metric they have chosen (“company earnings before income taxes minus the costs of stock-based compensation, acquisitions, restructuring, and other items”). Oracle’s measure is not, however, the metric that other major corporations use for paying their top person. I return later to the different measures companies use to judge CEO performance.
Switch now to the average U.S. public school teacher who earns an annual salary of over $55,000. That figure translates to around $27.00 an hour for a 40-hour week. Like Ellison, hundreds of thousands of teachers are involved in pay-for-performance plans. In response to the federal Race to the Top competition, many states have mandated that teachers’ performance and salary be tied to students’ test scores to spur better teaching and student learning. Those test scores, as a factor in assessing effectiveness and determining salary (or bonuses), can account for anywhere from one-quarter to over half of the decision to set salary and retain or fire a teacher.
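For readers who want to check the hourly arithmetic, here is a minimal sketch. It assumes a full-time year of 52 weeks at 40 hours each (2,080 hours), which the figures above imply but do not state. The names `HOURS_PER_YEAR` and `hourly_rate` are my own, chosen for illustration.

```python
# Assumption: a full-time work year is 52 weeks x 40 hours = 2,080 hours.
HOURS_PER_YEAR = 52 * 40


def hourly_rate(annual_pay: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert annual pay into an hourly figure."""
    return annual_pay / hours_per_year


# Annual figures quoted in the post.
ceo_hourly = hourly_rate(78_400_000)   # Ellison's $78.4 million
teacher_hourly = hourly_rate(55_000)   # average teacher salary, "over $55,000"

print(f"CEO: ${ceo_hourly:,.0f} an hour")
print(f"Teacher: ${teacher_hourly:,.2f} an hour")
print(f"Ratio: {ceo_hourly / teacher_hourly:,.0f} to 1")
```

Under that assumption the CEO figure matches the $37,692 quoted above. The teacher figure comes out a little under $27 because $55,000 is a floor ("over $55,000") rather than an exact salary.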
While I have written about this pay-for-performance reform over the past few years (see here, here, and here), for this post I want to inspect how the private sector–often a model for U.S. school reform–has its own problems, often undisclosed by business-oriented champions of school metrics, in determining CEO pay.
The lesson to learn from this post is: Paying for performance, whether in companies or in schools, is as flawed as the measures used to determine it.
A recent study of the metrics used in 195 large companies over the past five years showed that the most popular gauge measuring CEO performance was “total shareholder return.” Over half of the companies using that measure, however, lost nearly two percent over the five-year period. Companies using less popular equations such as “earnings-per-share growth” gained almost three percent.
Now, here’s the clincher. Most companies judging CEO performance are relying on a metric that yielded losses for investors (“total shareholder return”), yet those very same companies continued to give their CEOs substantial raises year after year.
The authors of the study believe that the popularity of the performance measure, i.e., “total shareholder return,” stems from how easy it is for boards of directors and CEOs to manipulate the metric by “removing costs from the equation” such as “discontinuing product lines or closing factories.” Boards of directors then can reward CEOs with higher compensation packages. Earnings-per-share growth, a less popular metric and one of multiple measures that many firms use, sorts out under-performing from high-performing firms, the authors found. This measure and others like it, they concluded, are less easily manipulated by top corporate officials. CEO pay, then, can be better associated with company performance.
The main takeaways from this study are that boards of directors and CEOs do manipulate the numbers, that “one size does not fit all when measuring pay for performance,” and that multiple measures for determining effectiveness and salary have a better chance of capturing performance than single ones do.
Now, consider teacher pay-for-performance where one measure–student test scores–is often used to determine to what degree a teacher is effective. As with “total shareholder return,” there are serious problems with using this metric alone or even in concert with other measures to judge teacher performance (see here and here).
Consider the following:
Incentives corrupt measures.
Since the mid-1970s, social scientists have criticized the use of specific quantitative measures to monitor or steer policies because those implementing such policies alter their practices to ensure better numbers. The work of social scientist Donald T. Campbell and economists in the mid-1970s about the perverse outcomes of incentives was available but has largely been ignored. Campbell wrote in 1976:
“The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor” (Campbell 1976, p.54)
Campbell used examples drawn from statistics on police solving crimes (p. 55), the Soviets setting numerical goals in industry (p. 57), and the U.S.’s use of “body counts” in Vietnam as evidence of winning (p. 58). For public schools, Campbell said that “achievement tests are … highly corruptible indicators” (p. 57).
That was nearly forty years ago. In the past decade, researchers have documented (also see here) the link between standardized test scores and narrowed instruction to prepare students for test items, instances of state policymakers fiddling with cut-off scores on tests, increased dropouts, and outright cheating. Although how the distortions occur is unclear, the evidence confirms Campbell’s insight.
Easy To Measure Indicators Trump Hard To Measure Ones
Few in business, medicine, or education question that some indicators are easier to quantify than others. In medicine, for example, hospital mortality and surgical procedures are fairly easy to measure, but the results, even when compared to other hospitals and surgeons, hide as much as they reveal about effective health care. So it is with standardized tests.
Because test scores are inexpensive and efficient to collect, they draw attention away from important but hard-to-measure aspects of teaching and learning such as student engagement, rapport between teachers and students, academic climate in classroom and school, and principal leadership. Cumulative practitioner experience and stories about teaching over centuries have established these as crucial factors in working with gifted and vulnerable students.
These are known results of using single measures to judge individual or organizational performance. Consequences of their use can be anticipated. Historical examples abound. Some districts (e.g., Denver) have wisely moved to multiple measures that include student outcomes beyond test scores. But in most states where such mandates reign, test scores remain a major part of the equation used to judge teacher performance and allocate bonuses to teachers and principals (e.g., in New York City, Washington, D.C., and Houston, Texas).
This manipulation of data and these one-size-fits-all measures show up in businesses as well as schools, raising serious questions about the worth of this frenetic passion for pay-for-performance in both public and private sectors.
In the meantime, if Oracle’s Larry Ellison read this post in his office–say 10 minutes–he would have earned over $6,000. Ah, to be a CEO.