Tag Archives: Accountability

Bankers and Teachers: Scandals and Accountability (Part 1)

Wells Fargo, a bank that made more than $80 billion in revenue and has a market value of $277 billion, was fined $185,000,000 by federal regulators for creating 1.5 million fake credit card accounts. In the plea bargain that regulators made with bank officials, Wells Fargo admitted no responsibility for the financial misconduct. The company had fired more than 5,000 of its lowest-paid employees, but neither the senior vice-president for community banking, where the fraud occurred, nor the CEO lost their positions. CEO John Stumpf, named in 2013 as Morningstar’s CEO of the Year and earning about $20 million a year, did face U.S. Senate Banking Committee questions about the phony accounts last week. In testimony, the CEO did say “I take full responsibility for all of the unethical practices in our retail banking business.” A member of the Banking Committee, Senator Elizabeth Warren (Dem.-Mass.), said what the bank did was a “scam” and that Stumpf “should resign… and you should be criminally investigated.”

Looking back at the fallout from the Great Recession of 2008 in lost billions of investors’ dollars, millions of home foreclosures, and crushed hopes of a generation of hard-working American retirees, and apart from one senior trader at Credit Suisse who was convicted and served 30 months, not one single CEO of an investment house, bank or insurance company hip-deep in deceiving and defrauding Americans was indicted or served a day in jail. Yes, federal regulators fined other banks like JPMorgan Chase and Bank of America billions of dollars but they, like Wells Fargo, admitted no unlawful conduct and took no responsibility for their actions (see here, here, here, and here). Contrast that with the savings-and-loan bank failures in the 1980s when over 1,000 bankers went to jail for fraud and similar charges. That was then, this is now.

Immunity from accountability is currently widespread in the private sector. But not in the public sector.

Take the case of the Atlanta Public Schools and the cheating scandal between 2009 and 2015. Superintendent Beverly Hall led the district between 1999 and 2010. In 2009, she was named Superintendent of the Year by the American Association of School Administrators. After an investigation by state officials in 2011, triggered by the Atlanta Journal-Constitution revelations in 2009 that nearly 180 teachers and officials in 44 schools raised students’ test scores, Hall and 31 teachers and administrators were indicted and stood trial. Most of these educators took plea deals; Hall died of breast cancer during the trial. Eleven educators accused of tampering with students’ test scores were convicted in 2015 and are now serving from one to seven years in Georgia prisons.

No immunity from accountability here.

Lawyers and historians often say that before rushing to judgment, one must become familiar with the circumstances, the organizational setting and the mind-set of those who committed the crimes. So what were the contexts for Wells Fargo’s fraud and Atlanta’s cheating scandals?

Wells Fargo

Beginning as early as 2009, individual employees, many of whom earned less than $15 an hour, were expected to sell Wells Fargo products (e.g., credit cards, overdraft protection, checking and savings accounts) to existing customers in order to meet their monthly goals. If they fell short, sales representatives were written up, reprimanded or let go. Managers put intense pressure on their employees to meet these targets. Rita Murillo, a bank manager who left the company, said: “We were constantly told we would end up working for McDonald’s. If we did not make the sales quotas … we had to stay for what felt like after-school detention, or report to a call session on Saturdays.”

Wells Fargo quarterly profits continued to climb in the years following the Great Recession. Investors were pleased.

As the years passed, word of bogus credit cards, checking and savings accounts and angry customers leaked out. The Los Angeles Times published an exposé of the practices in 2013. The intense race to meet monthly goals created a culture where sales staff were pushed again and again to meet their targets or else. Phone calls from bosses were dreaded. After newspaper articles appeared, managers fired employees. Even after the LA Times’ revelations of these practices and of the dog-eat-dog ethos at Wells Fargo, the opening of bogus credit cards and new accounts continued. Then state and federal regulators entered the picture. Fines were levied against Wells Fargo but not one senior executive was either admonished or forced to resign.

This is the context for Wells Fargo (see here, here, and here).

The Atlanta Public Schools

The high-poverty, mostly black district had struggled for decades with low graduation and high dropout rates and state test scores near the bottom of Georgia’s public school systems. Within the segregated district–there are a few largely white schools and the rest are largely black–academic gaps between white and black students have been large and persistent (e.g., majority-white Grady High School graduates 82 percent of its students while majority-black Douglass High School graduates 42 percent).

Pressure to raise state test scores and graduation rates rose and fell as superintendents came and went in the 1990s. With the appointment of Beverly Hall in 1999 and the passage of the federal No Child Left Behind law (2002), that pressure increased considerably. Rewards and sanctions accompanied goals of raising test scores across the district. All teachers in schools meeting 70 percent of their goal, for example, would receive bonus payments. The superintendent’s contract had a similar provision for increases to her salary. Sanctions for low test performance under NCLB led to closed schools, fired principals, and reprimands for district office administrators not meeting state and federal goals under Adequate Yearly Progress (AYP).

Hall was determined to improve Atlanta’s student performance. And the numbers rose over the years. Bonuses went to many schools and the superintendent. Rumors of tampering with test scores circulated and were dismissed. A number of teachers reported principals fiddling with test score results. Nothing happened except strong district office messages to be quiet or leave. A culture of fear blanketed schools. Then the Atlanta Journal-Constitution investigated the rumors and published its startling report in 2009 on how much adult cheating occurred on district tests. State officials then completed their investigation in 2011 (see here).

The results of that investigation led to charging the superintendent, principals and teachers in over three dozen schools with changing student test scores. The report pointed to the intense pressure to raise test scores and the pervasive fear among school employees of retaliation if anyone reported abuses. Some quotes from the state inquiry:

*“Throughout this investigation numerous teachers told us they raised concerns about cheating and other misconduct to their principal or SRT [School Reform Team] … only to end up disciplined or terminated.”

*“[T]he message was: ‘Get the scores up by any means necessary;’ in Dr. Hall’s words, ‘No exceptions and no excuses.’”

*“In sum, a culture of fear, intimidation and retaliation permeated the APS system from the highest ranks down.”

At both Wells Fargo and the Atlanta public schools, hard-driving managerial pressures created fear-strewn workplaces where success-filled data became the goal. Similar contexts turned up in a private and a public institution.

Yet accountability for fraud in these two institutions differed greatly. How come?

Part 2 tries to answer that question.

Filed under school leaders

Don’t Grade Schools on Grit (Angela Duckworth)

Angela Duckworth is the founder and scientific director of the Character Lab, a professor of psychology at the University of Pennsylvania and the author of the forthcoming book “Grit: The Power of Passion and Perseverance.” This op-ed appeared in the New York Times, March 26, 2016.

THE Rev. Dr. Martin Luther King Jr. once observed, “Intelligence plus character — that is the goal of true education.”

Evidence has now accumulated in support of King’s proposition: Attributes like self-control predict children’s success in school and beyond. Over the past few years, I’ve seen a groundswell of popular interest in character development.

As a social scientist researching the importance of character, I was heartened. It seemed that the narrow focus on standardized achievement test scores from the years I taught in public schools was giving way to a broader, more enlightened perspective.

These days, however, I worry I’ve contributed, inadvertently, to an idea I vigorously oppose: high-stakes character assessment. New federal legislation can be interpreted as encouraging states and schools to incorporate measures of character into their accountability systems. This year, nine California school districts will begin doing this.

Here’s how it all started. A decade ago, in my final year of graduate school, I met two educators, Dave Levin, of the KIPP charter school network, and Dominic Randolph, of Riverdale Country School. Though they served students at opposite ends of the socioeconomic spectrum, both understood the importance of character development. They came to me because they wanted to provide feedback to kids on character strengths. Feedback is fundamental, they reasoned, because it’s hard to improve what you can’t measure.

This wasn’t entirely a new idea. Students have long received grades for behavior-related categories like citizenship or conduct. But an omnibus rating implies that character is singular when, in fact, it is plural.

In data collected on thousands of students from district, charter and independent schools, I’ve identified three correlated but distinct clusters of character strengths. One includes strengths like grit, self-control and optimism. They help you achieve your goals. The second includes social intelligence and gratitude; these strengths help you relate to, and help, other people. The third includes curiosity, open-mindedness and zest for learning, which enable independent thinking.

Still, separating character into specific strengths doesn’t go far enough. As a teacher, I had a habit of entreating students to “use some self-control, please!” Such abstract exhortations rarely worked. My students didn’t know what, specifically, I wanted them to do.

In designing what we called a Character Growth Card — a simple questionnaire that generates numeric scores for character strengths in a given marking period — Mr. Levin, Mr. Randolph and I hoped to provide students with feedback that pinpointed specific behaviors.

For instance, the character strength of self-control is assessed by questions about whether students “came to class prepared” and “allowed others to speak without interrupting”; gratitude, by items like “did something nice for someone else as a way of saying thank you.” The frequency of these observed behaviors is estimated using a seven-point scale from “almost never” to “almost always.”

Most students and parents said this feedback was useful. But it was still falling short. Getting feedback is one thing, and listening to it is another.

To encourage self-reflection, we asked students to rate themselves. Thinking you’re “almost always” paying attention but seeing that your teachers say this happens only “sometimes” was often the wake-up call students needed.

This model still has many shortcomings. Some teachers say students would benefit from more frequent feedback. Others have suggested that scores should be replaced by written narratives. Most important, we’ve discovered that feedback is insufficient. If a student struggles with “demonstrating respect for the feelings of others,” for example, raising awareness of this problem isn’t enough. That student needs strategies for what to do differently. His teachers and parents also need guidance in how to help him.

Scientists and educators are working together to discover more effective ways of cultivating character. For example, research has shown that we can teach children the self-control strategy of setting goals and making plans, with measurable benefits for academic achievement. It’s also possible to help children manage their emotions and to develop a “growth mind-set” about learning (that is, believing that their abilities are malleable rather than fixed).

This is exciting progress. A 2011 meta-analysis of more than 200 school-based programs found that teaching social and emotional skills can improve behavior and raise academic achievement, strong evidence that school is an important arena for the development of character.

But we’re nowhere near ready — and perhaps never will be — to use feedback on character as a metric for judging the effectiveness of teachers and schools. We shouldn’t be rewarding or punishing schools for how students perform on these measures.

MY concerns stem from intimate acquaintance with the limitations of the measures themselves.

One problem is reference bias: A judgment about whether you “came to class prepared” depends on your frame of reference. If you consider being prepared arriving before the bell rings, with your notebook open, last night’s homework complete, and your full attention turned toward the day’s lesson, you might rate yourself lower than a less prepared student with more lax standards.

For instance, in a study of self-reported conscientiousness in 56 countries, it was the Japanese, Chinese and Korean respondents who rated themselves lowest. The authors of the study speculated that this reflected differences in cultural norms, rather than in actual behavior.

Comparisons between American schools often produce similarly paradoxical findings. In a study colleagues and I published last year, we found that eighth graders at high-performing charter schools gave themselves lower scores on conscientiousness, self-control and grit than their counterparts at district schools. This was perhaps because students at these charter schools held themselves to higher standards.

I also worry that tying external rewards and punishments to character assessment will create incentives for cheating. Policy makers who assume that giving educators and students more reasons to care about character can be only a good thing should take heed of research suggesting that extrinsic motivation can, in fact, displace intrinsic motivation. While carrots and sticks can bring about short-term changes in behavior, they often undermine interest in and responsibility for the behavior itself.

A couple of weeks ago, a colleague told me that she’d heard from a teacher in one of the California school districts adopting the new character test. The teacher was unsettled that questionnaires her students filled out about their grit and growth mind-set would contribute to an evaluation of her school’s quality. I felt queasy. This was not at all my intent, and this is not at all a good idea.

Does character matter, and can character be developed? Science and experience unequivocally say yes. Can the practice of giving feedback to students on character be improved? Absolutely. Can scientists and educators work together to cultivate students’ character? Without question.

Should we turn measures of character intended for research and self-discovery into high-stakes metrics for accountability? In my view, no.

Filed under testing

Judging Success and Failure of Schools and Districts: Whose Criteria Count?

The dominant standard used by most policymakers, media editors, and administrators to judge success is effectiveness: Have you done what you said you were going to do and can you prove it? In a society where “bottom lines,” Dow Jones averages, sports statistics, and vote-counts matter, quantifiable results determine success. No Child Left Behind and its focus on standardized test scores is effectiveness on steroids.

Yet even before No Child Left Behind, policymakers had relied on the effectiveness standard to examine what students have learned by using proxy measures such as state test scores, college attendance, and other indicators. For example, in the late 1970s policymakers concluded that public schools had declined because Scholastic Aptitude Test (SAT) scores had plunged. Even though test-makers and researchers repeatedly stated that such claims were false, falling SAT scores fueled public support for states raising academic requirements in the 1980s. What mattered most to decision-makers and media were numbers that could be used to establish school rankings, thereby creating easily identifiable winners and losers.

Note, however, that test results in some instances proved unhelpful in measuring a reform’s success. Consider the mid-1960s evaluations of Title I of the Elementary and Secondary Education Act (ESEA). They revealed little improvement in low-income children’s academic performance, thereby jeopardizing Congressional renewal of the program. Such evidence gave critics hostile to federal initiatives reasons to brand President Lyndon Johnson’s War on Poverty programs as failures.

Low test scores, however, failed to diminish the program’s political attractiveness to constituents and legislators. Each successive president and Congress has used that popularity as a basis for allocating funds to needy students in schools across the nation, including under No Child Left Behind.

Popularity, then, is a second standard that public officials use in evaluating success. The spread of an innovation and its hold on the imagination of voters have meant that fashionableness can translate into political support for reform. Special education, bilingual education, accountability, and computers in schools, all of which have spread rapidly since the 1980s, are instances of innovations that captured both policymakers’ and practitioners’ attention. Few educators or public officials questioned large outlays of public funds for these popular reforms because they were perceived, at least at first, as resounding successes.

A third standard used to judge success is how well innovations mirror what reformers intended. This fidelity standard assesses the fit between the initial design, the formal policy, the subsequent program, and its implementation.

Champions of the fidelity standard ask: How can anyone determine effectiveness if the reform departs from the blueprint? If federal, state, or district policymakers, for example, adopt and fund a new reading program because it has proved to be effective elsewhere, local implementers (e.g., teachers and principals) must follow the original program design as they put it into practice or else the desired outcomes will not be achieved. When practitioners add, adapt, or even omit features of the original design, then policymakers, heeding this standard, say that the policy and program cannot be determined effective because of these changes.

Where do these dominant standards of effectiveness, popularity, and fidelity come from? Policymakers derive the criteria of effectiveness and fidelity from viewing organizations as rational tools for achieving desired goals. Through top-down authority, formal structures, clearly specified roles, and technical expertise, administrators and practitioners can get the job done.

Within organizations where rational decision-making and control are prized, policymakers ask: Have the prescribed procedures been followed (fidelity) and have the goals been achieved (effectiveness)? Hence, in judging reforms, those who carry out the changes must be faithful to the design before the standard of effectiveness in achieving goals is invoked.

Popularity as a standard in judging success, of course, comes from the political domain. Schools depend upon taxpayers voting the funds to operate them. What voters determine is successful–regardless of the lack of or ambiguity in the evidence–gets renewed year after year.

The authority and therefore the power to put into place one or more of these criteria in the U.S. derive from the 50 states (see the Tenth Amendment to the U.S. Constitution). States establish local districts which directly govern their schools–there are about 14,000 districts in the U.S. California has over 1,000 districts, Virginia has 227, and the state of Hawaii governs all of its schools as one district. States, then, set overall criteria for success. Most states choose effectiveness criteria with occasional bows to popularity and fidelity. Local districts run the schools and try to meet those criteria. Since 2002, however, federal legislation (yes, the No Child Left Behind Act) has set effectiveness criteria, namely test scores, for the states, which then, in turn, demand that local districts adhere to that standard. The entire debate in the U.S. Congress to reauthorize NCLB has hinged upon who will have the authority to set the criteria for success, the federal or state government.

Filed under school reform policies

Pay-for-Performance for CEOs and Teachers

Oracle CEO Larry Ellison earns $37,692 an hour. No, that is not a typo or misplaced comma. Ellison’s annual salary ran $78.4 million, much of it in stock option awards. His salary was based on the annual performance of the company’s stock. Oracle’s Board of Directors set the pay scale (Ellison owns one-fourth of the company’s shares) to spur better management to increase profits and shareholders’ dividends.

They pay Ellison to perform well on the metric they have chosen (“company earnings before income taxes minus the costs of stock-based compensation, acquisitions, restructuring, and other items”). This metric is not, however, the one used by other major corporations for paying their top person. I return to the point of different measures used by companies to judge CEO performance later.

Switch now to the average U.S. public school teacher who earns an annual salary of over $55,000. That figure translates to around $27.00 an hour for a 40-hour week. Like Ellison, hundreds of thousands of teachers are involved in pay-for-performance plans. In response to the federal Race to the Top competition, many states have mandated that teachers’ performance and salary be tied to students’ test scores to spur better teaching and student learning. Those test scores, as a factor in assessing effectiveness and determining salary (or bonuses), can account for anywhere from one-quarter to more than half of the decision to set salary and retain or fire a teacher.
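For readers who want to check the hourly figures, here is a rough sketch of the arithmetic. It assumes a full-time year of 40 hours a week for 52 weeks (2,080 paid hours), which is my assumption about the basis for both numbers; teachers' contracted work years are typically shorter, so treat the teacher figure as illustrative only. The function and variable names are made up for this example.

```python
# Rough check of the hourly pay figures quoted above.
# Assumption (not stated in the post): a full-time year is
# 40 hours x 52 weeks = 2,080 paid hours for both CEO and teacher.
HOURS_PER_WEEK = 40
WEEKS_PER_YEAR = 52


def hourly_rate(annual_pay: float) -> float:
    """Convert annual pay to an hourly rate under the 2,080-hour assumption."""
    return annual_pay / (HOURS_PER_WEEK * WEEKS_PER_YEAR)


ceo_hourly = hourly_rate(78_400_000)   # Ellison's reported annual pay
teacher_hourly = hourly_rate(55_000)   # lower bound of "over $55,000"

print(f"CEO:     ${ceo_hourly:,.0f} an hour")
print(f"Teacher: ${teacher_hourly:,.2f} an hour")
print(f"Ratio:   about {ceo_hourly / teacher_hourly:,.0f} to 1")
```

Under that assumption the CEO figure matches the $37,692 quoted above, and the teacher figure comes out a little over $26 an hour at exactly $55,000, which is consistent with "around $27" for a salary somewhat above that amount.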

While I have written about this pay-for-performance reform over the past few years (see here, here, and here), for this post I want to inspect how the private sector–often a model for U.S. school reform–has its own problems, often undisclosed by business-oriented champions of school metrics, in determining CEO pay.

The lesson to learn from this post is: Paying for performance, in companies and in schools, is as flawed as the measures used to determine it.

A recent study of the metrics used in 195 large companies over the past five years showed that the most popular gauge measuring CEO performance was “total shareholder return.” Over half of the companies using that measure, however, lost nearly two percent over the five-year period. Companies using less popular equations such as “earnings-per-share growth” gained almost three percent.

Now, here’s the clincher. Most companies judging CEO performance are relying on a metric that yielded losses for investors (“total shareholder return”) yet, at the same time, those very same companies continued to give their CEOs substantial raises year after year.

The authors of the study believe that the popularity of the performance measure, i.e., “total shareholder return,” stems from how easy it is for boards of directors and CEOs to manipulate the metric by “removing costs from the equation” such as “discontinuing product lines or closing factories.” Boards of directors then can reward CEOs with higher compensation packages. Earnings-per-share growth, a less popular metric and one of multiple measures that many firms use, sorts out under-performing from high performing firms, the authors found. This one as well as other measures, they concluded, are less easily manipulated by top corporate officials. CEO pay, then, can be better associated with company performance.

The main takeaways from this study are that boards of directors and CEOs do manipulate the numbers, that “one size does not fit all when measuring pay for performance,” and that multiple measures for determining effectiveness and salary have a better chance of capturing performance than single ones do.

Now, consider teacher pay-for-performance where one measure–student test scores–is often used to determine to what degree a teacher is effective. As with “total shareholder return,” there are serious problems with using this metric alone or even in concert with other measures to judge teacher performance (see here and here).

Consider the following:

Incentives corrupt measures.

Since the mid-1970s, social scientists have criticized the use of specific quantitative measures to monitor or steer policies because those implementing such policies alter their practices to ensure better numbers. The work of social scientist Donald T. Campbell and economists in the mid-1970s on the perverse outcomes of incentives was available but has largely been ignored. Campbell wrote in 1976:

“The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor” (Campbell 1976, p. 54).

Campbell used examples drawn from statistics on police solving crimes (p. 55), the Soviets setting numerical goals in industry (p. 57), and the U.S.’s use of “body counts” in Vietnam as evidence of winning (p. 58). For public schools, Campbell said that “achievement tests are … highly corruptible indicators” (p. 57).

That was nearly forty years ago. In the past decade, researchers have documented (also see here) the link between standardized test scores and narrowed instruction to prepare students for test items, instances of state policymakers fiddling with cut-off scores on tests, increased dropouts, and outright cheating. Although how the distortions occur is unclear, the evidence confirms Campbell’s insight.

Easy To Measure Indicators Trump Hard To Measure Ones

Few in business, medicine or education question that some indicators are easier to quantify than others. In medicine, for example, hospital mortality and surgical procedures are fairly easy to measure but the results even when compared to other hospitals and surgeons hide as much as they reveal about effective health care. So it is with standardized tests.

Because test scores are inexpensive and efficient to collect, they draw attention away from important but hard-to-measure aspects of teaching and learning such as student engagement, rapport between teachers and students, academic climate in classroom and school, and principal leadership. Cumulative practitioner experience and stories about teaching over centuries have established these as crucial factors in working with gifted and vulnerable students.

*******************************************************************

These are known results of using single measures to judge individual or organizational performance. Consequences of their use can be anticipated. Historical examples abound. Some districts (e.g., Denver) wisely have moved to using multiple measures with student outcomes included that go beyond test scores but in most states where such mandates reign, test scores still remain a major part of the equation used to judge teacher performance (e.g., New York City, Washington, D.C., Houston, Texas) and allocate bonuses to teachers and principals.

This manipulation of data and one-size-fits-all measures show up in businesses as well as schools, raising serious questions about the worth of this frenetic passion for pay-for-performance in both public and private sectors.

In the meantime, if Oracle’s Larry Ellison read this post in his office (say, in 10 minutes), he would have earned over $6,000. Ah, to be a CEO.

Filed under school reform policies

Chinese Third Graders Fall Behind U.S. Students (The Onion)*

CHESTNUT HILL, MA—According to an alarming new report published Wednesday by the International Association for the Evaluation of Educational Achievement, third-graders in China are beginning to lag behind U.S. high school students in math and science.

The study, based on exam scores from thousands of students in 63 participating countries, confirmed that in mathematical and scientific literacy, American students from the ages of 14 to 18 have now actually pulled slightly ahead of their 8-year-old Chinese counterparts.

“This is certainly a wake-up call for China,” said Dr. Michael Fornasier, an IEA senior fellow and coauthor of the report. “The test results unfortunately indicate that education standards in China have slipped to the extent that pre-teens are struggling to rank among even the average American high school student.”

“Simply put, how can these third-graders be expected to eventually compete in the global marketplace if they’re only receiving the equivalent of a U.S. high school education?” Fornasier added.

Fornasier stressed that while the gap is not yet dramatically sizable, it has widened over the past two years after American high schoolers tested marginally higher in algebra, biology, and chemistry than, shockingly, most of China’s 8- and 9-year-olds.

“For decades, young children in China have scored at the expected level of their peers in American high schools, so this is a very worrying drop in performance,” said Fornasier, adding that the majority of Chinese third-graders are now a full year behind the average U.S. 12th-grader in their knowledge of calculus. “In the chemistry portion of the exam, for example, Chinese children proved to be slightly deficient compared to American teenagers in their understanding of the periodic table, molecular structure, and the essential principles of atomic theory.”

“And even when they did test at the same level in mathematics, it often took Chinese elementary school students 10 to 15 minutes longer to do simple things like factor a polynomial equation or compute the derivative of a continuous function,” Fornasier added. “That just isn’t normal.”

In addition to disappointing marks from grade school children in China, 10-year-olds in Germany, South Korea, Japan, Switzerland, and New Guinea also reportedly tested an average of three percentage points lower than U.S. high school seniors in physics, with education officials from each country expressing deep concerns about the increasingly mediocre quality of their primary schools.

In light of the alarming study, many in China have called for considerable reforms of the country’s education system, including implementing far stricter standards for teachers, investing in better learning materials, and increasing the length of school days.

“Our third grade classes clearly cannot afford to lag behind American high schools if they are to be successful in the future,” read an official statement from China’s Minister of Education, Yuan Guiren. “Frankly, the scores are unacceptable, and we have to turn this around immediately. If there’s an American 17-year-old who can do something academically that a Chinese 8-year-old can’t, that’s a very big problem.”

“At that rate, how do we expect our Chinese 13-year-olds to be ready for American colleges?” Yuan continued.

________________________

*In case any readers have reached the end of the piece without figuring out that The Onion specializes in satire, parody, and comic humor, I want them to know that this is a fictitious article poking fun at U.S. school reformers’ obsessive focus on international test score comparisons, the supposed high quality of Chinese education and perceived low academic quality of U.S. high schools.

Thanks to Joel Westheimer for sending this piece to me.


Filed under school reform policies

Buying iPads, Common Core Standards, and Computer-Based Testing

The tsunami of computer-based testing for public school students is on the horizon. Get ready.

For adults, computer-based testing has been around for decades. For example, I have taken and re-taken the California online test to renew my driver’s license twice in the past decade. To get certified as a volunteer driver for Packard Children’s Hospital in Palo Alto, I had to read gobs of material about hospital policies and federal regulations on confidentiality before taking a series of computer-based tests. To obtain approval from Stanford University for a research project of which I am the principal investigator and in which I would interview teachers and observe classrooms, I had to read online a massive amount of material on university regulations about consent of subjects to participate, confidentiality, and handling of information obtained from interviews and classroom observations. And again, I took online tests that I had to pass in order to gain approval from the University to conduct research. Beyond the California Department of Motor Vehicles, Children’s Hospital, and Stanford University, online assessment has been a staple in the business sector from hiring through employee evaluations. So online testing is already part of adult experience.

What about K-12 students? Increasingly, districts are adopting computer-based testing. For example, Measures of Academic Progress, a popular test used in many districts, is online. Speeding up this adoption of computer-based testing are the Common Core Standards and the two consortia that are preparing assessments for the 45 states on the cusp of implementing the Standards. Many states have already mandated online testing for their own standardized tests to get prepared for impending national assessments. These tests will require students to have access to a computer with the right hardware, software, and bandwidth to accommodate online testing by 2014-2015 (see here, here, and here).

There are many pros and cons with online testing as compared with, say, paper-and-pencil tests. But whatever the pros are for paper-and-pencil tests, they are outslugged and outstripped by the surge of buying new devices and piloting of computer-based tests to get ready for Common Core assessments (see here and here). The Los Angeles Unified School District, the second largest in the nation, just signed a $50 million contract with Apple for iPads. One of the key reasons to buy these devices for the initial rollout in 47 schools was Common Core standards and assessment. Each iPad comes with an array of pre-loaded software compatible with the state online testing system and impending national assessments. The entire effort is called The Common Core Technology Project.

The best (and most recent) gift to the hardware and software industry has been the Common Core standards and assessments. At a time of fiscal retrenchment in school districts across the country when schools are being closed and teachers are let go, many districts have found the funds to go on shopping sprees to get ready for the Common Core.

And here is the point that I want to make. The old reasons for buying technology have been shunted aside for a sparkling new one. Consider that for the past three decades the rationale for buying desktop computers, laptops, and now tablets has been three-fold:

1. Make schools more efficient and productive so that students learn more, faster, and better than they had before.

2. Transform teaching and learning into an engaging and active process connected to real life.

3. Prepare the current generation of young people for the future workplace.

After three decades of rhetoric and research, teachers, principals, students, and vendors have their favorite tales to prove that these reasons have been achieved. But for those who want more than Gee Whiz stories, who seek a reliable body of evidence that shows students learning more, faster, and better, that shows teaching and learning to have been transformed, that shows using these devices has prepared the current generation for actual jobs, well, that body of evidence is missing for each of these traditional reasons to buy computers.

With Common Core standards adopted, the rationale for getting devices has shifted. No longer does it  matter whether there is sufficient evidence to make huge expenditures on new technologies. Now, what matters are the practical problems of being technologically ready for the new standards and tests in 2014-2015: getting more hardware, software, additional bandwidth, technical assistance, professional development for teachers, and time in the school day to let students practice taking tests.

Whether the Common Core standards will improve student achievement–however measured–whether students learn more, faster, and better–none of this matters in deciding on which vendor to use. The question is not whether to buy. The question is: how much money do we have and when can we get the devices? That is the tidal wave on the horizon.


Filed under technology, testing

Cartoons from Robert Rendo

For this month’s feature on cartoons*, I chose a selection from Robert Rendo. We met through my blog and he sent along a sampling of his cartoons from which I selected some  on students and testing. He sent me the following description of himself.

Robert Rendo grew up in New Hyde Park, New York. He lives in New York City and in Massachusetts with his wife Rachel, an educator and children’s clothing designer.

Robert Rendo is an editorial illustrator and president of PoliticalCartoonsOnline.com. A native New Yorker, Mr. Rendo’s  work has earned him publication in the Op/Ed section of the New York Times, the Chicago Tribune, and the Sacramento Bee. Recently, Rendo designed the logo, masthead, and branding for educational historian Diane Ravitch’s Network for Public Education. His images can also be found on Stephen Krashen’s blog, Education Notes Online, and the blog “Susan Ohanian Speaks Out”.

“I hope to provoke all readers,” says Rendo. “Editorial art is a precise genre.”

When asked how he foresees illustration in media, Rendo said, “I think it’ll be shared more equally between hardcopy and the internet. For me, it’s more satisfying to hold a periodical in your fingers and turn the pages. Whatever the format, illustration will always help chronicle the follies of man. And we humans screwing up never seems to be out of vogue.”

[Seven cartoons by Robert Rendo on students and testing]

_________________

In previous months, the following cartoons have been posted: “Digital Kids in School,” “Testing,” “Blaming Is So American,” “Accountability in Action,” “Charter Schools,” “Age-graded Schools,” “Students and Teachers,” “Parent-Teacher Conferences,” “Digital Teachers,” “Addiction to Electronic Devices,” “Testing, Testing, and Testing,” “Business and Schools,” “Common Core Standards,” “Problems and Dilemmas,” “Digital Natives (2),” “Online Courses,” “Students and Teachers Again,” “Doctors and Teachers,” “Parent/Teacher Conferences,” “Preschools,” and “Life at Lincoln Middle School.”


Filed under school reform policies

Testing, Testing, and Testing: More Cartoons

The U.S. has tests galore. Driving, alcohol, steroids, DNA, citizenship, blood,  pregnancy–and on and on. Most serve a specific purpose and carry personal consequences if one passes or fails. School tests, however, to pass a course, to be promoted to another grade, to graduate and to judge whether the school is satisfactory or on probation have proliferated dramatically in the past three decades. Opinions are split among Americans about these tests.

Surveys report that most teachers (but by no means all) believe that there is too much standardized testing. Some parents have mobilized to boycott annual tests. Most respondents to opinion polls, however, support curriculum standards, accountability, and, yes, state tests.

Of the many cartoons on testing that I have located, most reflect the opinion that there is too much testing and too much is made of the results. I have found very few–none that I can recall or that I have posted–endorsing standardized tests. Here is a sampling of those cartoons.

For those readers who wish to see previous monthly posts of cartoons, see: “Digital Kids in School,” “Testing,” “Blaming Is So American,” “Accountability in Action,” “Charter Schools,” “Age-graded Schools,” “Students and Teachers,” “Parent-Teacher Conferences,” “Digital Teachers,” and “Addiction to Electronic Devices.”


Filed under testing

Remembering Test Scores and Learning about Regression toward the Mean

Here is a story about test scores. I was superintendent of the Arlington (VA) public schools from 1974 to 1981. In 1979 something happened that both startled me and gave me insight into the public power of test scores. The larger lesson, however, came years after I left the superintendency when I began to understand the powerful drive that we have to explain something, anything, by supplying a cause, any cause, just to make sense of what occurred.

In Arlington then, the school board and I were responsible for a district that had declined in population (from 20,000 students to 15,000) and had become increasingly minority (from 15 percent to 30). The public sense that the district was in free-fall decline, we felt, could be arrested by concentrating on academic achievement, critical thinking, expanding the humanities, and improved teaching. After five years, both the board and I felt we were making progress.

State  test scores–the coin of the realm in Arlington–at the elementary level climbed consistently each year. The bar charts I presented at press conferences looked like a stairway to the stars and thrilled school board members. When scores were published in local papers, I would admonish the school board to keep in mind that these scores were  a very narrow part of what occurred daily in district schools. Moreover, while scores were helpful in identifying problems, they were largely inadequate in assessing individual students and teachers. My admonitions were generally swept aside, gleefully I might add, when scores rose and were printed school-by-school in newspapers. This hunger for numbers left me deeply skeptical about standardized test scores as signs of district effectiveness.

Then along came a Washington Post article in 1979 that showed Arlington to have edged out Fairfax County, an adjacent and far larger district, as having the highest Scholastic Aptitude Test (SAT) scores among eight districts in the metropolitan area (yeah, I know it was by one point, but when test scores determine winners and losers in a horse race, Arlington had won by a nose).

I knew that SAT results had nothing whatsoever to do with how our schools performed. It was a national standardized instrument to predict college performance of individual students; it was not constructed to assess district effectiveness. I also knew that the test had little to do with what Arlington teachers taught. I told that to the school board publicly and anyone else who asked about the SATs.

Nonetheless, the Post article with the box-score of  test results produced more personal praise, more testimonials to my effectiveness as a superintendent, and, I believe, more acceptance of the school board’s policies than any single act during the seven years I served. People saw the actions of the Arlington school board and superintendent as having caused those SAT scores to outstrip other Washington area districts.

That is what I remember about the test scores in Arlington and that Post article in 1979.

Since then, I have learned about “regression toward the mean.” It was an eye-opener. One psychologist defines regression toward the mean as “random fluctuations in the quality of performance,” meaning that both luck and skill are involved but randomness is the key.

Examples of this statistical concept are easy to find: athletes whose rookie year is outstanding slump in their second year; best-selling debut novelists write a subsequent book that tanks; hot TV shows soar in their initial season and then get low ratings the next year. They “regress to the mean,” or average.

Another example from Wikipedia:

“A class of students takes two editions of the same test on two successive days….[T]he worst performers on the first day will tend to improve their scores on the second day, and the best performers on the first day will tend to do worse on the second day. The phenomenon occurs because student scores are determined in part by underlying ability and in part by chance. For the first test, some will be lucky, and score more than their ability, and some will be unlucky and score less than their ability. Some of the lucky students on the first test will be lucky again on the second test, but more of them will have (for them) average or below average scores. Therefore a student who was lucky on the first test is more likely to have a worse score on the second test than a better score. Similarly, students who score less than the mean on the first test will tend to see their scores increase on the second test.”
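The two-test example in the quotation can be tried out in a few lines of code. What follows is only a minimal sketch under stated assumptions: the 10,000 simulated students, the average score of 70, and the spread of 10 points for both ability and luck are invented numbers chosen for illustration, not data from Arlington or any real test. Each student gets a fixed "ability" and a fresh dose of "luck" on each day, exactly as the quotation describes.

```python
import random
import statistics


def simulate_two_tests(n_students=10_000, ability_sd=10.0, luck_sd=10.0, seed=0):
    """Return (day1, day2) score lists for the same simulated students.

    Every score is a stable ability plus fresh random luck.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    abilities = [rng.gauss(70.0, ability_sd) for _ in range(n_students)]
    day1 = [a + rng.gauss(0.0, luck_sd) for a in abilities]
    day2 = [a + rng.gauss(0.0, luck_sd) for a in abilities]
    return day1, day2


def average_change(day1, day2, indices):
    """Mean of (day-2 score minus day-1 score) for the chosen students."""
    return statistics.mean(day2[i] - day1[i] for i in indices)


day1, day2 = simulate_two_tests()

# Rank students by their day-1 score only, lowest first.
ranked = sorted(range(len(day1)), key=lambda i: day1[i])
tenth = len(ranked) // 10

bottom_change = average_change(day1, day2, ranked[:tenth])   # worst 10% on day 1
top_change = average_change(day1, day2, ranked[-tenth:])     # best 10% on day 1
overall_change = statistics.mean(day2) - statistics.mean(day1)

print(f"Worst day-1 performers, average change: {bottom_change:+.1f}")
print(f"Best day-1 performers, average change:  {top_change:+.1f}")
print(f"Whole class, average change:            {overall_change:+.1f}")
```

In this sketch no one's ability changes between the two days and nothing is taught in between, yet the best day-1 performers drop on average and the worst improve, while the class average barely moves. The movement comes entirely from the luck component.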

Because our mind loves causal explanations, we say that those students, those athletes, those novelists performed well and then had a bad year because their smarts and skills deteriorated. We say this instead of realizing and acknowledging that, with regression toward the mean, unusually good performance is usually followed by poorer performance (and vice versa) not because of talent and skill failing but because of luck and the “inevitable fluctuations of a random process.”

And that is how I came to see that the one-point victory that Arlington achieved in the SATs in 1979 was not the result of the school board’s and superintendent’s efforts but an instance of luck and the statistical chances embedded in regression toward the mean.


Filed under school leaders, school reform policies

Accountability in Action–Cartoons

More than any other word, “accountability” has become the keyword defining the past quarter-century in private and public sectors of life in America. Presidents, governors, and mayors say that they answer to voters. CEOs and top managers proudly display their accountability to their boards of trustees. Owners of small and mid-size companies know that they are accountable to their customers. Appointed leaders and bureaucrats point to the outcomes they must meet in their evaluations. Or pay the consequences. So let’s call these political, market, and bureaucratic forms of accountability.

Anyone in K-12 or higher education knows that accountability is (and has been for decades) the magic word that opens doors for aspiring leaders and shows the exit to low-performing employees. For these institutions, “accountability imposes six demands” on educators at all levels that overlap these different versions of the accountability pervasive in the U.S.

“First, they must demonstrate that they have used their powers properly. Second, they must show that they are working to achieve the mission or priorities set for their office or organization. Third, they must report on their performance, for ‘power is opaque, accountability is public’ … Fourth, the two ‘E’ words of public stewardship–efficiency and effectiveness–require accounting ‘for the resources they use and the outcomes they create….’ Fifth, they must ensure the quality of the programs and services produced. Last, but far from least, they must show that they serve public needs.”

There are, then, political, market, and bureaucratic forms of accountability across private and public sectors in the U.S., including K-12 education. Schools are political inventions approved by voters and taxpayers and charged to carry out national and individual goals. With parental choice readily available, a version of customers buying in a market economy has developed in U.S. schooling. And, well, for bureaucratic accountability, K-12 schools in urban, suburban, and rural districts are hierarchical, rule-driven, and constantly reporting to superiors as well as being evaluated.

I found a sampling of cartoons that illustrate humorously and, at times, harshly, various features of accountability across public and private institutions.

POLITICAL ACCOUNTABILITY

MARKET ACCOUNTABILITY

EDUCATIONAL ACCOUNTABILITY

If readers come across other cartoons that cause chuckles or pinch (or both) on the different forms of accountability, please send them along.


Filed under Reforming schools