Tag Archives: research and practice

The Seductive Lure of Big Data: Practitioners Beware

Big Data beckons policymakers, administrators and teachers with eye-popping analytics and snazzy graphics. Here is Darrell West of the Brookings Institution laying out the case for teachers and administrators to use Big Data:

Twelve-year-old Susan took a course designed to improve her reading skills. She read short stories and the teacher would give her and her fellow students a written test every other week measuring vocabulary and reading comprehension. A few days later, Susan’s instructor graded the paper and returned her exam. The test showed that she did well on vocabulary, but needed to work on retaining key concepts.

In the future, her younger brother Richard is likely to learn reading through a computerized software program. As he goes through each story, the computer will collect data on how long it takes him to master the material. After each assignment, a quiz will pop up on his screen and ask questions concerning vocabulary and reading comprehension. As he answers each item, Richard will get instant feedback showing whether his answer is correct and how his performance compares to classmates and students across the country. For items that are difficult, the computer will send him links to websites that explain words and concepts in greater detail. At the end of the session, his teacher will receive an automated readout on Richard and the other students in the class summarizing their reading time, vocabulary knowledge, reading comprehension, and use of supplemental electronic resources.

In comparing these two learning environments, it is apparent that current school evaluations suffer from several limitations. Many of the typical pedagogies provide little immediate feedback to students, require teachers to spend hours grading routine assignments, aren’t very proactive about showing students how to improve comprehension, and fail to take advantage of digital resources that can improve the learning process. This is unfortunate because data-driven approaches make it possible to study learning in real-time and offer systematic feedback to students and teachers (education technology west-1).

West sees teachers and administrators as data scientists mining information, tracking individual student and teacher performance and making subsequent changes based on the data. Unfortunately, so much of the hype for using Big Data ignores time, place, and people.

Context matters.

Consider what occurred when Nick Bilton, a New York University journalist and adjunct professor, designed a project for his graduate students in a course called “Telling Stories with Data, Sensors, and Humans.” Could sensors, Bilton and students asked, be reporters, collect information, and tell what happened?

The students built small electronic machines with sensors that could detect motion, light, and sound. They then asked the straightforward question whether students in the high-rise classroom building used the elevators more than the stairs and whether they shifted from one to the other during the day. They set the devices in some elevators and stairwells. Instead of a human counting students, a machine did.

Bilton and his graduate students were delighted with the results. They found that students seemed to use the elevators in the morning “perhaps because they were tired from staying up late, and switch to the stairs at night, when they became energized.”

That night when Bilton was leaving the building, the security guard who watched students set up the devices in elevators asked him what happened with the experiment. Bilton said that the sensors had captured students taking elevators in the morning and stairs at night. The security guard laughed and told Bilton: “One of the elevators broke down a few evenings last week, so they had no choice but to use the stairs.”

Context matters.

In mining data, using analytics, and reading dashboards (see DreamBox) for classrooms and schools, the setting, time, and the quality of adult-student relationships also count. For Darrell West and others who see teachers and students profiting from instantaneous feedback from computers, context is absent. They fail to consider that the age-graded school is required to do far more than stuff information into students. They fail to reckon with the age-old wisdom (and research to support it) that effective student learning beyond test scores resides in the relationship between student and teacher.

And when it comes to evaluating individual teachers on the basis of student test scores, the context of teaching–as complex an endeavor as can be imagined, one that is only partially mapped by researchers–trumps Big Data even when it is amply funded by Big Donors.

Big Data, of course, will be (and is) used by policymakers and administrators for tracking school and district performance and accountability. But the seductive lure of mining data and creating glossy dashboards will entice many educators to grab numbers to shape lessons and judge individual students and teachers. If they do succumb to the seduction without considering the complex context of teaching and learning, they risk making mistakes that will harm both teachers and students.

Filed under school reform policies

Evidence: The Case of the Common Core Standards

I have admired Rodin’s statue of “The Thinker” for many years.

Yet the statue is not a man of action.

Too much thinking, too little action is a recipe for fecklessness. Yet too much action, too little thought are ingredients for a potential disaster.*

And this is where the Common Core standards enter the picture.

Exactly how much evidence did policymakers have to justify the crafting and adoption of national standards? Of that evidence supporting the policy, what part, if any, did research play in making policy? Since evidence never speaks for itself–it has to be interpreted–these are fair questions to ask of any policy but especially one with high-stakes consequences for how teachers teach to the standards, what children and youth study in classroom lessons, and tests used to measure how much of the standards students have learned.

There have been two major justifications for Common Core standards: (1) raising academic standards across U.S. schools will grow the economy and make the nation globally competitive; (2) higher standards will improve students’ academic achievement. After parsing these reasons for the Common Core standards, I then turn to the evidence used by policymakers and practitioners and where research studies fit (or do not fit) into the policymaking process.

1. What evidence is there that common standards will increase a nation’s global competitiveness?

Answer: None. Zip. Nada.

See here and here.

2. What evidence is there that national standards will improve student achievement on domestic and international tests?

Answer: None. Zip. Nada.

See here, here and even here.

So how can a public policy that has heavy consequences for students, teachers, and public schools have an appalling lack of evidence?

The answer is in what top decision-makers consider as evidence when they determine policy. Or the answer is in the simple fact that policies get made for many reasons, only one of which may be evidence, including research studies. I take up each of these explanations.

First, what do policymakers consider to be evidence? Generally, school boards, state and federal officials, and practitioners–teachers and principals–have a broader definition of evidence than do researchers who rely upon the results of randomized control studies, rigorously conducted case studies, and carefully constructed interventions in schools (see Tseng-Social-Policy-Report-2012-1).

In an ongoing study of school boards’ decision-making, for example, Robert Asen and colleagues found that local policymakers drew from many sources for “evidence.” They relied on first-hand experiences, systematically collected data on conditions, testimony of authoritative individuals and groups, specific examples that illustrated the policy issue being discussed, and, yes, they used empirical findings culled from researchers. What constitutes evidence to school board members was a broad array of experience-produced and research-produced knowledge, some carrying more weight than others in each policymaker’s mind. Few decision-makers, however, say that research findings guided their actions (coburnhonigsteinfinal-1).

Second, what drives policymaker decisions? Many reasons propel policy, and evidence is only one of those reasons. Consider that financial and political pressures push policy without any reference to “what the research says.” When drug abuse or teenage pregnancies rise in a community, parents and politicians lobby school boards to initiate or revamp drug and sex education programs–regardless of what research studies say about the effectiveness of such programs. Or to cite another example, when a program became controversial, such as “Man: A Course of Study” in the 1970s, studies of its effectiveness were disregarded as pressure groups got school boards to dump the program.

Or consider evaluating teachers on the basis of student test scores–one of the public reasons given for Chicago teachers striking this month. U.S. Secretary of Education Arne Duncan had, at best, contested research findings a few years ago when he required performance evaluations to be included in state proposals for Race To The Top funds. Or even now. Political considerations mattered, not the amount or quality of evidence.

In an economic recession when state revenues shrink, districts cut staff and programs without checking research studies to determine which programs or staff were effective.

So local, state, and federal policymakers have a broader view of what constitutes evidence–practitioners even more so–than researchers. Research studies play a minor role, if any, in making most significant policy decisions.

Which, of course, brings me around full circle to the Common Core standards, a train carrying few research studies that has left the station on its way into the nation’s classrooms. Believe me, that train is not carrying statues of Rodin’s “The Thinker.”

____________

*Thanks to Joel Westheimer for sending me the “Thinker and Doer” cartoon

Filed under school reform policies

Yet Again: Principals as Instructional Leaders

The constant chatter that principals should be innovative and tough-minded instructional leaders, on-top-of-everything CEOs, and smooth political tacticians reminds me of a photo* sent to me by a fellow blogger in Turkey.

I have written numerous times on the DNA of principaling and how three roles–managing, instructing, and politicking–are essential to the daily work of principals. Researchers have observed elementary and secondary principals over the past century and documented time and again that most of their daily activities (at least half) are spent in administrative tasks. Managing a building, staff, children and youth, parents, central office officials, external agencies and companies doing business with the school consumes big chunks of time. And that is just to keep the place working and on course for teachers to teach and students to learn.

Principals reading the last paragraph would probably nod in agreement and could add activities that I omitted.

Of course, facts have little to do with ideology and the latest reform. For the past few decades, but especially since the federal law, No Child Left Behind, was passed, reform-minded academics and principal associations have advocated that the instructional leader is the primary role that principals have to perform if schools are to do well academically–especially in urban districts where poor performance is pervasive. The key to registering higher test scores, promoters of instructional leadership claim, is for the principal to lead teachers in designing the instructional program, coach teachers, do drop-in visits daily to classrooms, teach an occasional lesson, and evaluate how well (or poorly) teachers do over the 180 days of instruction. But as the photo of the rocket strapped to the Basset Hound says: “not everything new and shiny works.”

A recent report (Shadow Study Miami-Dade Principals) of what 65 principals did each day during one week in 2008 in Miami-Dade County (FLA) shows that even under NCLB pressures for academic achievement and the widely accepted (and constantly spouted) ideology of instructional leadership, Miami-Dade principals spend most of their day in managerial tasks that influence the climate of the school but may or may not affect daily instruction. What’s more, those principals who spend the most time on organizing and managing the instructional program have test scores and teacher and parental satisfaction results that are higher than those of principals who spend time coaching teachers and popping into classroom lessons.

The researchers shadowed these elementary and secondary principals and categorized their activities minute-by-minute through self-reports, interviews, and daily logs kept by the principals.

In the academic language of the study:

The authors find that time spent on Organization Management activities is associated with positive school outcomes, such as student test score gains and positive teacher and parent assessments of the instructional climate, whereas Day-to-Day Instruction activities are marginally or not at all related to improvements in student performance and often have a negative relationship with teacher and parent assessments. This paper suggests that a single-minded focus on principals as instructional leaders operationalized through direct contact with teachers may be detrimental if it forsakes the important role of principals as organizational leaders (p. iv).

Two things jump out of this study for me. First, the results of shadowing principals in 2008 mirror patterns in principal work that researchers have found since the 1920s although the methodologies of time-and-motion studies have changed. Second, there is an association–a correlation, by no means a cause-effect relationship–between principals spending more time managing the organization and climate of the school, rather than in direct contact with teachers in classrooms, and better school outcomes.

One study, of course, will not lower the volume or temper the rhetoric of principal-as-instructional-leader. But that study does bring into perspective that putting goggles and a rocket on a Basset Hound won’t make it fly any more than hyping the role of instructional leadership will make principals better at their jobs.

____________________

*Tony Gurr, a blogger who is an educational consultant in Ankara, Turkey, sent me a range of graphics that included this photo. No source was provided.

Filed under Reforming schools, school leaders

“Why Do Good Policy Makers Use Bad Indicators?”*

Test scores are the coin of the educational realm in the U.S. In No Child Left Behind, they are used to reward and punish districts, schools, and teachers for how well or poorly students score on state tests. In pursuit of federal dollars, the Race To The Top competition has shoved state after state into legislating that teacher evaluations include student test scores as part of judging teacher effectiveness.

Numbers glued to high-stakes consequences, however, corrupt performance. Since the mid-1970s, social scientists have documented the untoward results of attaching high stakes to quantitative indicators not only for education but also across numerous institutions. They have pointed out that those who implement policies using specific quantitative measures will change their practices to ensure better numbers.

The work of social scientist Donald T. Campbell and others about the perverse outcomes of incentives was available and known to many but went ignored. In Assessing the Impact of Planned Social Change, Campbell wrote:

“The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor” (p. 49).

Campbell drew instances of distorted behavior from police officials using clearance rates in solving crimes, the Soviets setting numerical goals for farming and industry, and the U.S. military using “body counts” in Vietnam as evidence of winning the war.

That was nearly forty years ago. In the past decade, medical researchers have found similar patterns when health insurers and Medicare have used quantitative indicators to measure physician performance. For example, Medicare requires, as a quality measure, that doctors administer antibiotics to a pneumonia patient within six hours of the patient’s arrival at the hospital. As one physician said: “The trouble is that doctors often cannot diagnose pneumonia that quickly. You have to talk to and examine the patient and wait for blood tests, chest X-rays and so on.” So what happens is that “more and more antibiotics are being used in emergency rooms today, despite all-too-evident dangers like antibiotic-resistant bacteria and antibiotic-associated infections.” He and other doctors also know that surgeons have been known to pick reasonably healthy patients for heart bypass operations and ignore elderly ones who have 3-5 chronic ailments to ensure that results look good.

More examples.

TV stations charge for advertising on the basis of how many viewers they have during “sweep” months (November, February, May, and July). The Nielsen company has boxes in two million homes (representative of the nation’s viewership) that register whether the TV is on and what families are watching during those months. They also have viewers fill out diaries. Nielsen assumes that what the station shows in those months represents programming for the entire year (see 2011-2012-Sweeps-Dates). Nope. During those “sweeps,” TV networks and cable companies program new shows, films, extravaganzas, and sports that will draw viewers so they can charge higher advertising rates. They game the system and corrupt the measure (see p. 80).

And just this week, ripped from the headlines of the daily paper, online vendors secretly ask purchasers of their products to write reviews and rate them with five stars in exchange for a kickback of the price the customer paid. Another corrupted measure.

Of course, educational researchers also have documented the link between standardized test scores and narrowed instruction to prepare students for test items, instances of state policymakers fiddling with cut-off scores on tests, increased dropouts, and straight-out cheating by a few administrators (see Dan Koretz, Measuring Up).

What Donald Campbell had said in 1976 about “highly corruptible indicators” applies not only in education but also to many different institutions.

So why do good policy makers use bad indicators? The answer is that numbers are highly prized in the culture because they are easy to grasp and use in making decisions. The simpler the number–wins/losses, products sold, profits made, test scores–the easier to judge worth. When numbers have high stakes attached to them, they then become incentives (either as a carrot or a stick) to make the numbers look good. And that is where indicators turn as bad as sour milk whose expiration date has long passed.

The best policymakers, not merely good ones, know that multiple measures for a worthy goal reduce the possibility of reporting false performance.


*Steven Glazerman and Liz Potamites, “False Performance Gains: A Critique of Successive Cohort Indicators,” Working Paper, Mathematica Policy Research, December 2011, p. 13.

Filed under Reforming schools

Bias toward Numbers in Judging Teaching

In the U.S. people (yes, I include myself here) making decisions about important issues such as buying a home, picking a school for a five-year-old or deciding on a college often give more weight to those features carrying numbers with them rather than qualitative features without numbers. Say, focusing on the square footage in the house vs. the feel of roominess. Or a teacher-student ratio in a kindergarten vs. the sense of family that children and teacher communicate. From unemployment figures to batting averages and pass interceptions to calories, numbers carry far more weight with Americans than those variables that are harder to measure. Jonah Lehrer makes this point in one of his postings.

“Buying a car is a hard decision. There are just so many variables to think about. We’ve got to inspect the interior and analyze the engine, and research the reliability of the brand. And then, once we’ve amassed all these facts, we’ve got to compare different models.

How do we sift through this excess of information? When consumers are debating car alternatives, studies show that they tend to focus on variables they can quantify, such as horsepower and fuel economy…. We do this for predictable reasons. The amount of horsepower directly reflects the output of the engine, and the engine seems like something that should matter. (Nobody wants an underpowered car.) We also don’t want to spend all our money at the gas station, which is why we get obsessed with very slight differences in miles per gallon ratings.

Furthermore, these numerical attributes are easy to compare across cars: All we have to do is glance at the digits and see which model performs the best. And so a difficult choice becomes a simple math problem.

Unfortunately, this obsession with horsepower and fuel economy turns out to be a big mistake. The explanation is simple: The variables don’t matter nearly as much as we think. Just look at horsepower: When a team of economists analyzed the features that are closely related to lifetime car satisfaction, the power of the engine was near the bottom of the list. (Fuel economy was only slightly higher.) That’s because the typical driver rarely requires 300 horses or a turbocharged V-8. Although we like to imagine ourselves as Steve McQueen, accelerating into the curves, we actually spend most of our driving time stuck in traffic, idling at an intersection on the way to the supermarket. This is why, according to surveys of car owners, the factors that are most important turn out to be things like the soundness of the car frame, the comfort of the front seats and the aesthetics of the dashboard. These variables are harder to quantify, of course. But that doesn’t mean they don’t matter.”

Switch channels from buying cars to determining teacher effectiveness. Judging teacher effectiveness now means using multifactor algorithms with quantifiable variables including test scores and observers’ ratings while avoiding qualitative judgments about teacher practices that are hard to quantify. Examples: interviewing students after a teacher has praised their effort and persistence and seeing them glow. Or listening to students who remember teachers who applauded their self-control in difficult classroom situations. Or watching students in a class struggle with a problem that the teacher gave them that had no right answer. Or seeing students honor their favorite teachers by emulating them as adults. Teacher blogger Stephen Lane makes a similar point about the lack of metrics for things that really matter.

I end with Jonah Lehrer’s example that makes the same point vividly.

“When asked by David Remnick, in a 2000 New Yorker profile, how he felt about a cramped literary interpretation of one of his novels, Roth busted out a sports analogy. He imagined going to a baseball game with a little boy for the very first time. The kid doesn’t understand what’s happening on the field, and so his dad tells him to watch the scoreboard, to keep track of all the changing numbers. When the boy gets home someone asks him if he had fun at the game:

‘It was great!’ he says. ‘The scoreboard changed thirty-two times and Daddy said last game it changed only fourteen times and the home team last time changed more times than the other team. It was really great! We had hot dogs and we stood up at one point to stretch and we went home.’”

But, of course, the boy would have missed the point of baseball.

And all the complex algorithms used in current plans to judge teacher performance too often ignore the hard-to-quantify variables that students, teachers, and parents value and remember years later.

Filed under how teachers teach

Reforming the Science Curriculum Yet Again (Part 1)

“Creating new curriculum standards for science doesn’t reform teaching and learning any more than standing in a garage makes you a car or a truck.”*

The top research body in the U.S., the National Research Council, recently released its Framework for K-12 Science Education. An 18-member committee of top scientists and educational experts drawn from the National Academy of Sciences identified key concepts, scientific practices, and ideas that every student should learn by the time they graduate from high school. It is intended as a guide for those who are now developing national Common Core Standards in science (Standards in English Language Arts and Math are already out and 44 states have already adopted both).

As I read the report, two thoughts occurred to me. First, because of overlap in the players who created the Framework and those who are working on new science Standards, the Framework is a preview of coming attractions for an intended science curriculum. I say “intended” because once states adopt the Common Core Standards in English and Math–with science next in line–the Standards, hullabaloo over a national curriculum notwithstanding, will not exactly mirror the science content teachers will teach once they close their classroom doors. Moreover, the science that students learn in those classrooms will vary from what the Standards contain and what teachers teach. Finally, what gets tested in national assessments of English, math, and science will differ from what teachers have taught and what students have learned. I elaborate on this point of policy-to-practice in this post.

The second thought I had was how familiar the Framework was to me in light of previous revisions of the science curriculum over the past century. I will discuss cycles of science curricula in Part 2.

National curriculum frameworks as an instance of policy-to-practice**

The intended (or official) curriculum is what state and district officials set forth in curricular frameworks and courses of study. Were the science Framework to be adopted in part or wholly as another Core Curriculum Standard by states and districts in upcoming years, parents and school board officials would expect teachers to teach it; further, they would assume students will learn it. These official curricula increasingly are aligned with state-approved textbooks that teachers are directed to use and state-mandated tests that teachers must administer.

But teachers, working alone in their rooms, choose what to teach and how to present it. Their choices derive from their knowledge of the subject they teach (elementary and secondary school teachers differ greatly in their knowledge of science), their experiences in teaching the content, their affection or dislike for topics, and their attitudes toward the students they face daily. In fact, researchers continually find that teachers in the same building will teach different versions of the same course. Thus, the intended curriculum and what teachers teach may overlap in the title of the course, certain key topics, and the same text, but can differ substantially in actual subject matter and daily lessons. And students, too, differ in what they learn.

The taught curriculum overlaps with but differs significantly from what students take away from class. Students pick up information and concepts from lessons. They also learn to answer teacher questions, recite, review material, locate sources, seek help, avoid teachers’ intrusiveness, and act attentive. Collateral learnings, in Dewey’s phrase, occur when children pick up ideas from classmates, copy their teachers’ habits and tics, imitate their humor or sarcasm, or strive to be as autocratic or democratic as the adults. So, the learned curriculum differs from the intended and taught curricula.

And what students learn does not exactly mirror what is in the tested curriculum. Classroom, school, district, state, and national tests, often using multiple-choice and other short-answer items, do, indeed, capture much–but hardly all–of the official and taught curricula. To the degree that teachers attend to such tests, portions of the intended and taught curricula merge. But what is tested is a limited part of what is intended by policymakers, taught by teachers, and learned by students. Since so many of these tests seek to sort high-achieving students from their lower-achieving peers, the information, ideas, and skills sought on these tests represent an even narrower band of knowledge.

The newly-published science Framework, then, much of which I expect to appear when the Core Standard in science eventually arrives, will be only the initial link in the policy-to-practice chain of intended-taught-learned-tested curricula that characterizes U.S. schooling. The additional links in that chain have to be accounted for because reforming science teaching and learning is far more complicated than standing in a garage and hoping to become a car or truck.

__________

*I made up the quote.

**Much of what follows on four different curricula (intended, taught, learned, tested) is drawn from The Hidden Variable. Citations and references are listed in the article.

Filed under school reform policies

Instead of Focusing on What Students Don’t Know, What Do They Know?

Another piece of evidence that students have not learned (or have forgotten) their science, math, and social studies made a recent splash in the media. This year it is civics. Two decades ago it was history. In 1988, Diane Ravitch and Chester Finn published What Do Our 17-Year-Olds Know in History and Literature. Their answer: not much.

This focus on how little each generation of students (and adults) knows about academic subjects has become a popular ritual–dating back to 1943–that symbolizes–no surprise here–how inadequate U.S. schools are in transmitting to the next generation knowledge, skills, and values held to be essential in a democracy.

Perhaps a better question to ask is not what do students forget or haven’t learned in school but what do students know. Stanford University’s Sam Wineburg and University of Maryland’s Chauncey Monte-Sano asked precisely that question when surveying students a few years ago. Here is the article that appeared in 2008.

Let’s begin with a brief exercise. Who are the most famous Americans in history, excluding presidents and first ladies? ….

A colleague and I recently put this question to 2,000 11th and 12th graders from all 50 states, curious to see whether they would name (as a great many educators had predicted) the likes of Paris Hilton, Britney Spears, Tupac Shakur, 50 Cent, Barry Bonds, Kanye West or any number of other hip-hop artists, celebrities or sports idols. To our surprise, the young people’s answers showed that whatever they were reading in their history classrooms, it wasn’t People magazine. Their top ten names were all bona fide historical figures.

To our even greater surprise, their answers pretty much matched those we gathered from 2,000 adults age 45 and over. From this modest exercise, we deduced that much of what we take for conventional wisdom about today’s youth might be conventional, but it is not wisdom. Maybe we’ve spent so much time ferreting out what kids don’t know that we’ve forgotten to ask what they do know.

Chauncey Monte-Sano of the University of Maryland and I designed our survey as an open-ended exercise. Rather than giving the students a list of names, we gave them a form with ten blank lines separated by a line in the middle. Part A came with these instructions: “Starting from Columbus to the present day, jot down the names of the most famous Americans in history.” There was only one ground rule—no presidents or first ladies. Part B prompted for “famous women in American history” (again, no first ladies). Thus the questionnaire was weighted toward women, though many kids erased women’s names from the first section before adding them to the second. But when we tallied our historical top ten, we counted the total number of times a name appeared, regardless of which section.
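The tallying rule described above (count a name wherever it appears, in either section) can be sketched in a few lines of Python. This is only an illustration, not the researchers' actual procedure: the function name `tally_names`, the lower-casing of names, the choice to count a name once per questionnaire, and the sample responses are all assumptions made for the example.

```python
from collections import Counter


def tally_names(questionnaires):
    """Return {name: share of questionnaires that list the name at least once}.

    Each questionnaire is a (part_a, part_b) pair of name lists.
    """
    counts = Counter()
    for part_a, part_b in questionnaires:
        # Assumption: a name written in both sections counts once per questionnaire.
        names = {n.strip().lower() for n in part_a + part_b if n.strip()}
        counts.update(names)
    total = len(questionnaires)
    # most_common() orders names from most to least frequently listed.
    return {name: count / total for name, count in counts.most_common()}


# Hypothetical responses, invented purely for illustration.
sample = [
    (["Martin Luther King Jr.", "Benjamin Franklin"], ["Rosa Parks"]),
    (["Rosa Parks", "Thomas Edison"], ["Rosa Parks", "Harriet Tubman"]),
]
shares = tally_names(sample)
```

Multiplying each share by 100 gives the kind of "percent of all lists" figure reported in the article.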

Of course a few kids clowned around, but most took the survey seriously. About an equal number of kids and adults listed Mom; from adolescent boys we learned that Jenna Jameson is the biggest star of the X-rated movie industry. But neither Mom nor Jenna was anywhere near the top. Only three people appeared on 40 percent of all questionnaires. All three were African-American.

For today’s teens, the most famous American in history is…the Rev. Dr. Martin Luther King Jr., appearing on 67 percent of all lists. Rosa Parks was close behind, at 60 percent, and third was Harriet Tubman, at 44 percent. Rounding out the top ten were Susan B. Anthony (34 percent), Benjamin Franklin (29 percent), Amelia Earhart (23 percent), Oprah Winfrey (22 percent), Marilyn Monroe (19 percent), Thomas Edison (18 percent) and Albert Einstein (16 percent). For the record, our sample matched within a few percentage points the demographics of the 2000 U.S. Census: about 70 percent of our respondents were white, 13 percent African-American, 9 percent Hispanic, 7 percent Asian-American, 1 percent Native American.

What about the gap between our supposedly unmoored youth and their historically rooted elders? There was not much of one. Eight of the top ten names were identical. (Instead of Monroe and Einstein, adults listed Betsy Ross and Henry Ford.) Among both kids and adults, neither region nor gender made much difference. Indeed, the only consistent difference was between races, and even there it was only between African-Americans and whites. Whites’ lists comprised four African-Americans and six whites; African-Americans listed nine African-American figures and one white. (The African-American students put down Susan B. Anthony, the adults Benjamin Franklin.)

Trying to take the national pulse by counting names is fraught with problems. To start, we know little about our respondents beyond a few characteristics (gender, race/ethnicity and region, plus year and place of birth for adults). When we tested our questionnaire on kids, we found that replacing “important” with “famous” made little difference, but we used “famous” with adults for the sake of consistency. Prompting for women’s names obviously inflated their total, though we are at a loss to say by how many.

But still: such qualifications cannot mist the clarity of consensus we found among Americans of different ages, regions and races. Eighty-two years after Carter G. Woodson founded Negro History Week, Martin Luther King Jr. has emerged as the most famous American in history. This may come as no surprise—after all, King is the only American whose birthday is celebrated by name as a national holiday. But who would have predicted that Rosa Parks would be the second most named figure? Or that Harriet Tubman would be third for students and ninth for adults? Or that 45 years after the Civil Rights Act was passed, the three most common names appearing on surveys in an all-white classroom in, say, Columbia Falls, Montana, would belong to African-Americans? For many of those students’ grandparents, this moment would have been unimaginable.

In the space of a few decades, African-Americans have moved from blurry figures on the margins of the national narrative to actors on its center stage. Surely multicultural education has played a role. When textbooks of the 1940s and ’50s employed the disingenuous clause “leaving aside the Negro and Indian population” to sketch the national portrait, few cried foul. Not today. Textbooks went from “scarcely mentioning” minorities and women, as a 1995 Smith College study concluded, to “containing a substantial multicultural (and feminist) component” by the mid-1980s. Scanning the shelves of a school library—or even the youth biography section at your local mega-chain bookstore—it’s hard to miss this change. Schools, of course, influence others besides students. Adults learn new history from their children’s homework.

Yet, to claim that the curriculum alone has caused these shifts would be simplistic. It wasn’t librarians, but members of Congress who voted for Rosa Parks’ body to lie in honor in the Capitol Rotunda after she died in 2005, the first woman in American history to be so honored. And it wasn’t teachers, but officials at the United States Postal Service who in 1978 made Harriet Tubman the first African-American woman to be featured on a U.S. postage stamp (and who honored her with a second stamp in 1995). Kids learn about Martin Luther King not only in school assemblies, but also when they buy a Slurpee at 7-Eleven and find free copies of the “I Have a Dream” speech by the cash register.

Harriet Tubman’s prominence on the list was something we wouldn’t have predicted, particularly among adults. By any measure, Tubman was an extraordinary person, ferrying at least 70 slaves out of Maryland and indirectly helping up to 50 more. Still, the Underground Railroad moved 70,000 to 100,000 people out of slavery, and in terms of sheer impact, lesser-known individuals played larger roles—the freeman David Ruggles and his Vigilance Committee of New York, for example, aided a thousand fugitives during the 1830s. The alleged fact that a $40,000 bounty (the equivalent of $2 million today) was offered for her capture is sheer myth, but it has been printed over and over again in state-approved books and school biographies….

It’s much easier to document the accomplishments of the only living person to appear in the top ten list. Oprah Winfrey is not just one of the richest self-made women in America. She is also a magazine publisher, life coach, philanthropist, kingmaker (think Dr. Phil), advocate for survivors of sexual abuse, school benefactor, even spiritual counselor. In a 2005 Beliefnet poll, more than a third of the respondents said she had “a more profound impact” on their spirituality than their pastor.

Some people might point to the inclusion of a TV talk-show host on our list as an indication of decline and imminent fall. I’d say that gauging Winfrey’s influence by calling her a TV host makes as much sense as sizing up Ben Franklin’s by calling him a printer. Consider the parallels: both rose from modest means to become the most identifiable Americans of their time; both became famous for serving up hearty doses of folk wisdom and common sense; both were avid readers and powerful proponents of literacy and both earned countless friends and admirers with their personal charisma.

Recently, the chairman of the National Endowment for the Humanities, Bruce Cole, worried that today’s students don’t learn the kind of history that will give them a common bond. To remedy this, he commissioned laminated posters of 40 famous works of art to hang in every American classroom, including Grant Wood’s 1931 painting “The Midnight Ride of Paul Revere.” “Call them myths if you want,” Cole said, “but unless we have them, we don’t have anything.”

He can relax. Our kids seem to be doing just fine without an emergency transfusion of laminated artwork. Myths inhabit the national consciousness the way gas molecules fill a vacuum. In a country as diverse as ours, we instinctively search for symbols—in children’s biographies, coloring contests, Disney movies—that allow us to rally around common themes and common stories, whether true, embellished or made out of whole cloth.

Perhaps our most famous national hand-wringer was Arthur Schlesinger Jr., whose 1988 Disuniting of America: Reflections on a Multicultural Society predicted our national downfall. “Left unchecked,” he wrote, the “new ethnic gospel” is a recipe for “fragmentation, resegregation and tribalization of American life.”

If, like Schlesinger (who died last year), Monte-Sano and I had focused on statements by the most extreme multiculturalists, we may have come to a similar conclusion. But that’s not what we did. Instead, we gave ordinary kids in ordinary classrooms a simple survey and compared their responses with those from the ordinary adults we found eating lunch in a Seattle pedestrian mall, shopping for crafts at a street fair in Philadelphia or waiting for a bus in Oklahoma City. What we discovered was that Americans of different ages, regions, genders and races congregated with remarkable consistency around the same small set of names. To us, this sounds more like unity than fragmentation.

The common figures who draw together Americans today look somewhat different from those of former eras. While there are still a few inventors, entrepreneurs and entertainers, the others who capture our imagination are those who acted to expand rights, alleviate misery, rectify injustice and promote freedom. That Americans young and old, in locations as distant as Columbia Falls, Montana, and Tallahassee, Florida, listed the same figures seems deeply symbolic of the story we tell ourselves about who we think we are—and perhaps who we, as Americans, aspire to become.

Filed under dilemmas of teaching

Evidence, Beliefs, and a Science of Education

Those of us who like to consider ourselves rational beings read nutrition labels on yogurt, buy Consumer Reports for car ratings, search the Internet for explanations of that ache in the upper arm, listen to experts, and then make reasoned judgments about purchases, work, education, health, and safety. We believe in the importance of scientific research to advance knowledge and subsequent technological applications; we believe in collecting and sifting evidence before we decide what to do; we prize being logical, making rational decisions based upon scientific evidence.

Sure we do.

Yet each of us knows that so much of what we do in life is not only a matter of rationality, logic, and evidence but also actions anchored in beliefs, feelings, habits, and instincts. Choosing friends. Picking a college. Deciding on a job. Voting for a president. Getting married. Having a child.

Neither wholly rational nor emotional, our decisions and actions are a combination of both. We prize new knowledge derived from hard and soft sciences and their applications to life insofar as what they can do for us individually and collectively. We listen to experts. Yet every day in so many ways we pursue our beliefs, apply our values, and follow our emotions. Nothing new here except on those occasions when rationality, science, and emotions light up policy issues that touch our daily work and life.

Consider creationism and whether it should be taught in schools alongside Darwinian evolution. Or global warming where lightning and thunder generated by dueling scientific studies have paralyzed governmental policymaking as tornadoes, hurricanes, glacial melt and rising coastal waters make headlines every month. Partisan debates over restricting or expanding stem-cell research continue. Often science and beliefs clash. Ideologies–emotionally and value-driven ideas–arouse passions and dominate policy debates and decision-making.

Similar clashes over science and beliefs have occupied educational researchers and practitioners. Ideological struggles over the purposes of tax-supported schooling and which pedagogies should dominate classrooms (see: Pathways to Reform-Start With Values) have fueled public debates and influenced policy decisions for nearly two centuries. Efforts to make education into a science, first begun during the Progressive era in the early 20th century, remain a live issue today even after the U.S. Department of Education renamed its research arm the Institute of Education Sciences in 2002.

New names, however, seldom quell doubts. Just as economists have asked themselves whether their discipline is scientific, so have educational researchers (RESEARCH-2002-Berliner). What makes it especially difficult for educational researchers and practitioners to view education as a science are the constant disputes over diverse research results, particularly when those results have little to do with what practitioners face daily.

That practitioners both relish and reject research studies (see Doyle, practicality ethic) is not news. The common teacher skepticism of research is rational in that so many studies answer questions that teachers seldom ask about their students or classroom practices. But it is also emotional in that teachers highly value, even prize, their accumulated experience and often choose those practical experiences over studies that suggest certain of their practices fail to help students.

How far should one go in being skeptical of educational research and the degree to which it is scientific?

I value research. I have asked questions investigating the history of teaching, curriculum, school reform, and technology. I have designed studies, and, using different methodologies, collected evidence and published my findings. I know that truth is elusive and that biases including mine–another way of admitting emotions enter into rational decision-making–can slant even the best designed study. Still, a careful, rigorous, and honest search for truth is essential, I believe, for improved teaching and student learning.

Then the practitioner part of me kicks in and says that so much educational research fails to ask, much less answer, puzzling questions that teachers, principals, and superintendents face daily. Instead, to get answers to these questions, hardworking professionals have to rely on their experiences and those of peers, as I had done.

I have worked in both worlds and find it tempting to agree with those studies that support my biases while rejecting those that challenge those very same biases. And when research findings are mixed, I am tempted to ignore the findings. So I am torn by conflicting evidence and emotions. In truth, what I often end up doing–the compromise I have worked out–is to rely upon my experiences in classrooms and schools while keeping an eye peeled for rigorous, high-caliber studies.

Do I believe that there is a science of education and schooling? No. The contexts, the value-driven purposes, and the emotional life of classroom interactions make schooling different from physics and biology. But I do believe that a frank awareness that both rational and emotional factors come into play in making policy, putting policy into practice, and doing research can help us figure out which scientific studies are applicable to teaching and learning.

Filed under Reforming schools

More Data Needed for Hard-To-Measure Student Learning and Teacher Quality (Guest blogger Stephen Lane)

Stephen Lane is a high school teacher with 10 years’ experience at a suburban high school outside of Boston, MA. He teaches history and economics and also coaches cross-country and track & field.

We are living in an age of the quants. In the social sciences, sports, and of course education, the number-crunchers rule. Information that can’t be reduced to a numerical essence is suspect, whether the subject is basketball or economics. In education, quantifiable data can be useful, but the current infatuation with numbers betrays muddled thinking, misallocates teaching resources, and rewards behavior we probably don’t want to see rewarded in teaching.

Now that I’m done preaching to the choir, I’d like to think about how to reverse this trend. Railing against the misapplication of numbers (in particular, standardized test results, and misguided comparisons of American students to their foreign counterparts) is important, but we also ought to marshal the tools of the quants themselves – data – to support a more nuanced look at student achievement and quality teaching.

My aunt, a former school committee chair (don’t hold it against her), and uncle, a retired principal, both avow they knew good teaching after 5 minutes of classroom observation. A former student of mine, whom we’ll call Hermione (the brightest witch of her age), is an ed consultant on value-added models (again, don’t judge). She feels confident their models can measure a teacher’s impact – given 10 years of data in relatively unchanging circumstances. Neither a 5-minute observation nor 10 years of data seems like a reasonable metric, and neither addresses the conditions underlying the performance.

Quants assume that teachers work primarily in a vacuum – that a teacher’s success or failure is primarily the product of individual ability, effort, or lack thereof. My experience differs. My growth as a teacher is primarily due to the culture within my department, and the guidance and example of my colleagues. I am not unusual: Interviews conducted with active and retired teachers in our district with at least 20 years of experience overwhelmingly identified fellow teachers as the primary influence on teachers’ development and improvement.

In a paper on compensation methods in manufacturing, economists Susan Helper, Morris Kleiner, and Yingchun Wang noted that incentivizing certain individual outcomes (e.g., production rate) tends to crowd out other worker activities, such as process and quality improvement, which require greater teamwork.[i]

Professor Helper was kind enough to respond to an email about her work, and noted:

“Individual incentive pay is … problematic when individual contributions to performance are hard to measure… this is especially the case when good performance is hard to define and/or has many competing objectives as is often the case in a lot of modern activities, teaching included. In these cases, group incentives, or even NO incentive (hourly pay) is better than individual incentives, because otherwise it’s hard to get individuals to engage in hard-to-measure activity (mentoring co-workers…), if the reward is for an easy to measure task that may be less important…”

Standardized test results are an easy measure of student achievement and teacher performance. But the trade-offs are unfortunate. Teachers are people too; people respond to incentives. If standardized tests are used to measure performance, teachers will respond accordingly: More test prep, less collegial collaboration. Over-quantification not only assumes teaching in a vacuum, it makes it more likely.

Further, teachers will set up incentives for students so that test prep will crowd out activities which hone hard-to-measure skills such as problem solving, creativity, decision-making, and teamwork. Over-quantification not only assumes that success is about attaining a certain score, it increases the likelihood that attainment of that score will be the only success.

Again, not a new argument. But I’d like to repackage the argument in the language of the quants. Can hard data make a case for the nuanced view of teacher and student success? Can it shape a message that breaks through the noise, reverses current trends, and changes the discussion? Let me ask several questions:

1 – What kinds of teamwork are significant and measurable in the teaching profession? Can we measure the degree to which teachers collaborate positively in a school?

2 – Is it possible to draw correlations between degree of teacher collaboration and student achievement? How exactly does teacher teamwork improve student performance?

3 – How can we measure student achievement on hard-to-measure activities?

Producing data that helps teachers, policymakers, and the public understand what constitutes good teaching and real student achievement is a start towards knocking down the edifice the quants have built. Their numbers are simple because they ignore complexity. If we can’t make the complexity understandable, the quants’ simplistic view of education will reign unchallenged.

Granted, larger sociopolitical trends are at work, but trends are trends. They reverse. The tide will turn eventually, but we can do more to help the process along. What kind of data can be gathered? And how can we shape it into a more compelling message?


[i] Susan Helper, Morris Kleiner, and Yingchun Wang, “Analyzing Compensation Methods in Manufacturing: Piece Rates, Time Rates, or Gain-Sharing,” NBER Working Paper Series, 2010.

Filed under how teachers teach

Paying Doctors on the Basis of Patient Outcomes

A doctor described to a colleague a long-term patient whom he had seen earlier in the week. She has diabetes that can be controlled but has failed to come into his office regularly even though he has contacted her many times. The doctor is highly ranked on quality measures that the local health insurer has laid out for evaluating and paying physicians to improve medical care and cut costs. Yet the doctor asked this colleague about his infrequently seen diabetic patient: “She just can’t afford to take that much time off from work. Does that make me a worse doctor?”

In pay-4-performance plans established by Medicare and private health insurers, the basic assumption is that, yes, what a doctor does has a direct effect on patients’ health. But what patients do or do not do is not part of that assumption. For example, many of the measures used to rank and reward doctors come from guidelines drawn from extensive research studies that are designated as “best” practices, such as patients getting regular mammograms, Pap smears, and screening for high cholesterol, diabetes, high blood pressure, and colon cancer. Health insurers want physicians to use these evidence-based practices to standardize delivery of quality care, improve the health of patients, and reduce medical costs. Health insurers, at the same time, also evaluate and pay doctors on how frequently they use these evidence-based practices. Billions of public and private dollars are now invested in evaluating doctors’ performance and paying them. As in so many businesses, the belief is that if individual incentives are distributed on the basis of performance, employees and professionals will work harder and do the right things for their customers and clients.

For those familiar with the trajectory of well-intentioned policies aimed at changing individual and institutional behaviors, unintended consequences occur as predictably as windy days in Chicago. Policymakers are stuck, however. They just don’t know which unexpected consequences to anticipate. When it comes to using money as an incentive to change behavior, though, much literature exists on what happens when results for hospitals, surgeons, and medical procedures are reported publicly. Some astute decision-makers might have figured out ways that any such policy could be gamed by individuals and institutions (see WillP4PandQualityReportingAffectHealthDisparities).

No surprise, then, that unexpected consequences have stuck thumbs in the eyes of policymakers and insurers on pay-4-performance plans. For example, Medicare requires, as a quality measure, that doctors administer antibiotics to a pneumonia patient within six hours of the patient’s arrival at the hospital. As one physician said: “The trouble is that doctors often cannot diagnose pneumonia that quickly. You have to talk to and examine the patient and wait for blood tests, chest X-rays and so on.” What’s worse, he continues, is that “more and more antibiotics are being used in emergency rooms today, despite all-too-evident dangers like antibiotic-resistant bacteria and antibiotic-associated infections.” He and other doctors also know that surgeons have cherry-picked reasonably healthy patients for heart bypass operations and ignored elderly ones who have 3-5 chronic ailments to ensure that results look good.

Medical researchers also know far more about how low-income, under-insured, and non-English-speaking patients affect results when doctors are ranked on the quality of care they render, especially if rewards or penalties follow such rankings. One study involving 125,000 patients revealed that whom doctors cared for affected their rankings in pay-4-performance plans. Those doctors who cared for older or sicker patients were ranked higher (probably because of frequent follow-up, a behavior that received higher rankings) than those doctors who cared for minority and under-insured patients who saw doctors irregularly. Which patients are cared for, then, affects rankings.

In light of emerging evidence that untoward outcomes occur when cash incentives are put into place to evaluate, rank, and reward individual doctors, pay-4-performance schemes have raised disturbing questions not only about the basic assumption that what doctors do determines the effects upon patients but also about the assumptions driving pay-4-performance plans for individual teachers based upon student test scores.

Although both doctors and teachers are in helping professions, many differences separate them from one another in their training, daily work, the scientific basis for what they do, societal respect, and accountability. Nonetheless, there are some striking similarities insofar as evaluation and pay-4-performance policies are concerned. The next post takes up these similarities.

Filed under comparing medicine and education