Category Archives: testing

“How The Other Half Learns”: A Review (Part 1)

Robert Pondiscio’s recent book about a New York City elementary school is an uncommon example of research and writing on school reform.

Why uncommon?

Few former teachers, journalists, or academic researchers have done what he did: spend a year at Bronx 1, part of the network of Success Academies in New York City that former city official Eva Moskowitz founded in 2006. Now a charter network of 47 schools in New York City enrolling 17,000 low-income children of color, Success Academies are both extolled and criticized (especially in the media). There is precious little middle ground among reformers, parents, teachers, and others when it comes to judging the network’s worth. In 2014, a few years before he became embedded in one Success Academy school, Pondiscio wrote an op-ed in a New York newspaper asking: “Is Eva Moskowitz the Michael Jordan of Education Reform, or is she the Mark McGwire?” (p. 10). One athlete was the finest basketball player of the 20th century; the other, a disgraced, steroid-fueled home run hitter. He wasn’t sure. But two years later he wanted to find out.

Surprisingly, Moskowitz agreed to let Pondiscio spend a year at Bronx 1 observing classes and teacher meetings, shadowing the principal and staff members, interviewing parents, teachers, and administrators, and meeting with children–“scholars,” as they are called–in and out of school. He also attended teacher training sessions and staff development workshops. That Pondiscio was a former teacher in a Bronx low-income, low-scoring elementary school and was affiliated with the Thomas B. Fordham Institute, an organization that boosts charter schools and whose leadership had praised her work in New York City, may have helped her make the decision.

Which brings me to another reason for the book’s uniqueness. Except for Jay Mathews’s book on the Knowledge Is Power Program (KIPP) a decade earlier (Work Hard. Be Nice.: How Two Inspired Teachers Created the Most Promising Schools in America), Pondiscio is the only journalist and former teacher to examine in depth a charter network that prides itself on high test scores, year after year outscoring affluent elementary schools in the state–a focus that drives anti-charter critics and student-centered enthusiasts to suck their thumbs.

For the past 35 years, under the unfurled umbrella of business-driven, standards-based reform, testing and test scores have been the center of attention in improving U.S. urban schools. The resulting literature on school reform has been dominated by pundits, policymakers, and passers-by who equate high test scores with teacher and school effectiveness. Test score gains have become the coin of the realm, and schools blessed with those higher scores have become darlings of donors and policymakers. Success Academies have produced annual gains, even outscoring those New York City schools with low poverty rates and mostly white enrollments.

This book, then, is an anomaly in its in-depth portrayal of children, staff, parents, and leadership in a school that by all accounts–including the nearby one in which Pondiscio once taught–should have been overwhelmed by its surroundings. Yet Bronx 1 was not. It excelled, at least as measured by test scores. Although Pondiscio makes clear that test scores are a narrow measure of student learning, he gives readers his take on why the school did excel.

In praising the uncommonness of this book, I have not forgotten academic researchers who have gone into schools. They (including doctoral students) have indeed spent time in schools using both quantitative and qualitative methods to paint pictures of schools and classrooms at one point in time. And they have written about their before-and-after studies of experimental programs in schools and deeply detailed case studies–all using the outcome measure of standardized test scores.

Few researchers who have written about schools and districts–the basic units of reform–however, have taught in similar settings and actually spent at least a year in those classrooms, schools, and districts observing, interviewing, and capturing incidents and details that make the school cultures in those places dance before a reader’s eyes.* To do so takes experience teaching in such schools and the skills of a writer to pull out the significant incidents and flesh out the main players in words that grab attention and stay fixed in readers’ minds. Not an academic researcher but an ex-teacher and journalist, Pondiscio does exactly that.**

Still, I wanted to see if other academic researchers, journalists, or others had written comparable volumes that would make How the Other Half Learns part of a tradition rather than an uncommon book. So before I sat down at my computer to write this review, I went through my library and pulled out the books that did what no pundit, policymaker, or passer-by could do. I do not claim that my library covers the entire literature of school reform in districts and schools, yet in my selective collection I found a handful of books that met the above criteria of skillful writing, spending a year or so in the setting, and getting the story published. None of the writers, however, had teaching experience. Sure, there are other studies that I missed. So be it. But these, I believe, are comparable to Pondiscio’s book.

#Former Washington Post journalist and later university sociologist Gerald Grant spent a year at a Syracuse high school working with teachers and students, recording daily activities, interviewing teachers and students, and observing classrooms and meetings. He places the high school in historical context, that is, going from a mostly white, privileged enrollment to a desegregated one with a substantial minority presence. The turmoil of the Vietnam War and court decisions expanding students’ rights shifted authority and the ways that students and teachers interacted. The World We Created at Hamilton High was published in 1988.

#Linda Perlstein, another journalist at the Washington Post, published Tested: One American School Struggles To Make the Grade in 2007. She writes of the year she spent at Tyler Heights Elementary School in Annapolis (MD). Enrolling mostly African American children from low-income homes, the school scored well on state tests and had earned a reputation for academic excellence. Perlstein describes the principal, teachers, and students over the course of a year.

#Jay Mathews, longtime Washington Post columnist on education, wrote about the two teacher founders of the KIPP schools in 2009 (see above). He observed classes, spent time in schools, and revealed to a general audience what he saw, thereby challenging the prevailing myths that surrounded these particular charter schools.

#Finally, there is a high school in San Francisco that let a journalist spend four years–yes, four years–observing classes, interviewing staff and students, and meeting with them inside and outside school. Kristina Rizga’s Mission High (2015) unravels the puzzle of a high school with low test scores year after year whose graduates nonetheless are admitted to college at a rate of four in five. Challenging the existing concentration on test scores as a proper measure of school and student achievement, Rizga’s analysis explains this disparity between low test scores and college-going graduates.

These books are the ones that I found in my library. Readers might supply their own examples of embedded journalists and researchers, some with teaching experience and some without, who have spent considerable time in classrooms and with students and parents. Each of these books, like Pondiscio’s, embeds the all-important contexts–local, state, and national–into its account. Even if my list were incomplete, when one counts up such rich examinations of schools, they are but a thimbleful of the literature on urban school reform.

Parts 2 and 3 dig into How the Other Half Learns, raising questions about the tilt that the author has toward parental choice for low-income and working-class minority parents, a single curriculum for all students, and a culture that makes extraordinary demands upon both parents and children.

___________________________________

*One example (there are probably others) is researcher Louis Smith at Washington University in St. Louis who teamed up with William Geoffrey, a seventh grade teacher in a local school. Smith the outside observer recorded what happened every day for a semester in Geoffrey’s classroom. This micro-ethnography was published in 1968 as Complexities of an Urban Classroom: An Analysis toward a General Theory of Teaching.

**In this review, I omit first-hand accounts by teachers (e.g., Dangerous Minds, Freedom Writers) and principals (e.g., Lean on Me) because they narrowly describe one classroom or one school from only the teacher’s or administrator’s view. Observing and interviewing many teachers reveals the variation that exists within a school. Detailed awareness of school and district contexts rarely appears in such books.


Filed under how teachers teach, leadership, school reform policies, testing

The New Stupid Replaces the Old Stupid (Rick Hess)

From an interview conducted in 2009 with Rick Hess, then Resident Scholar at The American Enterprise Institute. I have lightly abridged the interview. The original article upon which this interview is based is here.

Q: Rick, you recently published an article in Educational Leadership
arguing that the ways in which we rely on data to drive decisions in
schools have changed over time. Yet, you note that we have unfortunately only
succeeded in moving from the “old stupid” to the “new stupid.” What do
you mean by this?

A: A decade ago, it was only too easy to find education leaders who dismissed
student achievement data and systematic research as having only limited utility
when it came to improving schools. Today, we’ve come full circle. You can’t
spend a day at an education gathering without hearing excited claims about
“data-based decision making” and “research-based practice.” Yet these phrases
can too readily serve as convenient buzzwords that obscure more than they
clarify and that stand in for careful thought. There is too often an unfortunate
tendency to simply embrace glib solutions if they’re packaged as “data-driven.”
Today’s enthusiastic embrace of data has waltzed us directly from a petulant
resistance to performance measures to a reflexive reliance on a few simple
metrics–namely, graduation rates, expenditures, and grade three through eight
reading and math scores. The result has been a race from one troubling mindset
to another–from the “old stupid” to the “new stupid.”

Q: Can you give us an example of the “new stupid”?

A: Sure, here’s one. I was giving a presentation to a group of aspiring
superintendents. They were eager to make data-driven decisions and employ
research to serve kids. There wasn’t a shred of the old stupid in sight. I
started to grow concerned, however, when our conversation turned to value-added
assessment and teacher assignments. The group had recently read a research brief
highlighting the effect of teachers on achievement and the inequitable
distribution of teachers within districts. They were fired up and ready to put
this knowledge to use. One declared to me, to widespread agreement, “Day one,
we’re going to start identifying those high value-added teachers and moving them
to the schools that aren’t making AYP.” [AYP is an acronym from the No Child Left Behind law (2002-2015); it means “Adequate Yearly Progress” in test scores for different groups of students.]

Now, I sympathize with the premise, but the certainty worried me. I started
to ask questions: Can we be confident that teachers who are effective in their
current classrooms would be equally effective elsewhere? What effect would
shifting teachers to different schools have on the likelihood that teachers
would remain in the district? Are the measures in question good proxies for
teacher quality? My concern was not that they lacked firm answers to these
questions–that’s natural enough even for veteran superintendents–it was that
they seemingly regarded such questions as distractions.

Q: What’s a concrete example of where educators and advocates
overenthusiastically used data to tout a policy, but where the results didn’t
pan out? What went wrong?

A: Take the case of class-size reduction. For two decades, advocates of
smaller classes have referenced the findings from the Student Teacher
Achievement Ratio (STAR) project, a class-size experiment conducted in Tennessee
in the late 1980s. Researchers found significant achievement gains for students
in small kindergarten classes and additional gains in first grade. The results
were famously embraced in California, which in 1996 adopted a program to reduce
class sizes that cost nearly $800 million in its first year. But the dollars
ultimately yielded disappointing results, with the only major evaluation–by AIR
and RAND–finding no effect on achievement.

What happened? Policymakers ignored nuance and context. California encouraged
districts to place students in classes of no more than 20–but that class size
was substantially larger than those for which STAR found benefits. Moreover,
STAR was a pilot program serving a limited population, which minimized the need
for new teachers. California’s statewide effort created a voracious appetite for
new educators, diluting teacher quality and encouraging well-off districts to
strip-mine teachers from less affluent communities. The moral is that even
policies or practices informed by rigorous research can prove ineffective if the
translation is clumsy or ill considered….

Q: In your mind, what are some of the main limitations of research as
they apply to schooling?

A: First, let me be clear: Good research has an enormous contribution to
make–but, when it comes to policy, this contribution is more tentative than we
might prefer. Scholarship’s greatest value is not the ability to end policy
disputes, but to encourage more thoughtful and disciplined debate.

In particular, rigorous research can establish parameters as to how big an
effect a policy or program might have, even if it fails to conclusively answer
whether it “works.” For instance, quality research has quieted assertions that
national-board-certified teachers are likely to have heroic impacts on student
achievement or that Teach For America recruits might adversely affect their
students.

Especially when crafting policy, we should not expect research to dictate
outcomes but should instead ensure that decisions are informed by the facts and
insights that science can provide. Education leaders should not expect research
to ultimately resolve thorny policy disputes over school choice or teacher pay
any more than medical research has ended contentious debates over health
insurance or tort reform….

Q: What do you see as the main motivation behind the “new stupid”? Is
it simply an example of good intentions gone awry?

A: In a word: yes. It’s a strategy pursued with the best of intentions. But
the problem is threefold. First, as we’ve discussed, too many times those of us
in K-12 are unsophisticated about what a particular study or a particular data
set can tell us. Second, the very passion that infuses the K-12 sector creates a
sense of urgency. People want to fix problems now, using whatever tools are at
hand–and don’t always stop to realize when they’re trying to fix a Swiss watch
with a sledgehammer. Third, the reality is that we still don’t have the kinds of
data and research that we need. So, too often, the choice is to misapply extant
data or simply go data-free. Everyone involved means well; the trick is to
provide the right training and the right data, and for practitioners,
policymakers, and reformers to ensure that compassion doesn’t swamp common sense.

 


Filed under Reforming schools, school reform policies, testing

Can Superintendents Raise Test Scores?

I first asked this question in a post published over six years ago. I have updated and revised that post because the answer is popularly and resoundingly “yes,” although the evidence is squirmy. I revisit both the question and answer.

 

After Atlanta (GA) school administrators and teachers went to trial and were convicted and sentenced to jail for cheating–and before that, an El Paso (TX) superintendent was convicted of the same charge and imprisoned–the generally accepted idea that district superintendents can pump up student achievement has taken a serious hit. Cheating scandals across the country have turned the belief in superintendents raising test scores into something tawdry.

For decades, many superintendents have been touted as earnest instructional leaders, expert managers, and superb politicians who can mobilize communities and teacher corps to improve schools and show gains in students’ test scores. From Arlene Ackerman in Philadelphia to Joel Klein in New York City to Kaya Henderson in Washington, D.C., big city superintendents are at the top rung of those who can turn around failing districts.

Surely the Atlanta cheating scandal and others around the country have tarnished the image of dynamic superintendents taking urban schools from being in dumpsters to $1 million Broad Prize winners. A tainted image, however, will not weaken the Velcro belief that smart district superintendents will lead districts to higher student achievement. Just look at contracts that school boards and mayors sign with new superintendents. Contract clauses call for student test scores, graduation rates, and other academic measures to increase during the school chief’s tenure (see here and here).

Then along comes a study that asks whether superintendents are “vital or irrelevant.” Drawing on state student achievement data from North Carolina and Florida for the years 1998-2009, researchers sought to find out how much of a relationship existed between the arrival of new superintendents, how long they served, and student achievement in districts (see PDF SuperintendentsBrown Center9314).

Here is what the researchers found:

  1. School district superintendent is largely a short-term job. The typical superintendent has been in the job for three to four years.
  2. Student achievement does not improve with longevity of superintendent service within their districts.
  3. Hiring a new superintendent is not associated with higher student achievement.
  4. Superintendents account for a small fraction of a percent (0.3 percent) of student differences in achievement. This effect, while statistically significant, is orders of magnitude smaller than that associated with any other major component of the education system, including: measured and unmeasured student characteristics; teachers; schools; and districts.
  5. Individual superintendents who have an exceptional impact on student achievement cannot be reliably identified.
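Finding 4’s “small fraction of a percent” can be made concrete with a back-of-the-envelope conversion from a variance share to an effect size. This is my own illustrative sketch, not the researchers’ calculation; it assumes the superintendent contribution can be treated as an independent, roughly normal component of total score variance.

```python
import math

# Finding 4 above: superintendents account for about 0.3 percent of the
# variance in student achievement. If that share is treated as an
# independent component of total variance, the implied spread of
# superintendent effects is the square root of the share, measured in
# student-level standard deviations.
variance_share = 0.003                 # 0.3 percent, as a proportion
effect_sd = math.sqrt(variance_share)  # spread of superintendent effects

print(f"Implied spread of superintendent effects: {effect_sd:.3f} "
      f"student-level standard deviations")
```

On this rough arithmetic, even a superintendent a full standard deviation above average would shift typical student scores by only about 0.05 standard deviations, which is why the researchers call the effect small despite its statistical significance.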

Results, of course, come from only one study and must be handled with care. The familiar cautions about the limits of the data and methodology apply. What is remarkable, however, is that the iron-clad belief–held by the American Association of School Administrators, school boards, and superintendents themselves–that superintendents make a difference in student outcomes has seldom undergone careful scrutiny. Yes, the above study is correlational. It does not open the black box of exactly how what superintendents do improves student achievement.

Ask superintendents how they get scores or graduation rates to go up. The question is often answered with a wink or a shrug of the shoulders. Among most researchers and administrators who write and grapple with this question of whether superintendents can improve test scores, there is no explicit model of effectiveness. That is correct: there is no theory of change, no theory of action.

How exactly does a school chief who is completely dependent on an elected school board, district office staff, a cadre of principals whom he or she may see monthly, and teachers who shut their doors once class begins–raise test scores, decrease dropouts, and increase college attendance? Without some theory by which a superintendent can be shown to have causal effects, test scores going up or down remain a mystery or a matter of luck that the results occurred during that school chief’s tenure (I exclude cheating episodes where superintendents have been directly involved because they have been rare).

Many school chiefs, of course, believe–a belief is a covert theory–that they can improve student achievement. They hold dear the Rambo model of superintending: strong leader + clear reform plan + swift reorganization + urgent mandates + crisp incentives and penalties = desired student outcomes. Think former New York City Chancellor Joel Klein, ex-Miami-Dade Superintendent Rudy Crew, an ex-Chancellor of Washington, D.C., and ex-school chief Alan Bersin in San Diego. Don’t forget John Deasy in Los Angeles Unified School District. And now, Pedro Martinez in San Antonio Independent School District.

There are, of course, other less heroic models or theories of action that mirror more accurately the complex, entangled world of moving school board policy to classroom practice. One model, for example, depicts stable, ongoing, indirect influence where superintendents slowly shape a district culture of improvement, work on curriculum and instruction, ensure that principals run schools consistent with district goals, support and prod teachers to take on new classroom challenges, and communicate often with parents about what’s happening. Think ex-superintendents Carl Cohn in Long Beach (CA), Tom Payzant in Boston (MA), and Laura Schwalm in Garden Grove (CA). Such an indirect approach is less heroic, takes a decade or more, and ratchets down the expectation that superintendents be Supermen or Wonder Women.

Whether school chiefs or their boards have a Rambo model, one of indirect influences, or other models, some theory exists to explain how they go about improving student performance. Without some compelling explanation for how they influence district office administrators, principals, teachers, and students to perform better than they have, most school chiefs have to figure out their own personal cause-effect model, rely upon chance, or even in those rare occasions, cheat.

What is needed is a crisp GPS navigation system imprinted in school board members’ and superintendents’ heads that contains the following:

*A map of the political, managerial, and instructional roles superintendents perform, public schools’ competing purposes, and the constant political responsiveness of school boards to constituencies that inevitably create persistent conflicts.

*A clear cause-effect model of how superintendents directly influence principals and teachers and how they, in turn, influence students to do better–as in creating incentives and sanctions and a culture of trust that encourages both risk-taking and a willingness to learn.

*A practical and public definition of what constitutes success for school boards, superintendents, principals, teachers, and students beyond standardized test scores, higher graduation rates, and college admissions.

Such a navigation system and map are steps in the right direction of answering the question of whether superintendents can raise test scores.


Filed under leadership, testing

What Makes a Great School? (Jack Schneider)

Jack Schneider is an Assistant Professor of Education at the University of Massachusetts, Lowell. He is  “a historian and policy analyst who studies the influence of politics, rhetoric, culture, and information in shaping attitudes and behaviors. His research examines how educators, policymakers, and the public develop particular views about what is true, what is effective, and what is important. Drawing on a diverse mix of methodological approaches, he has written about measurement and accountability, segregation and school choice, teacher preparation and pedagogy, and the relationship between research and practice. His current work, on how school quality is conceptualized and quantified, has been supported by the Spencer Foundation and the Massachusetts State Legislature.

“The author of three books, Schneider is a regular contributor to “The Washington Post” and “The Atlantic” and co-hosts the education policy podcast “Have You Heard.” He also serves as the Director of Research for the Massachusetts Consortium for Innovative Education Assessment.”

This piece appeared October 23, 2017.

 

What are the signs that a school is succeeding?

Try asking someone. Chances are, they’ll say something about the impact a school makes on the young people who attend it. Do students feel safe and cared for? Are they being challenged? Do they have opportunities to play and create? Are they happy?

If you’re a parent, getting this kind of information entails a great deal of effort — walking the hallways, looking in on classrooms, talking with teachers and students, chatting with parents, and watching kids interact on the playground.

Since most of us don’t have the time or the wherewithal to run our own school-quality reconnaissance missions, we rely on rumor and anecdote, hunches and heuristics, and, increasingly, the Internet.

So what’s out there on the web? Are our pressing questions about schools being answered by crowdsourced knowledge and big data sets?

As it turns out, no.

There’s information, certainly. But mostly it doesn’t align with what we really want to know about how schools are doing. Instead, most of what we learn about schools online — on the websites of magazines, on school rating sites, and even on real estate listings — comes from student standardized test scores. Some may include demographic information or class size ratios. But the ratings are derived primarily from state-mandated high stakes tests.

The first problem with this state of affairs is that test scores don’t tell us a tremendous amount about what students are learning in school. As research has demonstrated, school factors explain only about 20 percent of achievement scores — about one-third of what student and family background characteristics explain. Consequently, test scores often indicate much more about demography than about schools.

Even if scores did reflect what students were learning in school, they’d still fail to address the full range of what schools actually do. Multiple-choice tests communicate nothing about school climate, student engagement, the development of citizenship skills, student social and emotional health, or critical thinking. School quality is multidimensional. And just because a school is strong in one area does not mean that it is equally strong in another. In fact, my research team has found that high standardized test score growth can be correlated with low levels of student engagement. Standardized tests, in short, tell us very little about what we actually value in schools.

One consequence of such limited and distorting data is an impoverished public conversation about school quality. We talk about schools as if they are uniformly good or bad, as if we have complete knowledge of them, and as if there is agreement about the practices and outcomes of most value.

Another consequence is that we can make unenlightened decisions about where to live and send our children to school. Schools with more affluent student bodies tend to produce high test scores. Perceived as “good,” they become the objects of desire for well-resourced and quality-conscious parents. Conversely, schools with more diverse student bodies are dismissed as bad.

GreatSchools.org gives my daughter’s school — a highly diverse K–8 school — a 6 on its 10-point scale. The state of Massachusetts labels it a “Level 2” school in its five-tier test score-based accountability system. SchoolDigger.com rates it 456th out of 927 Massachusetts elementary schools.

How does that align with reality? My daughter is excited to go to school each day and is strongly attached to her current and former teachers. A second-grader, she reads a book a week, loves math, and increasingly self-identifies as an artist and a scientist. She trusts her classmates and hugs her principal when she sees him. She is often breathlessly excited about gym. None of this is currently measured by those purporting to gauge school quality.

Of course, I’m a professor of education and my wife is a teacher. Our daughter is predisposed to like school. So what might be said objectively about the school as a whole? Over the past two years, suspensions have declined to one-fifth of the previous figure, thanks in part to a restorative justice program and an emphasis on positive school culture. The school has adopted a mindfulness program that helps students cope with stress and develop the skill of self-reflection. A new maker space is being used to bring hands-on science, technology, engineering, and math into classrooms. The school’s drama club, offered free after school twice a week, now has almost 100 students involved.

The inventory of achievements that don’t count is almost too long to list.

So if the information we want about schools is too hard to get, and the information we have is often misleading, what’s a parent to do?

Four years ago, my research team set out to build a more holistic measure of school quality. Beginning first in the city of Somerville, Massachusetts, and then expanding to become a statewide initiative — the Massachusetts Consortium for Innovative Education Assessment — we asked stakeholders what they actually care about in K–12 education. The result is a clear, organized, and comprehensive framework for school quality that establishes common ground for richer discussions and recognizes the multi-dimensionality of schools.

Only after establishing shared values did we seek out measurement tools. Our aim, after all, was to begin measuring what we value, rather than to place new values on what is already measured.

For some components of the framework, we turned to districts, which often gather much more information than ends up being reported. For many other components, we employed carefully designed surveys of students and teachers — the people who know schools best. And though we currently include test score growth, we are moving away from multiple-choice tests and toward curriculum-embedded performance assessments designed and rated by educators rather than by machines.

Better measures aren’t a panacea. Segregation by race and income continues to menace our public schools, as does inequitable allocation of resources. More accurate and comprehensive data systems won’t wash those afflictions away. But so much might be accomplished if we had a shared understanding of what we want our schools to do, clear and common language for articulating our aims, and more honest metrics for tracking our progress.

 


Filed under Reforming schools, school leaders, testing

Principals And Test Scores

I read a recent blog from two researchers who assert that principals can improve students’ test scores. The researchers cite studies that support their claim (see below). These researchers received a large grant from the Wallace Foundation to alter their principal preparation program to turn out principals who can, indeed, raise students’ academic achievement.

I was intrigued by this post because as a district superintendent I believed the same thing and urged the 35 elementary and secondary principals I supervised—we met face-to-face twice a year to go over their annual goals and outcomes, and I spent a morning or afternoon at each school at least once a year—to be instructional leaders and thereby raise test scores. Over the course of seven years, however, I saw how complex the process of leading a school is, the variation in principals’ performance, and the multiple roles that principals play in their schools to engineer gains on state tests (see here and here). And I began to see clearly what a principal can and cannot do. Those memories came back to me as I read this post.

First the key parts of the post:

A commonly cited statistic in education leadership circles is that 25 percent of a school’s impact on student achievement can be explained by the principal, which is encouraging for those of us who work in principal preparation, and intuitive to the many educators who’ve experienced the power of an effective leader. It lacks nuance, however, and has gotten us thinking about the state of education-leadership research—what do we know with confidence, what do we have good intuitions (but insufficient evidence) about, and what are we completely in the dark on? ….

Quantifying a school leader’s impact is analytically challenging. How should principal effects be separated from teacher effects, for instance? Some teachers are high-performing, regardless of who leads their school, but effective principals hire the right people into the right grade levels and offer them the right supports to propel them to success.

Another issue relates to timing: Is the impact of great principals observed right away, or does it take several years for principals to grapple with the legacy they’ve inherited—the teaching faculty, the school facilities, the curriculum and textbooks, historical budget priorities, and so on? Furthermore, what’s the right comparison group to determine a principal’s unique impact? It seems crucial to account for differences in school and neighborhood environments—such as by comparing different principals who led the same school at different time points—but if there hasn’t been principal turnover in a long time, and there aren’t similar schools against which to make a comparison, this approach hits a wall.

Grissom, Kalogrides, and Loeb carefully document the trade-offs inherent in the many approaches to calculating a principal’s impact, concluding that the window of potential effect sizes ranges from .03 to .18 standard deviations. That work mirrors the conclusions of Branch, Hanushek, and Rivkin, who estimate that principal impacts range from .05 to .21 standard deviations (in other words, four to 16 percentile points in student achievement).

Our best estimates of principal impacts, therefore, are either really small or really large, depending on the model chosen. The takeaway? Yes, principals matter—but we still have a long way to go before we can confidently quantify just how much.
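The standard-deviation-to-percentile conversion mentioned in the quoted post can be sanity-checked with the standard normal CDF: an effect of d standard deviations moves a student at the 50th percentile up to the Φ(d) percentile. The sketch below is my own illustration, not from either cited study; note that this median-student convention yields smaller figures than the post’s “four to 16 percentile points,” which evidently rests on a different comparison.

```python
from math import erf, sqrt

def percentile_gain(d: float) -> float:
    """Percentile points gained by a median (50th-percentile) student
    whose score rises by d standard deviations, assuming normally
    distributed scores."""
    phi = 0.5 * (1 + erf(d / sqrt(2)))  # standard normal CDF at d
    return phi * 100 - 50

# The Branch, Hanushek, and Rivkin window of principal effect sizes
for d in (0.05, 0.21):
    print(f"d = {d:.2f} SD -> about {percentile_gain(d):.1f} percentile points")
```

Under this convention, the low end of the window moves a median student about 2 percentile points and the high end about 8, which illustrates how much the headline number depends on the model and the comparison chosen.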

I thoroughly agree with the researchers’ last sentence. But I did have problems with some of their assertions and the two studies listed to support them.

*That principals are responsible for 25 percent of student gains on test scores (teachers, the report says, account for an additional 33 percent of those gains). I traced back the source they cited and found these statements:

A 2009 study by New Leaders for New Schools found that more than half of a school’s impact on student gains can be attributed to both principal and teacher effectiveness – with principals accounting for 25 percent and teachers 33 percent of the effect.

The report noted that schools making significant progress are often led by a principal whose role has been radically re-imagined. Not only is the principal attuned to classroom learning, but he or she is also able to create a climate of hard work and success while managing the vital human-capital pipeline.

These researchers do cite studies that support their points about principals and student achievement, but I could not find the exact study behind the claim that principals account for 25 percent of gains in student test scores. Moreover, they omit studies showing that university programs preparing principals have made a difference in their graduates’ raising of student test scores (see here).

I applaud these researchers for their efforts to improve the university training that principals receive, but there remains a huge “black box” of unknowns about how principals account for improved student achievement. Opening that “black box” has been attempted in various studies that Jane David and I looked at a few years ago in Cutting through the Hype.

The research we reviewed on stable gains in test scores across many different approaches to school improvement clearly points to the principal as the catalyst for instructional improvement. But being a catalyst does not identify which specific actions influence what teachers do or how those actions translate into improvements in teaching and student achievement.

Researchers find that what matters most is the context or climate in which an action occurs. For example, classroom visits, often called “walk-throughs,” are a popular vehicle for principals to observe what teachers are doing. Principals might walk into classrooms with a required checklist designed by the district and check off items, an approach likely to misfire. Or the principal might have a short list of expected classroom practices created or adopted in collaboration with teachers in the context of specific school goals for achievement. The latter signals a context characterized by collaboration and trust, within which an action by the principal is more likely to be influential than in a context of mistrust and fear.

So research does not point to specific sure-fire actions that instructional leaders can take to change teacher behavior and student learning. Instead, what’s clear from studies of schools that do improve is that a cluster of factors account for the change.

Over the past forty years, factors associated with raising a school’s academic profile include: teachers’ consistent focus on academic standards and frequent assessment of student learning, a serious school-wide climate toward learning, district support, and parental participation. Recent research also points to the importance of mobilizing teachers and the community to move in the same direction, building trust among all the players, and especially creating working conditions that support teacher collaboration and professional development.

In short, a principal’s instructional leadership combines both direct actions, such as observing and evaluating teachers, and indirect actions, such as creating school conditions that foster improvements in teaching and learning. How principals do this varies from school to school–particularly between elementary and secondary schools, given their considerable differences in size, teacher preparation, daily schedule, and students’ plans for their future. Yes, keeping their eyes on instruction can contribute to stronger instruction and, yes, even to higher test scores. But close monitoring of instruction can only contribute to, not ensure, such improvement.

Moreover, learning to carry out this role as well as all the other duties of the job takes time and experience. Both of these are in short supply, especially in urban districts where principal turnover rates are high.

I am sure these university researchers are familiar with this literature. I wish them well in their efforts to pin down what principals do that accounts for test score improvement and to incorporate that knowledge into a program that shapes what their graduates do as principals in the schools they lead.


9 Comments

Filed under school leaders, testing

A Story about District Test Scores

This story is not about current classrooms and schools. Nor is it about coercive accountability, unrealistic curriculum standards, or the narrowness of highly prized tests in judging district quality. This story takes place well before Race to the Top, Adequate Yearly Progress, and “growth scores” entered educators’ vocabulary.

The story is about a district that, over 40 years ago, scored one point above comparable districts on a single test, and what occurred as a result. There are two lessons buried in this story–yes, here’s the spoiler. First, public perception of standardized test scores as a marker of “success” in schooling has a long history of being far more powerful than observers have believed; second, the importance of students scoring well on key tests predates A Nation at Risk (1983), the Comprehensive School Reform Act (1998), and No Child Left Behind (2002).

 

I was superintendent of the Arlington (VA) public schools from 1974 to 1981. In 1979 something happened that both startled me and gave me insight into the public power of test scores. The larger lesson, however, came years after I left the superintendency, when I began to understand the potent drive that everyone has to explain something, anything, by supplying a cause, any cause, just to make sense of what occurred.

In Arlington then, the school board and I were responsible for a district that had declined in population (from 20,000 students to 15,000) and had become increasingly minority (from 15 percent to 30 percent). The public sense that the district was in free-fall, we felt, could be arrested by concentrating on academic achievement, critical thinking, expanding the humanities, and improved teaching. After five years, both the board and I felt we were making progress.

State test scores–the coin of the realm in Arlington–climbed consistently each year at the elementary level. The bar charts I presented at press conferences looked like a stairway to the stars and thrilled school board members. When scores were published in local papers, I would admonish the school board to keep in mind that these scores captured only a very narrow part of what occurred daily in district schools. Moreover, while scores were helpful in identifying problems, they were severely inadequate for assessing individual students and teachers. My admonitions were generally swept aside, gleefully I might add, when scores rose and were printed school-by-school in newspapers. This hunger for numbers left me deeply skeptical about standardized test scores as signs of district effectiveness.

Then along came a Washington Post article in 1979 showing that Arlington had edged out Fairfax County, an adjacent and far larger district, for the highest Scholastic Aptitude Test (SAT) scores among eight districts in the metropolitan area (yeah, I know it was by one point, but when test scores determine winners and losers, as in horse races, Arlington had won by a nose).

I knew that SAT results had nothing whatsoever to do with how our schools performed. The SAT was a national standardized instrument built to predict the college performance of individual students; it was not constructed to assess district effectiveness. I also knew that the test had little to do with what Arlington teachers taught. I said so publicly to the school board and to anyone else who asked about the SATs. Few listened.

Nonetheless, the Post article with its box score of test results produced more personal praise, more testimonials to my effectiveness as a superintendent, and, I believe, more acceptance of the school board’s policies than any single act during the seven years I served. People saw the actions of the Arlington school board and superintendent as having caused those SAT scores to outstrip those of other Washington-area districts.

The lessons I learned in 1979 are, first, that public perceptions of high-value markers of “quality,” in this instance test scores, shape the concrete realities that policymakers such as a school board and superintendent face in making budgetary, curricular, and organizational decisions. Second, as a historian of education I learned that using test scores to judge a district’s “success” began in the late 1960s, when newspapers started publishing district and school-by-school test scores, pre-dating by decades the surge of such reporting in the 1980s and 1990s.

This story and its lessons I have never forgotten.

 

5 Comments

Filed under leadership, testing

Don’t Grade Schools on Grit (Angela Duckworth)

Angela Duckworth is the founder and scientific director of the Character Lab, a professor of psychology at the University of Pennsylvania and the author of the forthcoming book “Grit: The Power of Passion and Perseverance.” This op-ed appeared in the New York Times, March 26, 2016.

 

The Rev. Dr. Martin Luther King Jr. once observed, “Intelligence plus character — that is the goal of true education.”

Evidence has now accumulated in support of King’s proposition: Attributes like self-control predict children’s success in school and beyond. Over the past few years, I’ve seen a groundswell of popular interest in character development.

As a social scientist researching the importance of character, I was heartened. It seemed that the narrow focus on standardized achievement test scores from the years I taught in public schools was giving way to a broader, more enlightened perspective.

These days, however, I worry I’ve contributed, inadvertently, to an idea I vigorously oppose: high-stakes character assessment. New federal legislation can be interpreted as encouraging states and schools to incorporate measures of character into their accountability systems. This year, nine California school districts will begin doing this.

Here’s how it all started. A decade ago, in my final year of graduate school, I met two educators, Dave Levin, of the KIPP charter school network, and Dominic Randolph, of Riverdale Country School. Though they served students at opposite ends of the socioeconomic spectrum, both understood the importance of character development. They came to me because they wanted to provide feedback to kids on character strengths. Feedback is fundamental, they reasoned, because it’s hard to improve what you can’t measure.

This wasn’t entirely a new idea. Students have long received grades for behavior-related categories like citizenship or conduct. But an omnibus rating implies that character is singular when, in fact, it is plural.

In data collected on thousands of students from district, charter and independent schools, I’ve identified three correlated but distinct clusters of character strengths. One includes strengths like grit, self-control and optimism. They help you achieve your goals. The second includes social intelligence and gratitude; these strengths help you relate to, and help, other people. The third includes curiosity, open-mindedness and zest for learning, which enable independent thinking.

Still, separating character into specific strengths doesn’t go far enough. As a teacher, I had a habit of entreating students to “use some self-control, please!” Such abstract exhortations rarely worked. My students didn’t know what, specifically, I wanted them to do.

In designing what we called a Character Growth Card — a simple questionnaire that generates numeric scores for character strengths in a given marking period — Mr. Levin, Mr. Randolph and I hoped to provide students with feedback that pinpointed specific behaviors.

For instance, the character strength of self-control is assessed by questions about whether students “came to class prepared” and “allowed others to speak without interrupting”; gratitude, by items like “did something nice for someone else as a way of saying thank you.” The frequency of these observed behaviors is estimated using a seven-point scale from “almost never” to “almost always.”

Most students and parents said this feedback was useful. But it was still falling short. Getting feedback is one thing, and listening to it is another.

To encourage self-reflection, we asked students to rate themselves. Thinking you’re “almost always” paying attention but seeing that your teachers say this happens only “sometimes” was often the wake-up call students needed.

This model still has many shortcomings. Some teachers say students would benefit from more frequent feedback. Others have suggested that scores should be replaced by written narratives. Most important, we’ve discovered that feedback is insufficient. If a student struggles with “demonstrating respect for the feelings of others,” for example, raising awareness of this problem isn’t enough. That student needs strategies for what to do differently. His teachers and parents also need guidance in how to help him.

Scientists and educators are working together to discover more effective ways of cultivating character. For example, research has shown that we can teach children the self-control strategy of setting goals and making plans, with measurable benefits for academic achievement. It’s also possible to help children manage their emotions and to develop a “growth mind-set” about learning (that is, believing that their abilities are malleable rather than fixed).

This is exciting progress. A 2011 meta-analysis of more than 200 school-based programs found that teaching social and emotional skills can improve behavior and raise academic achievement, strong evidence that school is an important arena for the development of character.

But we’re nowhere near ready — and perhaps never will be — to use feedback on character as a metric for judging the effectiveness of teachers and schools. We shouldn’t be rewarding or punishing schools for how students perform on these measures.

My concerns stem from intimate acquaintance with the limitations of the measures themselves.

One problem is reference bias: A judgment about whether you “came to class prepared” depends on your frame of reference. If you consider being prepared arriving before the bell rings, with your notebook open, last night’s homework complete, and your full attention turned toward the day’s lesson, you might rate yourself lower than a less prepared student with more lax standards.

For instance, in a study of self-reported conscientiousness in 56 countries, it was the Japanese, Chinese and Korean respondents who rated themselves lowest. The authors of the study speculated that this reflected differences in cultural norms, rather than in actual behavior.

Comparisons between American schools often produce similarly paradoxical findings. In a study colleagues and I published last year, we found that eighth graders at high-performing charter schools gave themselves lower scores on conscientiousness, self-control and grit than their counterparts at district schools. This was perhaps because students at these charter schools held themselves to higher standards.

I also worry that tying external rewards and punishments to character assessment will create incentives for cheating. Policy makers who assume that giving educators and students more reasons to care about character can be only a good thing should take heed of research suggesting that extrinsic motivation can, in fact, displace intrinsic motivation. While carrots and sticks can bring about short-term changes in behavior, they often undermine interest in and responsibility for the behavior itself.

A couple of weeks ago, a colleague told me that she’d heard from a teacher in one of the California school districts adopting the new character test. The teacher was unsettled that questionnaires her students filled out about their grit and growth mind-set would contribute to an evaluation of her school’s quality. I felt queasy. This was not at all my intent, and this is not at all a good idea.

Does character matter, and can character be developed? Science and experience unequivocally say yes. Can the practice of giving feedback to students on character be improved? Absolutely. Can scientists and educators work together to cultivate students’ character? Without question.

Should we turn measures of character intended for research and self-discovery into high-stakes metrics for accountability? In my view, no.

17 Comments

Filed under testing