As 2013 closed out, the education world was roiled by yet another controversy over the calculation and interpretation of statistical data used to govern teachers and school services.
This controversy, coming to us from the nation’s capital, involved, according to the report in The Washington Post, “Faulty calculations of the ‘value’ that D.C. teachers added to student achievement in the last school year.”
“The evaluation errors,” noted reporter Nick Anderson, “underscore the high stakes of a teacher evaluation system that relies in part on standardized test scores to quantify the value a given teacher adds to the classroom.”
This controversy falls into a long line of previous ones stretched across the year. Now that the results from tests are being used to judge just about anything having to do with education, debates over education policy have become an endless back-and-forth over whether the data are reliable and what, if anything, they reveal.
Whether it’s “white suburban moms” disputing their children’s standardized test results or pundits parsing the meaning of PISA, the nation has descended into a heated crossfire over the impact and relevance of education statistics brandished by “reform” advocates.
While these arguments rage over the relevancy of test scores in policy making, some are now questioning, to use the operative phrase in Anderson’s sentence above, whether it’s even possible or preferable “to quantify the value” in education.
The whole idea that teaching and learning is a pursuit that can be expressed and judged by numbers and rankings, which seems to be a foregone conclusion to policy makers and economists, is increasingly an unsettled matter to most Americans. What they see instead, more and more, looks like a nation turning its back on the well-being of students – especially those who are most in need.
The Impact Of IMPACT?
The reported problems with D.C.’s teacher evaluation system are just the latest example of the problems that occur when test data become a source for policy direction.
The mistake affected 44 teachers, or about 10 percent of the faculty to whom the calculations apply. But the overall effect is far more significant when taking into account the number of students linked to each teacher.
Further, any report of flaws with the teacher evaluations in D.C. is apt to reverberate across the country. The district’s system, known as IMPACT, was created under the administration of Michelle Rhee and has been touted by education advocates aligned with Rhee as a model for the nation.
As the Post’s Valerie Strauss, who also reported on the IMPACT controversy, noted, “Such evaluation has become a central part of modern school reform … In some places around the country, teachers received evaluations based on test scores of students they never had.”
The Truth Behind TUDA?
The reported problems with IMPACT came on the heels of yet another statistical data dump from the week before.
That statistical disgorge is known as the Trial Urban District Assessment, or TUDA, which analyzed the performance of students in some cities with populations of 250,000 or more who took part in the National Assessment of Educational Progress.
The education reporter for The Huffington Post, Joy Resmovits, noted, “Washington, D.C. – a standard bearer for what’s known as the education reform movement since former school chancellor Michelle Rhee’s tumultuous tenure at D.C. Public Schools – was the only city to show score increases in both grades in both subjects since 2011.”
So Michelle Rhee’s organization, StudentsFirst, immediately issued a press release claiming D.C. schools as one of the “bright spots” that show “what we can learn” from TUDA. First among the lessons was, you guessed it, IMPACT.
Of course, it’s entirely unclear how students analyzed by TUDA – just fourth and eighth graders in two subjects – were in any way affected by IMPACT. Other explanations for D.C.’s superior results seem equally if not more plausible.
For instance, Randi Weingarten, president of the American Federation of Teachers, pointed to changes in early childhood education and the city’s demographics as factors. “This is the first group of 4th graders that actually had pre-kindergarten. So what this is saying to us is that all-day kindergarten and prekindergarten is one of the most important investments.” And the city is “becoming more and more middle class.”
Meanwhile, as Resmovits noted in her article, “Statisticians warn against citing these gains as evidence of efficacy or inadequacy in debates about particular school reforms. ‘It’s not a causal model,’ said Mark Schneider, a vice president at the American Institutes for Research, who used to oversee the Education Department’s research arm. ‘I get very leery when people say that “This shows that X happened.”’”
Nevertheless, there seems to be little hesitancy to jump into these statistical supposition games and then use them to craft whole policies for our children.
Perhaps no assessment data draws more media attention and generates more causal explanations derived from test results than the Program for International Student Assessment, or PISA.
This year’s PISA results were no exception as Secretary of Education Arne Duncan staged PISA Day, a media event that spent most of five hours arguing that the scores were reasons to get behind his pet policies. And Michelle Rhee took to the pages of Time magazine to use the PISA scores as an opportunity to claim the countries that are excelling academically are doing similar things to what she espouses.
As Rutgers professor Bruce Baker explained at his blog, the primary use of PISA data in the public policy discourse is “to ram through ill-conceived, destructive policies.”
Baker – whose edu-stat crunching has been compared to “Nate Silver’s influential and statistically nuanced election forecast blog posts” – concluded about PISA, “Except for showing that economic conditions matter … simple rankings of countries by their PISA scores aren’t particularly insightful.”
“Nothin’ brings out good ol’ American statistical ineptitude like the release of NAEP or PISA data,” Baker continued in a different blog post. Gains or losses on these tests, Baker contended, do not prove that a school system is doing better “because it allowed charter schools to grow faster, or teachers to be fired more readily by test scores.” Rather, swings in results “are cohort average score differences which reflect differences in the composition of the cohort as much as anything else.”
To mock the whole idea that these test results provide grand insights into “what works” in education, Matt Chingos, writing for the conservative education policy center Education Next, had a bit of year-end holiday fun and contrived “a rigorous empirical analysis that measures the causal effect of Christmas on student achievement.” His conclusion – including the mandatory Excel graph! – that “student learning rises more or less in lock-step with the amount of holiday spending” is about as convincing as what Duncan, Rhee, and other “reform” leaders pull from the data. But that doesn’t seem to stop them.
Testing data’s absurd level of impact on the nation’s entire education endeavor would be a laughing matter if there weren’t such tragic situations occurring on the ground in schools.
Back To Reality
While the nation’s education leaders get lost in a numbers game, there’s ample evidence from real-life experiences that our children’s education destinies are becoming more endangered.
As The New York Times recently reported, “Many schools face unwieldy class sizes and a lack of specialists to help those students who struggle academically, are learning English as a second language, or need extra emotional support.”
According to the article, elementary class sizes in parts of California have swollen to 30 students and more. The public school district in Dallas, Texas, this year sought state permission for over 200 schools to increase class sizes beyond 22 students for kindergarten through fourth grade. Some high schools in Charlotte-Mecklenburg County in North Carolina have class sizes of as many as 40 students. And in Cobb County, Ga., average class sizes in fourth and fifth grades are now about 33 students.
The problem arises from the fact that “public schools employ about 250,000 fewer people than before the recession” while enrollments have increased by more than 800,000 students.
“The cutbacks have been particularly pronounced in less affluent school districts,” Times reporter Motoko Rich noted.
On nearly the same day, another New York City newspaper, The Daily News, reported on the alarming state of education services to minority students in the system. Black and Hispanic high school students are “getting stiffed,” wrote the reporter, based on data provided by the school system.
“On average, white and Asian students attend high schools with twice as many Advanced Placement courses and almost twice as many science labs compared with schools attended by black and Hispanic students.
“Black and Hispanic students also have fewer science subjects available in their high schools and fewer arts classes and rooms … They’re also less likely to have a library, medical office or gym in their school buildings.”
Similarly, a report in a Boston news outlet looked at schools in California and noted, “Hispanic students in general are getting worse educations than their white peers. Their class sizes are larger, course offerings are fewer and funding is lower. The consequence is obvious: lower achievement.”
The Times article caused education historian Diane Ravitch to write on her blog, “We hear so-called reformers proclaim about the importance of teacher evaluation, merit pay, and test scores, but I have yet to hear any of them complain about budget cuts and lack of staff for the arts, physical education, foreign languages, libraries, and so on … How are schools supposed to enact any of their proposals when teachers are stressed out with crowded classrooms?”
2014: A Chance To Change The Conversation
When the last Great Big Education Innovation called No Child Left Behind descended on America’s beleaguered schools, the intention was to address the variance in test score data among K-12 students.
NCLB was supposed to close what was, and still is, called the Achievement Gap. But it’s now widely understood that the whole enterprise was an utter failure. The best that NCLB proponents can offer is that it “woke the country” to the stark differences between the academic attainment of African American and Hispanic school children and their white and Asian peers.
But anyone who needed “awakening” then has doubtless fallen back into slumber as the country has drifted further and further into a vast sea of segregated schools and education inequality.
Rather than seeking a different course of action, reform-minded policy makers doubled down and brought us even more destructive ways to use test score data, while real experiences of students in actual classrooms – especially in our most financially strained, underserved communities – were ignored.
2014, an election year, offers an opportunity to change that conversation. The American people are ready for it.