Yesterday, I joined Seyward Darby in lauding the impressive improvement in test scores during Arne Duncan's tenure as CEO of Chicago's public schools. Dana Goldstein, however, has a useful corrective on this sort of thinking:
A major problem of the testing apparatus under No Child Left Behind is that states can make up their own standards. A report from the Center for American Progress and the U.S. Chamber of Commerce found that Illinois is in the middle of the pack when it comes to the rigor of its standards. So blogger-sociologist Eduwonkette, who works handily with statistics, looked at Chicago's performance not according to Illinois tests, but according to the National Assessment of Educational Progress. Also known as "The Nation's Report Card," the NAEP is administered by the Department of Education to students across the country and, in typical American fashion, counts for nothing, despite experts' recognition of its findings as the best benchmark we've got. Eduwonkette found that under Duncan's tenure, gaps between black and white students actually grew... This doesn't mean Duncan is a bad superintendent, or that we can't learn anything from him, or that he shouldn't be secretary of education. His leadership on early childhood education, polytechnic secondary schools, and careful growth of the charter sector is a model. But we have to be very careful when we talk about student achievement and the achievement gap, because we just don't have agreed-upon ways of measuring success and failure. Indeed, that's a major problem with NCLB that I hope Duncan will address as secretary.
So what we can actually say is that during Duncan's tenure in Chicago, student achievement on an Illinois-designed test improved and the black-white achievement gap shrank. On the national test, however, those improvements were not in evidence, and the achievement gap actually grew. Which does go to show the difficulty of accurately measuring educational achievement. Teachers aren't just bullshitting when they say that the problem with merit pay and rigorous "accountability" is that there's no agreement on how we examine this stuff in a reliably useful way. That doesn't mean we shouldn't try, but it does imply a certain humility as to the implications of the results, at least until we figure out the methodological problems.
Update: Via Dana comes a very cool site -- the product of a joint venture between CAP and the Chamber of Commerce -- that lets you compare the results, testing scores, and data collection of various states.