In February of 1997, when Bill Clinton made national school standards and testing a centerpiece of his second-term domestic program, it became one of the biggest applause lines of his State of the Union address. What could be more self-evident for a nation convinced that its schools, if not actually failing, were running a poor second (or third, or tenth) behind Japan, behind Singapore, behind Taiwan, behind Korea—and that if something weren't done, our economic competitors, with better-educated people and more highly skilled workers, would beat our brains out [see "Are U.S. Students Behind?"]? Tests geared to national norms—or better yet, world standards—would inform parents about how well, or how badly, their kids were really doing. No more Lake Wobegon effect; no more false optimism from local school administrators trying to look good.
Nonetheless, no one should be surprised that Clinton's proposal for voluntary national tests—reading in the fourth grade and math in the eighth—is now in deep trouble. Only seven states have committed themselves to the President's testing program, and even some of them are said to be having second thoughts. More important, Congress, pressed by both the left and right, has put a moratorium on further federal spending for national testing and blocked any field trials of the tests until at least this fall. Congress has also commissioned the National Research Council to study the possibility of creating a device that would permit the scores on the standardized tests now used in many states and school districts to be equated into a single score, thus obviating the need for any new national tests. And it has wrested to itself the power to authorize any testing program. National testing, said House Appropriations chairman Robert L. Livingston last November, has been stopped "in its tracks."
For Clinton—and for Al Gore, who might have to run on, and live with, the consequences of Clinton's program—that could be a blessing in disguise, not only because of the technical and ideological booby traps buried in the actual testing proposal and the bitter political and pedagogical fights it will generate if the tests are ever given, but because of the broader national ambivalence about tough academic demands against which those problems will resonate. Do we really want a system that is demanding, meritocratic, and characterized by high expectations in such things as college admission for students, pay for teachers, and promotions for administrators? Or would we prefer an egalitarian system of perpetual second chances and endless opportunity for all comers? We pay great lip service to standards. We become far more timid and divided when they stare us in the face.
TOUGH STANDARDS, TOUGH CONSEQUENCES
When Clinton's testing proposal ran into trouble, Chester Finn, who was assistant secretary of education in the Reagan administration, remarked that the right doesn't like anything with the word "national" and the left doesn't like anything with the word "testing." But in fact, similar battles are being waged in the states, both over testing programs and standards and over consequences. As the late Al Shanker, longtime president of the American Federation of Teachers, used to point out, having tough academic standards without consequences is pretty meaningless. Yet real consequences for children are not something Americans are very comfortable with. It is one of the glories of the American system that higher education is so widely accessible. But this very accessibility will also forever undermine the institution of really tough school standards—if the standards for advancement and acceptance are raised, a post-high school education might no longer be so accessible. What's more, higher standards could have a demotivating effect on a significant portion of the school-going population.
Dozens of states—from New York to Colorado, from Florida and Wisconsin to Texas and California—have been trying to upgrade their own standards, often in conjunction with new tests. Although some new standards, such as those in Virginia, have been widely praised, there is no way to know how well they will ultimately be accepted, nor how much difference they will make. And while the larger menu of serious academic courses gradually imposed in many states in the 1980s has almost certainly had an effect in raising academic achievement, those reforms were mild compared with what's being proposed now. As higher education standards become the educational corollary of the wave of tough-on-crime legislation that marked the late 1980s and early 1990s, the race is on to determine who can back the most demanding academic requirements with the most uncompromising consequences, regardless of the schools' ability to implement them or parents' willingness to accept the ultimate results.
In New York last November, the regents, with little study or debate, approved a regulation requiring every student to take three years of a foreign language and pass a state exam in that language in order to graduate from high school. In the face of widespread protests, the rule was quickly rescinded. People thought, said Carl T. Hayden, the chancellor of the 16-member board, "that we were bereft of our senses." But another new rule, under which every student will have to pass the same state regents exams in English, math, science, history, and social studies now required only for the 40 percent of New York graduates who receive regents diplomas—meaning generally those who intend to go to college—remains in place. Even if all students were capable of passing those tests, where would the teachers come from? (And if all passed, what sort of standard would it be?) In New York, as in other states, the full impact of the requirements will not be felt for some time: They were safely deferred, and will only go into effect for those graduating in the year 2004, by which time many of those who voted for them will, like the Washington politicians who backload budget cuts, be elsewhere. But since state education commissioner Richard Mills is already proposing to give regents exams in Spanish, Haitian-Creole, Russian, Chinese, and Korean, and since during a "transitional period," the passing grade will be 55 (on a scale of 100) instead of 65, and students will have six hours instead of three to take the exam, the drift seems clear. In the end, this may not turn out to be a toughening of standards for all students so much as a watering-down for the best.
In Michigan last year, more than half the juniors in the exclusive suburb of Birmingham, nearly all college-bound, opted out of the state's new High School Proficiency Test—they got waivers designed for those with "severe disabilities"—because they and their parents saw only risks, not benefits, in taking it. In the working-class town of Muskegon Heights, school officials urged parents of seventh graders to ask for waivers on another state test because their children were not likely to do well and would thus suffer damage to their self-esteem.
The hazards may be even clearer in California, where the legislature and governor enacted a series of bills over the past couple of years calling for statewide standards in a whole range of fields and instituting a multiplicity of state testing programs. As might be expected, there has been no end of battles between the state education establishment and the politically appointed state board of education, each of them bolstered by legions of experts, over what those standards should entail.
Should math, for example, focus on what some people call math facts—rote learning of the multiplication tables, memorization of geometric theorems and formulas—or should it include a large dose of problem solving, conceptualization, and "constructivist" math, allowing children to "discover" the basic principles of mathematics for themselves? In what grade should the use of calculators first be permitted? To what extent should math instruction in the middle and upper grades be "integrated"—meaning that algebra and geometry are taught not in the traditional way as discrete subjects (with algebra in one year and geometry in another) but combined over two or three years?
In the teaching of reading, to what extent should the focus in the early grades be almost exclusively on phonemic awareness, on phonics, on spelling, and on the other structural basics involved in decoding words and sentences, and to what degree should it tolerate, perhaps even encourage, invented spelling and reliance on context, including pictures and other "whole language" devices, in order to foster reading for pleasure and understanding as soon as possible?
The thoughtful people in the various disciplines argue endlessly, and quite reasonably, that these are not either-or propositions. But sweet reason rarely informs these disputes between true believers in what often seems more like a religious war than a disagreement over academic emphasis. Regardless of who prevails, the other party will wage guerrilla war on its policies—in the legislature, in the classroom and in the schools of education, and sometimes in the courts—until the climate changes. In the early 1980s, the pendulum swung toward the basics; in the late 1980s, it swung toward comprehension and problem solving. Now it swings back. Plus ça change. Recently in California, an official state commission proposed a new set of elementary school math standards (which were touted as "world-class" because they were allegedly copied from those of Japan and Singapore, this year's fashion plates in education). When the conservative state board of education, which is supposed to have the last word in the process, tinkered with the standards to make them more precise and testable, the elected state superintendent of schools, a liberal Democrat, issued a loud public letter accusing the board of "dumbing [the standards] down." In the end, of course, the board does not have the last word. It is the bureaucrats, the teachers, and the parents. The longest distance in the world is between an official state curriculum policy paper and what goes on in a child's mind.
Not surprisingly, similar battles are being fought over the testing programs. To what extent should the test be an "authentic assessment" focusing on problem solving, constructed answers to open-ended questions, and other "performance-based" measures—student portfolios, including artwork and essays done during the course of a class—and to what extent should it comprise only "objective" multiple-choice fill-in-the-bubble answers to specific questions? Should the testing programs generate individual scores, as well as average scores for schools, districts, and perhaps individual classrooms? Or should testing be strictly a diagnostic instrument to be used by teachers and parents to determine a student's strengths and weaknesses? Should the test be based only on the standards established for that state or district, or should it be founded on broader "world-class" criteria? Should it be given only in English, or should it also be given in the primary language of children who have been in this country three years or less, or five years or less? And who, besides the severely handicapped, should be excused?
In the battle over Clinton's testing program last year, the last two questions were major issues for organizations like MALDEF, the Mexican-American Legal Defense and Education Fund, just as they are issues in states like California, New York, and Texas, with their large and growing proportions of limited-English-speaking children. A number of big-city districts, among them Los Angeles, Houston, and El Paso, have announced that they will not participate in Clinton's reading test because it will not be given in Spanish. In California, where a new state law requires that all students be given a battery of standardized tests beginning in the spring of 1998, a group of urban school superintendents threatened to sue the state in the federal courts, charging that the test discriminates against minorities and violates federal civil rights laws. Meanwhile, organizations like FairTest in Cambridge, Massachusetts, argue—sometimes with considerable success in the courts or before civil rights agencies—that virtually all objective, short-answer tests, from the SAT down, don't measure what they pretend to measure and therefore are inherently unfair and distort teaching and curricula.
DEFINING THE SAT DOWN
Education reform is in fact heading simultaneously in diametrically opposite directions. While the political push in the K-12 schools is toward more conservative "tough" standards, with the tests to back them up, public universities are moving away from test-based criteria in their admissions process, or have already done so. The galvanizing force in both directions is the raft of new legal prohibitions against the use of race-based affirmative action in university admissions, either because of state law and governing board policies (as in California, where both Proposition 209, passed by the voters in 1996, and a regents decision adopted in 1995 now bar all race-based criteria) or because of a federal court order (as in Texas, where a three-judge panel of the Fifth Circuit Court of Appeals, in Hopwood v. Texas, did the same thing).
In both Texas and California, affirmative action had been written into numerical admissions formulas—generally a combination of grade point average and test scores—to mitigate their negative effect on the ability of blacks and Latinos to get into the more selective public institutions such as Berkeley, UCLA, and the University of Texas Law School. In Texas's case, the law school had created a whole separate admissions system to get minority enrollment up.
But once these states were precluded from considering race, they were confronted with rapidly declining minority enrollments in their most selective institutions—what College Board president Donald M. Stewart called "a potential wipeout that could take away an entire generation." Thus both states have embarked on a search for other means of maintaining ethnic diversity. In Texas, the governor and legislature enacted a law last year requiring the state university system to accept the top 10 percent of the graduates of all Texas high schools, regardless of their SAT scores, thereby ensuring that at least some students from heavily Latino high schools, for example, will gain admission. California, whose ethnic politics vis-à-vis Latinos in recent years have been far less tolerant, and whose premier state university campuses are far more selective, is not likely to go that same road. Nonetheless, similar legislation, requiring the University of California to admit the top graduates of every California high school, regardless of their records and regardless of the academic rigor of the school, is already pending. Meanwhile, official UC committees are searching for ways to change the admissions formula, not to raise standards, but to preserve diversity.
In virtually every case, downgrading the SAT is high on the list of possible options. In the fall of 1997, when the official UC president's Task Force on Latino Eligibility urged the university to drop the SAT altogether, UC president Richard Atkinson called it "an interesting" proposal worthy of consideration. (Among other things, the task force said, the change would increase the number of Hispanics eligible for UC enrollment by about 50 percent.) And while UC is not likely to drop the SAT altogether, Berkeley's highly selective Boalt Hall Law School, faced with a sharp drop in minority enrollment (not to mention the threat of a federal civil rights investigation), has already ended its practice of weighting applicants' grade point averages according to the academic competitiveness and rigor of the institution that the applicant attended. Henceforth, a B in cross-cultural studies from a state teachers college is worth the same as a B in math or economics from MIT. Thus while one set of state educational institutions is laboring mightily to develop a tough testing program, another is looking for ways to downgrade the consequences of weak grades and low test scores. FairTest, which keeps track of such things, now counts some 280 colleges that don't require the SAT for admission, a number that has increased by nearly a third during the past three years.
There are other signs of the dilution of standards. Two years ago, under pressure from a federal civil rights complaint, the College Board revised the PSAT (the Preliminary Scholastic Aptitude Test), the junior version of the SAT, principally by adding an "expanded" writing skills section in order to raise the scores of the girls who take it and thus increase the percentage who get National Merit Scholarships, which are largely based on PSAT scores. Meanwhile, the Educational Testing Service, which administers the SAT and many other university admissions tests, and which used to defend them vehemently, is issuing statements warning against overreliance on the tests and explaining that "equating scores with merit . . . supports a mythology that is not consistent with the reality of data."
At the same time, almost unnoticed in yet another arena, a group of minority educators in California is in a federal appeals court pursuing its challenge against a state law requiring all public school teachers and administrators to pass CBEST, the California Basic Educational Skills Test. Requiring that teachers pass the test is supposed to ensure that those who go into the classroom are at least as proficient in math, reading, and writing as the average tenth grader is expected to be. The plaintiffs in the case, among them some veteran school administrators who are not too embarrassed to disclose that they failed the test (which most eighth graders can probably pass) six or eight times, assert that the test has a disparate impact on minorities. Because it is not precisely job related—presumably guidance counselors don't need to know elementary algebra and gym teachers don't need to read much—they claim it's discriminatory and violates federal law. The state, arguing that teachers should, at the very least, not be models of ignorance in the things all students are supposed to master, has already prevailed in a lower court. But the fact that the challenge is still being pursued after nearly a decade of litigation ought to give some indication of how hard it is to upgrade anything in American education.
Predictably, admissions officers are searching for alternative ways of assessing university and professional-school applicants. But in an era when both policymakers and the courts are systematically rolling back the race-based criteria that had been used to get around the large gaps separating black and Latino test scores from white and Asian scores, the real issue is that virtually any set of academic criteria that produces an undesirable social result is likely sooner or later to come under attack as unfair or inadequate.
Moreover, virtually any test will run into pre-existing divisions between the party of hard-nosed phonics, math facts, and other basics, and the liberal reformers, particularly those in teacher-training institutions and states' departments of education, who believe that only open-ended questions, creative answers, and other performance-based assessments provide a true picture of a student's ability.
These issues are compounded further by the administrative difficulties inherent in any large-scale high-consequence testing program—the dangers of widespread cheating, the question of who may be excused, the cost of producing and testing multiple forms of the test to increase security and make certain that next year's students (or those who take the make-up exam this year) can't be drilled with specific questions and answers. These are not hypothetical problems. Almost every day brings yet another report about some breach of test security—in some cases just an individual teacher or principal changing test results; in some, as on the qualifying exam administered by the Educational Testing Service that is given to would-be school principals in Louisiana, wholesale cheating by hundreds of candidates over long periods of time. That the cheaters were all people aspiring to be leaders in the state's public school system made the case, reported in extensive detail by the New York Times last fall, all the more telling.
Judging from the battles of the past decade, it appears that, on the national front anyway, we're stuck in an endless cycle. In the abstract, the notion of higher standards and better testing generally promoted by educational conservatives is widely embraced: Who can be against higher standards and more reliable tests? But as the tests are developed and actually given, the tenor changes. The current fight over Clinton's standards and tests comes on the heels of the fight over the development of federally sponsored model content standards in various fields, from English to history to mathematics, prompted first by the Bush administration in the late 1980s and early 1990s. As soon as some of those standards were published—the history standards produced by Gary B. Nash and his associates at UCLA were the most notorious example—they were loudly repudiated by the very people who had first sponsored them. It was Lynne Cheney, chairman of the National Endowment for the Humanities, who had strongly encouraged and helped fund the history standards. As soon as a draft appeared, Cheney, soon followed by a virtually unanimous U.S. Senate, denounced them as fatally tainted with a political revisionism that devoted more attention to the depredations and minority victims of American history and expansion than to American achievements and heroes. The draft standards for English were so suffused with jargon and platitude that the federal government stopped the funding before the project was finished. Ultimately the history standards were revised to restore more traditional elements, but by then the fight had fatally undercut the attempt to craft national standards in this fashion. It was Cheney who in 1991 urged the development of national tests in science, history, and other major subjects; it was her fellow Republicans in Congress who led the charge against even the mild version of testing that Clinton proposed last year.
COMING FULL CIRCLE
With Clinton's proposal for national tests—and thus standards—in reading and math, the circle is coming around again; indeed, it's in its second phase, since national standards had also been at the heart of Clinton's all-but-forgotten Goals 2000 program (which was itself an echo of a much-ballyhooed Bush administration initiative). The tests are supposed to be based on the National Assessment of Educational Progress (NAEP), a set of exams in various fields—from math and science to reading and history—that have been given to a sample of American students for the better part of 30 years. Launched in the Kennedy-Johnson years, NAEP was to give the nation a report card on how well, on average, its students were doing. Over those years, it developed a reputation as the gold standard of testing, and its results were widely cited in the larger debate over the performance of U.S. schools. But NAEP was never designed to provide individual scores, and was never pegged to any "world-class" standard. Indeed, because of fears of federal meddling in state and local curricula, the initial NAEP frameworks were purposely mushed so that no one could ever charge that they might be the first step toward a federal curriculum.
In the years since, the tests have been revised and tinkered with (most recently to provide average state scores in reading and math), although the full test has never been made public. NAEP has thus become, in the words of one critic, "an ever-expanding black box with contents that are thoroughly understood by an ever-shrinking number of specialists."
But if the new tests are developed along the NAEP model, and if their full contents are disclosed each year after they're given, as Clinton proposes, it may not be just conservatives who will see red. Because some of the items ask students to disclose personal experiences and feelings—for example, "How is this story like or different from your own personal experience?"—and because the framework on which the test is based is full of whole language assumptions about good readers having "positive attitudes about reading and positive self-perceptions of themselves as readers," a lot of people may wonder what such items have to do with the testing of reading skills.
For many teachers and parents, such questions and assumptions may seem all too familiar. But in California five years ago similar questions led to a loud public battle and, ultimately, to the scuttling of an ambitious new performance-based state testing program called CLAS. Conversely, any test composed predominantly of hard-nosed, fact-based questions will generate quick opposition from people like Bob Schaeffer of FairTest who regard such items not only as inadequate measures of student achievement, but as a regressive force pushing public schools back to a basics-only menu of rote learning. Meanwhile, many testing professionals worry that if the Clinton program crashes as CLAS did, it may not only destroy the credibility of NAEP but undermine confidence in testing generally. Last year, the worry was great enough that the test publishers, who one would expect to benefit hugely from a national program, worked diligently behind the scenes to get Congress to delay or even stop it. Clinton and Education Secretary Richard Riley, in their efforts to generate support for the testing proposal, still argue that there is nothing inherently controversial about it. "Reading is reading," they often say, "and math is math." But in America, you'd better not bet on it.