On January 1, the National Institutes of Health (NIH) opened an electronic archive on the Web that is intended, eventually, to house or link all biomedical research produced in the United States. But PubMed Central, as the archive is known, has drawn fire from leading figures in academic medicine for threatening to disrupt the established methods of evaluating research for publication. The controversy over PubMed Central is a tale of new technology versus old, of innovation and inertia--and of knowledge in the age of the Internet.
The controversy began last June almost immediately after Harold Varmus, then director of NIH, proposed the new archive as "an electronic public library" that would give medical professionals and the general public "absolutely free access to the entire repository of information, with no toll booths, no hesitation." The National Library of Medicine, which is part of NIH, has for years provided online information, including the vast bibliographical database MEDLINE. The new archive would provide access to the full text of articles and research reports even before they reached print, and it would exploit the multimedia capabilities of the Internet. As Varmus put it, the new electronic literature would be "responsive to changes in the way we do science, with images there on the screen, movies, large data sets, the ability to move from one article to another, to take other people's data and look at it from your own perspective."
This vision was enormously appealing to many in the medical community, and in fact, other groups of scientists--notably theoretical physicists--have had access to research via the Internet for some time. But the idea provoked fierce criticism from America's leading medical journal, The New England Journal of Medicine, which in the title of an editorial characterized the original proposal for PubMed Central as "A Potential Threat to the Evaluation and Orderly Dissemination of New Clinical Studies." According to the Journal, one size simply does not fit all in the sciences. Doctors, unlike physicists, are responsible for patient care; their orientation is practical, and they are not especially sophisticated about research methods. Consequently, any plan to archive clinical studies without submitting them to stringent peer review would set the stage for clinical disasters. "The best way to protect the public interest," argued the Journal, "is through the existing system of carefully monitored peer review, revision, and editorial commentary in journals."
Although this criticism has led to modification of the original conception of PubMed Central, the controversy persists. The question that particularly divides critics and enthusiasts of PubMed Central is the dissemination of unreviewed research results. But the larger issue has to do with new possibilities for advancing knowledge--and whether traditional institutions can adapt to them.
The Open-Source Way
The spread of the Internet has made electronic publishing an irresistible force in field after field of intellectual endeavor. By allowing for rapid exchange of information, disseminating crucial resources cheaply, and facilitating peer review, electronic publishing naturally supports collaborative work among people who are physically separate from one another.
Software itself is an example. The Internet has been the host for the highly collaborative method of software development known as "open source." Instead of jealously guarding source code (the human-readable instructions that make up software) in a proprietary way, open-source programmers put their work online to encourage other programmers to work on it with them. As the open-source slogan has it, "Given enough eyeballs, all bugs are shallow." Or, as Microsoft put it in an internal memo that flatters open source as much as its most ardent advocate could wish, "The ability of the OSS [open source software] process to collect and harness the collective IQ of thousands of individuals across the Internet is simply amazing."
Open source is both a child and a parent of the Internet. The Internet provides the means for open-source programmers to compete with proprietary software producers. And crucial chunks of the Internet itself--operating systems, server software, mail programs--are open source.
The lessons of open-source software apply to other uses of electronic publishing. If a given field of study puts all or most of its work online where it is available for review, criticism, and development, that field has adopted an open-source style of work. And if this occurs in many fields simultaneously, it makes sense to think of open-source software not as a special case, but as a sign of a change in knowledge as a whole, a movement toward electronic collaboration. You don't need to cave in to source-code mysticism to note that something like this is occurring. Consider the following:
In 1991 Linus Torvalds put source code online that was to mature into the Linux operating system, one of the most conspicuous successes of the open-source method. Though this act has been elevated to the status of a digital creation myth, Torvalds later remarked that he had no idea at the time that he was doing anything special. He was simply allowing fellow programmers to inspect his fledgling efforts. But as fellow hackers e-mailed back fixes and extensions, Torvalds soon realized, "Wow, not only did people want to see the source, but it worked extremely well as a development model."
That same year, Tim Berners-Lee, a software engineer and physicist in Switzerland, developed the software for what was to become the World Wide Web and distributed it to fellow scientists at the Conseil Européen pour la Recherche Nucléaire (CERN). As Berners-Lee has put it, the world of science at CERN was a babel "of incompatible networks, disk formats, data formats, and character-encoding schemes, which made any attempt to transfer information between computers generally impossible." Berners-Lee hoped the Web would spell an end to "an era of frustration," and it soon went global. Because Berners-Lee consistently fought to keep Web protocols open, the Internet could provide "the feedback, stimulation, ideas, source-code contributions, and moral support that would have been hard to find locally. The people of the Internet built the Web, in true grass-roots fashion."
The year 1991 was a milestone in the use of the Internet by scientists for another reason: That was the year Paul Ginsparg, a physicist at Los Alamos National Laboratory, started an electronic archive that quickly became the central database for work in physics. "By permitting more rapid dissemination of results, and by facilitating collaboration at a distance," Ginsparg argues, the archive at Los Alamos has accelerated progress in physics. It is now being expanded to include mathematics and other disciplines.
Los Alamos's model was a direct influence on PubMed Central. It was only a few months after hearing Ginsparg talk about the physics site that Varmus made his proposal, only to run into criticism that it would undermine the peer review process. "Most often," Marcia Angell, editor in chief of The New England Journal of Medicine, explains:
reviewers disagree with each other, then [we go] back to the authors for revision, then back to the reviewers again and so forth. And these revisions are not cosmetic, they are substantial. Sometimes they mean studies have to have a longer follow-up, sometimes more patients or a different control. Occasionally, this means the conclusion is totally different. We will get a paper that initially says pill A is better than pill B and in the end the conclusion is, we can't find any difference. In our view the study isn't completed until it's been through this process.
As originally conceived, PubMed Central was going to tap into the peer review process, but in a way that would have been catastrophic for the journals in their current form. Varmus's initial idea was to request clinical studies from the journals at the point where they had survived peer review but were still up to 10 weeks away from appearing in print. In effect, PubMed Central wanted to scoop the journals--and with their own material. As Angell puts it, "Why, then, would anybody subscribe to our journal? So we die, we die. We do this charitable function of review, and then we go out of business."
Angell believes that Varmus, who won a Nobel Prize for work on cancer-causing genes, tends to see the biomedical community in his own likeness, as a community of scientists in which researchers and their readers have the same expertise. This identity of author and reader was a precondition that Ginsparg himself had put forward when attempting to define what fields were most suitable for an archive like that of the physics site at Los Alamos. When "author and reader communities (and consequently, the referee community as well) essentially coincide," he wrote, "free electronic dissemination of unreviewed material" had the best chance of succeeding. In such cases, readers would need no referee; they were capable of reviewing material on their own. But, as Angell points out, the articles in medical journals use statistical and other research techniques that most practicing physicians are ill-equipped to judge.
A second draft of PubMed Central went a long way toward meeting this objection. Submissions would no longer be solicited from journals at the point of acceptance, but only when published or later. This part of the archive would draw on the expertise of participating journals--and participation was always understood as voluntary--without undermining the journals. But the second draft carried over a provision for a second tier of unreviewed studies quickly glossed for appropriateness and stored in a clearly defined section of the archive known as PubMed Express. This second tier, according to Varmus, would make available "reports that currently don't make their way into the public domain: negative outcomes of clinical trials, gene therapy efforts that don't succeed, and other kinds of clinical data sets that are too large for current publishing." Varmus acknowledges that "unreliable" information would get into this part of the archive, but he judges scientists, doctors, and the public competent to observe well-marked borders between reviewed and unreviewed material.
Though not yet implemented, PubMed Express incites strong opposition in some quarters. For NIH to make available this second tier of material is "irresponsible," argues Angell, who worries that drug companies will soon appropriate it for studies favorable to their products. The pharmaceutical industry, however, already finances much of the research in medical journals, which are not much of a barrier to promotional efforts. Indeed, a recent article in The Journal of the American Medical Association showed that drug companies routinely skew a statistical technique known as meta-analysis by publishing studies favorable to their products multiple times, merely changing the names of authors. JAMA's proposed solution, a central online "meta-registry" nailing down the publication history of all clinical trials, is the kind of remedy that an archive like PubMed Central would be well-suited to implement.
In an interview, I suggested to Angell that it would not be long before medical professionals began to sift through and evaluate unreviewed material. She replied, "You are saying, 'This is a problem, and surely there will be a solution to it.' But why have the problem? Why introduce a problem if then you have to wait for a solution?" I responded, "I think you are underestimating the power of the Internet." She countered quickly, "That's like saying I'm underestimating the power of the universe." And I had to laugh with her at how lame my Internet remark had been. Later I was to reflect that it wasn't so lame after all.
Charlotte Bell and Keith Ruskin, two anesthesiologists at the Yale University School of Medicine who promptly objected to the Journal's attack on PubMed Central, have already shown what electronic publishing can do for medical research. Bell and Ruskin direct a Web site called GASNet for doctors in their field. With over 1.5 million hits a month and the number increasing rapidly, GASNet is becoming the kind of resource for anesthesiology that Ruskin, who founded the site in 1993, hopes PubMed Central will become for medicine as a whole. The site provides a wide variety of information and educational resources, and tries to take maximum advantage of bandwidth--it offers real video and audio--while remaining accessible to "the slowest setup."
As Bell and Ruskin see it, the future of medical research is going to be electronic. If doctors don't take steps to shape the new technology to serve patients and physicians, control will be seized by commercial interests--"pharmaceutical and biotechnology companies with virtually unlimited resources [that] are already using principles of marketing and advertising, rather than those of science and scholarship, to disseminate biomedical information."
Bell cites video and virtual reality (VR) as two obvious advantages of an evolving electronic delivery system. "Why should a doctor review how to do an operation in a textbook?" she asks. "He should see the operation. We can do it with VR gloves so he can see it and feel it with his fingers the night before he operates. That's a whole new world! That's how we should be teaching medicine." Bell does not worry about the unreviewed section of PubMed Central becoming a depository for junk science or biased work. "You are so much more vulnerable if the whole world can see and review your work," she said. "PubMed Central is a forum where you get ultimate peer review." Moreover, the new electronic medical libraries have global reach; 20 percent of hits to GASNet come from developing countries.
Open Source and Medicine
The open-source approach to business organization--giving away the final product, source code--may appear to be self-defeating. But rather than charge for the software itself, open-source programmers and distributors charge for the documentation and support--in a word, the expertise--they are best equipped to deliver. Astonishingly, this model has caught fire on Wall Street, where initial public offerings by open-source distributors such as Red Hat and VA Linux have set new records.
When information gets ever freer--and it's useful, in this context, to think of both source code and biomedical research as information--knowledge becomes the crucial add-on. In this sense, Linus Torvalds, Paul Ginsparg, Tim Berners-Lee, and Harold Varmus are all engaged in the same enterprise, freeing up information electronically and making it easier to access, share, and build upon. The open-source business model takes explicit advantage of this dynamic. So could biomedicine. As digital networks develop, the role of the major medical journals as the exclusive purveyors of certain kinds of data may well become obsolete, but their role in framing and interpreting the data will be ever more in demand. Therein lies a viable survival plan, if the journals' editors and owners see the possibilities.
GASNet shows how it can be done. It will electronically publish material that passes its peer review process and then submit the material to PubMed Central, which will make it available in the central archive. At the same time, GASNet will expand the educational and editorial role that has made it so popular among anesthesiologists. Other electronic journals will no doubt take a slightly different tack, mining the PubMed Central archive for material that has been overlooked. The New England Journal of Medicine still has "no immediate plans" to participate in PubMed Central. But at a time when electronic publishing is breaking down walls between scientists, it remains to be seen how long any medical journal will be able to stay behind its own walls.