Astronomer and longtime professor David J. Helfand has taught scientific habits of mind to generations of undergraduates at Columbia University. His first book, “A Survival Guide to the Misinformation Age: Scientific Habits of Mind,” draws on his teaching to elucidate key concepts of data analysis and quantitative reasoning useful to journalists in writing about science and evaluating scientific papers.
“Building a Cathedral”
Science is neither a collection of unquestioned facts nor a simple recipe for generating more facts. Rather, it is a process of inquiring about Nature, and given that Nature is not only much bigger than humans but has also been around a lot longer, it should come as no surprise that we haven’t finished the job: there is no complete scientific description of the universe and all it contains. The gaps in our knowledge are what make science so engaging.
Science is an intensely creative activity, but it differs in several important ways from other creative human activities such as art, music, or writing. Science is teleological. It has a goal toward which it strives—an ever-more accurate and all-encompassing understanding of the universe. Artistic endeavors do not share such a goal. Sculptors since the time of Michelangelo have not been striving to improve upon the accuracy of his sculpture of David. Writers have not been searching for four hundred years to find arrangements of words more universal than “A rose by any other name would smell as sweet.” Scientists have, however, been working steadily on Galileo’s ideas of motion, expanding their scope and improving their accuracy to build a more comprehensive model of motion through space and time.
Science and art also differ in regard to the role of the individual. Artistic creativity largely expresses one person’s vision. Contrary to some caricatures, science is a highly social activity. Scientists have lots of conferences, journals, websites, and coffee hours—they are constantly talking and writing, exchanging ideas, collaborating and competing to add another small tile to the mosaic of scientific understanding. It is from the conventions of this social web that the self-correcting nature of science emerges.
At any given moment, many “scientific” ideas are wrong. But as the last 400 years have shown, science as a whole has made tremendous progress. The reason for this progress is that wrong ideas in science never triumph in the end. Nature is always standing by as the arbiter and, while the aether may have survived for more than two millennia, as soon as the physical and mathematical tools were in place to measure its properties, its absence was readily discovered and accepted.
The enterprise of science has developed several habits and techniques for enhancing the pace of correcting false ideas. Perhaps foremost among these is skepticism. Although many people regard skepticism rather negatively, it is a scientist’s best quality. Indeed, it is essential to be skeptical of one’s data, always to look for ways in which a measurement might be biased or confounded by some external effect. It is even more essential to be skeptical of one’s own models and to recognize them as temporary approximations that are likely to be superseded by better descriptions of Nature. When one’s data agree with one’s model, euphoria must be tempered by a thorough and skeptical critique of both.
When the results of one’s experiment or observation are ready for public display, community skepticism springs into action. The description of one’s work in a scientific publication must provide enough detail to allow other scientists to reproduce one’s results—they’re skeptical and want to make sure you did it right. The published article itself is the result of a formal skeptical review by one or more referees chosen by journal editors for their expertise in a certain field and their objective, critical, skeptical approach. These days, instant publication of results via the Internet often precedes formal publication in a journal, exposing the author to dozens or hundreds of unsolicited, skeptical reviews from those who scan new postings each morning.
All this skepticism screens out a lot of nonsense right at the start. Furthermore, it optimizes the ability of both the original author and other scientists to root out errors and advance understanding. The constant communication through both formal and informal means rapidly disseminates new ideas so they can be woven quickly into the fabric of our current models, offering further opportunities to find inconsistencies and to eliminate them. This highly social enterprise with this highly skeptical ethos is central to the rapid growth of scientific understanding in the modern era.
My celebration of skepticism, emphasis on falsifiability, and insistence on the temporary nature of models should not be misconstrued as supporting the popular notion that science consists of an endless series of eureka moments. The news media are committed devotees of this false view. Each week, the science section of the New York Times needs half a dozen “news” stories, and if they can use words like “stunning,” “revolutionary,” and “theory overturned” in the headlines, so much the better. Scientists are complicit in this misrepresentation, all too easily using phrases such as “breakthrough,” “astonishing,” and the like, not only when talking to reporters but even when writing grant proposals and journal articles. Some philosophers of science share the blame by concentrating their studies on “paradigm shifts” and “scientific revolutions.” Science isn’t really like that.
Science is much more like building a cathedral than blowing one up. Thousands of hands place the stones, weave the tapestries, tile the frescoes, and assemble the stained-glass windows. Occasionally, a new idea might require the disassembly of some work already completed—invention of the flying buttress allowed the walls to go higher, so a new roof was needed. Very infrequently, on timescales typically measured in centuries, a genuinely new conception of the cathedral’s architecture emerges. While a major supporting wall or facade may need to be removed, we use many of the stones again, rehang some of the old tapestries, and always enclose most of the old building within the new. Our cathedral gets larger and ever more ecumenical, drawing a greater swath of the universe within its doors as the weaving, the tiling, and the stonemasonry go on. It is extraordinarily gratifying and important work.
Examples of Bad Science
Vested corporate interests and the politicians who serve them, biblical literalists, and misguided consumers are not the only generators and consumers of bad science. It is also produced by people who wear the mantle of science but have had little experience in scientific research—most notably medical clinicians—and even by full-fledged research scientists themselves.
The largest amount of bad science being produced today—and probably that with the largest impact—comes from small clinical and preclinical studies (e.g., the use of mouse models or cancer cell cultures) in hospitals, universities, and medical schools. The quality of this work is compromised by small sample sizes, poor experimental designs, and weak statistical reasoning; add a healthy dose of publication bias and the results are truly appalling.
Dr. C. Glenn Begley was head of global cancer research at the biotech company Amgen for a decade. Over that period, he selected fifty-three articles with supposedly “landmark” status (major results, trumpeted as such by the journals that published them and the media that reported on them) and assembled an Amgen team to attempt to reproduce the results. They found that forty-seven of the fifty-three studies (89 percent!) were irreproducible—the results were simply wrong. More disturbingly, when the Amgen team contacted the papers’ original authors in an attempt to understand the discrepancies, some would cooperate only on the condition that the Amgen scientists sign confidentiality agreements forbidding them to disclose any data that cast doubt on the original findings, thus ensuring that their phony results would stand in perpetuity in the scientific literature.
A year earlier, a team at the big pharmaceutical firm Bayer AG attempted to replicate forty-seven different cancer study results and found more than three-quarters of them to be irreproducible. The article that reported this result was appropriately titled “Believe It or Not.”
The vast majority of these articles are not examples of deliberate fraud, however. As Begley and Lee Ellis state:
The academic system and peer-review process tolerates and perhaps even inadvertently encourages such conduct. To obtain funding, a job, promotion or tenure, researchers need a strong publication record, often including a first-authored high-impact publication. Journal editors, reviewers and grant-review committees often look for a scientific finding that is simple, clear and complete—a “perfect” story. It is therefore tempting for investigators to submit selected data sets for publication, or even to massage data to fit the underlying hypothesis.
The conclusion is inescapable: this work represents bad science. It fails the values of skepticism and disinterest, as well as the authors’ responsibility to attempt to falsify their results. Indeed, I would go so far as to say it is not science, and the journal editors and funding agencies and faculty committees that allow it to persist are simply feeding the antiscientific and anti-intellectual tendencies of society to the peril of us all.
The systematic mess discussed above is, we are told, largely a matter of individual misunderstanding of what true scientific habits of mind entail, compounded by the social pressures of the academic system. Worse still is outright fraud. It is often attributed to the same academic pressures—the need for jobs, grants, and promotion to tenure—and there is little doubt that the number of incidents of outright fraudulent publications is growing.
A study by F.C. Fang, R.G. Steen, and A. Casadevall in the Proceedings of the National Academy of Sciences examined all 2,047 papers that had been retracted after being published and listed in the National Library of Medicine’s MEDLINE database before May 3, 2012. They found 436 cases of errors, some caught by the original authors and some caught by others; given that the database contained roughly twenty million publications at the time, this is an error rate most occupations would envy (although there is no reason to think that every error has been caught, given the reproducibility studies cited previously). More disturbingly, they found 201 cases of plagiarism and 291 cases of duplicate publications (sometimes called self-plagiarism); both instances represent unacceptable departures from ethical norms in the academic world (scientific or otherwise). The largest fraction of papers retracted, however—888 (43.4 percent)—were because of outright fraud or suspected fraud. Based on these data, the authors claimed that this percentage had risen ten-fold since 1975.
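As a quick sanity check on the quantitative reasoning here, the key fractions from the Fang, Steen, and Casadevall study can be recomputed directly; the twenty-million figure for the size of the MEDLINE database is the approximate value given above, not an exact count:

```python
# Recomputing the retraction statistics quoted from the PNAS study.
retracted_total = 2047       # all retracted papers in MEDLINE before May 3, 2012
errors = 436                 # retractions due to error
fraud_or_suspected = 888     # retractions due to fraud or suspected fraud
medline_papers = 20_000_000  # approximate size of the database at the time

fraud_share = fraud_or_suspected / retracted_total
error_rate = errors / medline_papers

print(f"Fraud share of retractions: {fraud_share:.1%}")  # ~43.4%
print(f"Error rate across MEDLINE:  {error_rate:.4%}")   # ~0.0022%
```

The second number makes the author’s point concrete: documented errors amount to roughly two in every hundred thousand indexed papers, the “error rate most occupations would envy.”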
A separate study, “A Comprehensive Survey of Retracted Articles from the Scholarly Literature,” also published in 2012, went beyond the biomedical sciences to include forty-two major bibliographic databases across academic fields. The authors of this study documented 4,449 retracted articles originally published between 1928 and 2011. The headline-grabbing nugget from the study’s abstract was that the number of articles retracted per year had increased by a factor of 19.06 (a number clearly reported to too many significant digits) between 2001 and 2010; the increase dropped to a factor of eleven when repeat offenders and the overall growth in papers published were included in the calculation.
While all incidents of scientific fraud are abhorrent, none of the media coverage noted the tiny fraction of papers involved and the even smaller fraction of scientists. Repeat offenders are responsible for a significant fraction of retractions; for example, most of the 123 papers retracted from Acta Crystallographica Section E were attributable to two authors. And of the 76,644 articles published in the Journal of the American Chemical Society between 1980 and 2011, twenty-four (0.03 percent) were retracted; the numbers for Applied Physics Letters were even better: 83,838 papers, of which fifteen (0.018 percent) were withdrawn (not all for fraud). If all our social enterprises had a failure rate of less than 0.1 percent and upheld ethical norms 99.982 percent of the time, it is likely the world would be a better place.
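The journal-level percentages above are easy to verify from the counts given; the small sketch below simply redoes that arithmetic, including the 99.982 percent figure, which is just the complement of the Applied Physics Letters withdrawal rate:

```python
# Verifying the per-journal retraction rates quoted in the text.
jacs_total, jacs_retracted = 76_644, 24  # J. Am. Chem. Soc., 1980-2011
apl_total, apl_withdrawn = 83_838, 15    # Applied Physics Letters

jacs_rate = jacs_retracted / jacs_total
apl_rate = apl_withdrawn / apl_total

print(f"JACS retraction rate: {jacs_rate:.3%}")        # ~0.031%
print(f"APL withdrawal rate:  {apl_rate:.3%}")         # ~0.018%
print(f"APL papers standing:  {1 - apl_rate:.3%}")     # ~99.982%
```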
The impact of scientific fraud (which again is different from the bad science described previously) on the overall scientific enterprise is minuscule: its incidence is very low, much of it is found in relatively obscure journals and uncited papers, and the self-correcting aspects of the enterprise are good at rooting it out. Much greater damage can be done to society at large, however; an example, the infamous Andrew Wakefield paper on the measles, mumps, and rubella vaccine, will be explored further. Furthermore, fraud damages the reputation of the scientific enterprise. It can be used as an excuse to dismiss any inconvenient scientific results and can lead to a decrease in support for public research funding and a general rise in anti-science attitudes.