
Peering more closely at peer review

Research journals are examining the weak points in the traditional peer review model and experimenting with changes to the conventional way of doing things

March 2011--If there were a Good Housekeeping Seal of Approval in science, it would be peer review. Readers of peer-reviewed journals can take comfort in knowing that science reported there has garnered the endorsement of a panel of experts renowned for their scholarship in the field. Peer review connotes credibility: It’s the imprimatur of good science.

Or so the thinking goes.

But peer review also receives its share of criticism: It’s sluggish—papers can take a year or more to pass through the reviewing gauntlet before getting published. It’s a “clubby” system, favoring an insular network of reviewers. And, most seriously, it’s not flawless. Bad studies—poorly designed or incorrectly analyzed—sometimes pass peer review and get published, even in the most prestigious scientific journals.

Scientists have debated the strengths and weaknesses of peer review for decades. Recently, the issue surfaced again following the publication of a study on extrasensory perception (ESP). The authors, who reported their work in the highly regarded Journal of Personality and Social Psychology, claimed to have found evidence that volunteers could sense future events. Critics responded with sharp questions about the validity of the science and the caliber of the journal’s peer reviewers.

In the meantime, another study in the biomedical research field attracted less attention but raised similar questions. In this case, scientists analyzed the genomes of more than 1,000 centenarians to look for genetic variations that might accompany exceptional longevity. In an article published in the November 11, 2010, Science, the researchers concluded that they had identified genetic signatures of long life. However, the next day, in response to community criticism, Science published an “Editorial Expression of Concern” noting technical problems with the study’s methodology.

When faulty studies appear, it is tempting to blame peer review, says Aravinda Chakravarti, professor of medicine, pediatrics and molecular biology and genetics at Johns Hopkins School of Medicine, who is the co-editor in chief of Genome Research and serves on the editorial boards of five other journals. However, few scientists would suggest abandoning peer review. “For the most part, we have a system that works and that works well in the majority of cases,” says Chakravarti. “But there is room for improvement.”

Is overt peer review the answer?

So journal editors and scientists, including some at Johns Hopkins, are tweaking the conventional peer review model in different ways. Their efforts have found varying degrees of success.

One idea is to enlarge the reviewing pool so that more eyes see a paper prior to publication. With the global reach of the Internet, the thinking goes, the conventional protocol for peer review becomes obsolete.

Such a practice is commonplace in physics, says Jonathan Bagger, a Johns Hopkins professor of physics and astronomy. Many in his field post their studies early on an electronic server called arXiv (“archive”). Anyone can read the articles prior to publication. “I always publish first on arXiv,” says Bagger. “Then I’ll wait a couple of months, get people’s comments and questions, address those, and submit the paper to a journal. I use it as a source of community peer review.”

In 2006, the journal Nature tested a similar approach. During a four-month period, it offered submitting authors the chance to participate in an open peer examination while their papers also went through the traditional review process. Those who accepted would have their papers posted online where any reader could post comments. Few did. “Only 5 percent of authors opted to participate,” says executive editor Veronique Kiermer. “Few comments were received, most of them not technically substantive.”

Nature recently began a new initiative of post-publication review—readers may comment on primary research papers appearing in the publication. Again, the response has been low. 

Some say the biomedical community is simply too protective of its findings for open peer review to take hold. In addition, what works for physics may not be practicable for biomedical research, observes Bagger. “The physics community is very small,” he says. “And in my field, there is very little patentable work done.” 

At least one journal has studied whether training peer reviewers could improve the quality of review. Editors at the British Medical Journal inserted 14 errors involving methodology (nine major errors and five minor errors) into three papers describing clinical study results. They gave the papers to three sets of reviewers, two that had received training in spotting such errors and one group that had not received such training. The results: All groups caught only a minority of errors, and the group receiving training spotted only slightly more than the control group (three errors versus two). The journal’s editors concluded that short stints of training do not improve the quality of peer review, although longer training programs might.

Seeking better reviewers

But rather than find ways to improve the skills of reviewers after they’ve been recruited, it might be better to recruit reviewers better qualified for the task at hand. Doing that may require reaching outside of the usual pool of candidates, says Chakravarti. “We tend to use reviewers we know. We trust their judgment. But that’s how old boy networks start. Sometimes those might not be the best people.”

The Science article on genetics and aging may illustrate his point. The study turned out to have a number of problems, says Joel Bader, associate professor of biomedical engineering at Johns Hopkins who is on the editorial boards of the open-access journals PLoS One and PLoS Computational Biology. The study had a major design flaw in the genotyping setup, says Bader. In addition, its data analysis was done incorrectly. Bader explains that a genetic analysis of this sort should show that the results have a probability of just one out of one hundred million of occurring by chance. “These scientists had a more lenient threshold,” he says. “If you work through the math, their discoveries are what you’d expect from chance.” Bader suspects that the reviewers did not have the statistical expertise to spot the error.

The ESP paper, says Bader, may have suffered from a similar problem. The study involved 10 tests. Eight did not show evidence of ESP, one showed marginal evidence, and one was favorable. But a rigorous statistical analysis would show that such a result could occur by chance, according to Bader. Reviewers should have required the researchers to replicate their results, he says.
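
The arithmetic behind both criticisms can be sketched in a few lines of Python. This is only an illustration, not the analysis Bader or the papers' authors performed. The conventional 0.05 significance level, the assumption that the tests are independent, the figure of 300,000 genetic markers, and the "lenient" threshold of 1 in 10,000 are all hypothetical values chosen for the example; the article does not report them.

```python
def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance that at least one of n_tests independent tests looks
    significant at level alpha when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(n_tests: int, threshold: float) -> float:
    """Average number of tests that pass the threshold by luck alone."""
    return n_tests * threshold


if __name__ == "__main__":
    # Ten tests, as in the ESP study; alpha = 0.05 is an assumed convention.
    esp = family_wise_error_rate(alpha=0.05, n_tests=10)
    print(f"Chance of at least one lucky hit in 10 tests: {esp:.1%}")

    # Hypothetical genome scan: 300,000 markers is an assumed figure.
    markers = 300_000
    strict = expected_false_positives(markers, 1e-8)   # one in one hundred million
    lenient = expected_false_positives(markers, 1e-4)  # assumed lenient cutoff
    print(f"Expected chance findings, strict threshold:  {strict:.3f}")
    print(f"Expected chance findings, lenient threshold: {lenient:.0f}")
```

Under these assumptions, a set of 10 tests has roughly a 40 percent chance of producing at least one favorable result by luck, and a lenient cutoff applied across hundreds of thousands of markers would be expected to yield dozens of spurious "discoveries," while the strict cutoff would yield almost none.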

PLoS journals may do a better job of selecting highly qualified reviewers, Bader adds. Designed to be scientist-run publications, the journals have editorial boards made up of scientists who are active in their fields. “The editors are able to know whether a paper needs to be reviewed by a statistician or if the results are so good that they’re too good to be true,” says Bader.

The lesson from both the genetics-of-aging study and the ESP study is that statistical methods have become more sophisticated and editors should take that into account when selecting reviewers, say Bader and Chakravarti. At least one journal, Annals of Internal Medicine, has a team of biostatisticians reviewing each submission.

That’s a costly proposition that not all journals could afford. But there is something the science community could do without compounding costs, notes Chakravarti: volunteer more. Any scientist who receives grant funding should, in return, agree to serve as a reviewer when asked, he says. “It is an important part of being a scientist.”

--Melissa Hendricks