Fixing Our Broken Peer Review Process
The peer review process has some serious problems, but there are ways that we can fix it
In the latest issue of “Works in Progress” magazine, Saloni Dattani has an interesting article about how to improve the peer review process in academic research. She identifies several problems that have received attention in recent years, including instances of academic fraud that have been uncovered, as well as long publication and review times that seem to be slowing down research. She suggests a few ways to improve the process, including making it easier to submit to multiple journals at a time and leveraging crowdsourcing techniques on the internet or social media to solicit more and faster feedback on articles.
I think Saloni’s criticisms are valid and her solutions could help too. Ben Southwood also makes some excellent points in a response post to Saloni’s. I thought I would offer some of my own thoughts on this topic, which mostly center on how good ideas reach the intellectual forefront, and how not-so-good ones get entrenched instead.
Good ideas don’t necessarily triumph
I should probably start this discussion by saying I do think there is a constructive role for peer review to play in academic research. For a few years, I oversaw the Mercatus Center’s research pipeline on regulatory research. I probably had a few hundred papers pass across my desk during that time and I read almost all of them, along with the corresponding reviewer feedback, and saw how papers evolved across drafts. I saw numerous instances where drafts would come in sloppy and incomplete, but after a round of reviewer feedback they would be transformed into publishable academic papers. Constructive overarching feedback and careful line-by-line criticism can be immensely valuable. There was a time when I would not have put out anything publicly before receiving such feedback. Peer review won’t catch every mistake, but if an author has failed to consider something important, a reviewer will often identify it.
In my experience, a problem arises when a reviewer encounters something that is outside their existing domain of experience. Peer review is fundamentally a process that reinforces an existing consensus. Journals are worried about papers getting through that might harm their reputations. Reviewers are there to make sure authors don’t say anything that violates the consensus. That way the reputation of the journal is more likely to remain intact. Note that even heterodox journals have a version of this. They develop their own consensus within their heterodox school, which is then enforced by peer reviewers.
So long as the consensus is right, there’s no problem. But what if the consensus is wrong? Reviewers then reject papers on the basis of them not conforming with the prevailing view, but these papers might have constructive contributions to make. I was trying to think of specific cases of this phenomenon, so below are a few examples that came to mind:
There is a literature related to the “cost-effectiveness of health interventions.” This is the research that deals with health metrics like “quality-adjusted life years” and “disability-adjusted life years.” I don’t want to get into all the technical aspects of this research, but there are probably thousands of articles in this literature and, frankly, it is not very clear what these scholars are measuring. What seems to have happened is that a few decades ago scholars came up with certain methods. Those methods were adopted for analytical convenience rather than for any strong theoretical reason. Research uncovered some (but probably not all) of the problems with the methods. No better alternatives seemed to be around. So scholars continued applying the methods despite the criticisms. Governments then picked up on the methods and began employing them. And now the practices have become entrenched, despite the fact that it’s still not very clear what the research is measuring.
A few years back, I also became interested in cost-benefit analysis. The theoretical underpinnings of cost-benefit analysis in particular interested me a great deal, so I began exploring the literatures on the “social discount rate” and the “value of a statistical life.” Despite the fact that regulations costing billions of dollars are routinely justified using the value of a statistical life, I was shocked at how sparse the early literature on it was relative to the literature on discounting. Some of the most influential research from the 1980s and earlier, which contributed to the government starting to use the metric for policy purposes, has very little theory, or at least you could say the theory is unsettled. The papers that do have some theory are not convincing (they basically assume that whatever some individual or group wants should be imposed on the rest of society).
The linear no-threshold dose-response model in risk assessment is another example where a practice got adopted, mostly for convenience, although there may also have been political motivations. Academics saw that the practice got through peer review, so they continued using it. Government agencies endorsed the model. That’s despite the fact that there was never really any theoretical or empirical basis for it. A model was needed, so this one got selected. Pretty soon the practice got entrenched, and it was hard to switch to another set of practices.
In each of these literatures, if you were to try to use alternative methods, the reviewers would most likely shut you down. This is true despite there being alternatives out there that could be used. So long as you stick with the practices that have become conventions in these communities, you’ll get published. Try to deviate and you’ll get rejected.
To be clear, I am much more concerned about cases we might not be aware of than the examples highlighted here. At least in these cases, there’s some sense in these literatures that the metrics aren’t perfect. It’s the cases where we think the prevailing view is correct, but it’s actually wrong, that truly scare me.
Peer review is not a corrective
In economics, it is difficult to write an article that is just a critique, based on a logical argument, of methods economists use. Most likely the paper won’t even get reviewed; it will be desk rejected. Reviewers or the editor will say, “You have not said anything new here.” But this is not because nothing new has been said. Instead, it is because there is no model, no regression, no Monte Carlo simulation, no Table 5-1, no Appendix A-3, and so on. A logical argument that starts from premises and reaches a conclusion is not considered scientific. I’ve written entire empirical papers just so I could get a few paragraphs of logical argument into a published article. It doesn’t seem like this should be the way published research works.
In this sense, Ben Southwood is exactly right when he says that peer review encourages research to be “novel empirically but not theoretically.” Entire literatures in economics are composed of articles full of tables, math, and equations. They look like science. Journalists, and even other economists, assume the research must be right because it’s hard to get through the peer review process, and because there are a lot of numbers to back things up. But logically, what is being done doesn’t always add up. Problematic logic persists through path dependence, and peer review reinforces this path dependence because peer review reinforces whatever the prevailing practices happen to be. If the current paradigm is wrong, peer review doesn’t offer a way to a better paradigm.
A failure of logic
I recently finished reading John Stuart Mill’s autobiography, and one of the things that struck me in the book was his statement that one of the most important lessons he ever learned came from studying elementary logic. I’ve had the exact same thought myself. A course I took while getting my PhD required a review of the principles of logic. It provided some of the most useful material I’ve ever been exposed to in my life.
It took me several years to understand why “intellectuals” might embrace certain arguments that fail from a logical perspective. What I came to realize—and this is my theory, I recognize I am generalizing somewhat—is that the kinds of thinking skills academics excel at are not the kinds that involve looking at a position, deconstructing it into pieces, and then going back to find the origins of its underlying assumptions and question them. Rather, what academics excel at is reading something in a book and then applying a principle they learned in one context to another, similar context. Consider that virtually all academics were star students. Their teachers fed them information and then they excelled at repeating that information back to their teachers on exams.
I have a tremendous amount of respect for academics. The papers published in the “Quarterly Journal of Economics” are masterpieces. I can’t do what those economists do. But academics are also not good at some things, and I’ve come to believe that logic is one of those things.
We certainly need the kinds of engineering-type skills academics have. With those skills, someone learns a procedure and then goes out into the world and applies it. If we didn’t have people like that, we wouldn’t have buildings, bridges, cell phones, or so many other practically useful aspects of modern life. But we also need critical reasoning skills. Here, not only would I say that academics have no particular advantage, but they may even be at a disadvantage relative to people in other fields. Business people are far better at critical reasoning than academics are. Business people are constantly being thrown new challenges to respond to on the fly, often without any precedent to draw upon for how to solve the particular problem. They just have to figure out a solution, often on a deadline. This is a skillset academics have difficulty with, and it’s largely institutional factors and the way we educate people that have led to this situation, in my view.
Peer review reinforces existing hierarchies
Although peer review is supposed to be anonymous in many cases, very often reviewers know who the authors are and authors know who the reviewers are. Intellectual communities tend not to be very big. Many journals don’t have time to figure out who is an expert on a topic, given the volume of papers they receive, so they ask authors to recommend reviewers for them.
For professional career reasons, it makes sense to play nice with reviewers. Professors need publications to get tenure. Antagonizing reviewers could hurt future publication prospects, since you might get the same reviewer again at some point. A reviewer might be someone who could offer you a position in their department, an editorial slot at a journal, or speaking and publishing opportunities. Reviewers know that if they help guide a paper to publication, the author might be a reviewer on one of their papers down the line and reciprocate the favor.
All of these incentives push reviewers and authors to work together. Reviewers and authors converge toward a set of practices where everyone feels it’s “safe” to let the paper be approved. This tends to support existing hierarchies and practices. It’s the opposite of a “challenge culture” where predominant practices are routinely put to the test. The ideas of the older generation of researchers also tend to get venerated, simply by virtue of those researchers holding influential positions at universities and journals. Support the ideas of those at the pinnacle of the profession and your articles are likely to get published. Challenge them, and you will get fewer publications, be denied tenure, and so on.
Academia is not like a business, and that’s bad
Here’s a silly metaphor for how I tend to think about the difference between academia and the private sector. When I was in my early 20s I used to work as a “bar back” at various bars in New York City. My job was to stock coolers full of beer, replenish ice when it ran out, wash glasses, and do other random tasks that were needed around the bar. If there was a problem like the toilet got clogged, the bathroom needed paper towels, or (God forbid) someone threw up on the floor, it was my job to address the problem.
If I just did nothing when there was a problem, bad things would happen, as these examples make clear. There was simply no option to procrastinate and let the problem fester until it had grown completely out of control. If I behaved that way, I would get fired and the customers would never come back.
Academia often works the exact opposite way. If you point out that the metaphorical toilet is clogged or the bathroom is out of paper towels, people will get angry at you. You won’t get published. You won’t get tenure. You won’t be made assistant editor of the journal. You won’t be given speaking engagements or notable political positions. Instead, you’ll get punished for pointing out what in some instances is obvious—that the toilet is clogged. Pretty soon the toilet has overflowed and there’s no option but to call a plumber.
Recommendations
Enough of me on my soapbox complaining about academia and peer review. What can be done about this? Here are a few suggestions:
Peer reviewers should not be gatekeepers. A single editor should have the discretion to allow or not allow a paper to be published. The editor should, if they want, rely on feedback from reviewers in this process. But authors should be allowed to follow or ignore the advice of reviewers at their discretion. It should not be controversial to say some version of the following to an editor: “I don’t like this idea and I’m not going to do it.” It’s the author’s paper, after all. This would also help address the issue of papers getting longer and longer over time. Reviewers can play a role, but they shouldn’t have the authority to veto a paper unless an editor grants it.
To relieve the workload on editors, it could make sense for them to get columns that they oversee in a journal. Within their column, they can publish whatever they want by whoever they want following any process they want. The editor is ultimately responsible for the quality of the research. If second-rate research comes out in their column, their reputation takes a hit. Ben correctly points out that innovation was higher in the past when peer review standards were laxer. I think it would be fine to relax peer review standards but hold specific individuals more accountable when poor (or fraudulent) research comes out on their watch.
The concept of the academic conference is outdated. There is very little challenging of ideas going on, and very little useful feedback is obtained at presentations. These conferences are more about networking. Perhaps that’s a good enough reason to have conferences, but I think there’s a missed opportunity here. I’ve been part of “manuscript conferences” where authors working on a book get feedback from a group. I thought these were very productive. Maybe there’s a way to extend that idea to academic conferences, where people attend fewer sessions, but everyone has read the draft article beforehand and is ready to provide deep-dive feedback.
Ultimately, I see the problems with the publication process in academia as institutional. If there weren’t this “play along to get along” culture, if we instead had a challenge culture, the existing peer review process might work much better, even without being modified. Eliminating tenure and eliminating publication requirements for professors would both work in this direction. Then professors could spend more time searching for the truth, and less time kissing up to elites in their discipline.
There needs to be more experimentation in how we educate people. Students should be thrown into new environments where they are outside their comfort zone and then need to figure out solutions to problems they have not been exposed to before. The current model of presenting students with a problem, walking them through the solution, and then asking for the solution again on a test leads to dull, uncreative thinkers. More required study abroad programs, more online learning, and more self-taught independent study courses are a few ideas. I also think courses in philosophy, logic, and the humanities in general help train people to be better thinkers.
Conclusion
Peer review is most useful in cases where papers are on uncontroversial topics and apply standard methods that are well supported and logical. It works least well when papers challenge an existing consensus and the consensus is entrenched and unsupported by reason or evidence.
The problems with peer review are deep-rooted and ultimately lie in institutional incentives and how we educate people. Tenure and “publish or perish” contribute to reinforcing existing hierarchies at the expense of the pursuit of truth. We also need to train people who can think creatively and come up with solutions to problems that don’t have pre-existing solutions. This can be an emotionally taxing way of educating people, but it would generate people who know how to think. More than anything, this is what the world needs now to solve our most pressing problems.