Unintended Consequences of Journal Ranking
I just read this paper, written by Björn Brembs and Marcus Munafò and entitled “Deep Impact: Unintended consequences of journal rank”. It was recently posted on the Computer Science arXiv (h/t Jordan Ellenberg).
I’ll give you a rundown on what it says, but first I want to applaud the fact that it was written in the first place. We need more studies like this, which examine the feedback loop of modeling at a societal level. Indeed this should be an emerging scientific or statistical field of study in its own right, considering how many models are being set up and deployed on the general public.
Here’s the abstract:
Much has been said about the increasing bureaucracy in science, stifling innovation, hampering the creativity of researchers and incentivizing misconduct, even outright fraud. Many anecdotes have been recounted, observations described and conclusions drawn about the negative impact of impact assessment on scientists and science. However, few of these accounts have drawn their conclusions from data, and those that have typically relied on a few studies. In this review, we present the most recent and pertinent data on the consequences that our current scholarly communication system has had on various measures of scientific quality (such as utility/citations, methodological soundness, expert ratings and retractions). These data confirm previous suspicions: using journal rank as an assessment tool is bad scientific practice. Moreover, the data lead us to argue that any journal rank (not only the currently-favored Impact Factor) would have this negative impact. Therefore, we suggest that abandoning journals altogether, in favor of a library-based scholarly communication system, will ultimately be necessary. This new system will use modern information technology to vastly improve the filter, sort and discovery function of the current journal system.
The key points in the paper are as follows:
- There’s a growing importance of science and trust in science
- There’s also a growing rate (x20 from 2000 to 2010) of retractions, with scientific misconduct cases growing even faster to become the majority of retractions (to an overall rate of 0.02% of published papers)
- There’s a larger and growing “publication bias” problem – in other words, an increasing unreliability of published findings
- One problem: initial “strong effects” get published in high-ranking journal, but subsequent “weak results” (which are probably more reasonable) are published in low-ranking journals
- The formal “Impact Factor” (IF) metric for rank is highly correlated to “journal rank”, defined below.
- There’s a higher incidence of retraction in high-ranking (measured through “high IF”) journals.
- “A meta-analysis of genetic association studies provides evidence that the extent to which a study over-estimates the likely true effect size is positively correlated with the IF of the journal in which it is published”
- Can the higher retraction error in high-rank journal be explained by higher visibility of those journals? They think not. Journal rank is bad predictor for future citations for example. [mathbabe inserts her opinion: this part needs more argument.]
- “…only the most highly selective journals such as Nature and Science come out ahead over unselective preprint repositories such as ArXiv and RePEc”
- Are there other measures of excellence that would correlate with IF? Methodological soundness? Reproducibility? No: “In fact, the level of reproducibility was so low that no relationship between journal rank and reproducibility could be detected.
- More about Impact Factor: The IF is a metric for the number of citations to articles in a journal (the numerator), normalized by the number of articles in that journal (the denominator). Sounds good! But:
- For a given journal, IF is not calculated but is negotiated – the publisher can (and does) exclude certain articles (but not citations). Even retroactively!
- The IF is also not reproducible – errors are found and left unexplained.
- Finally, IF is likely skewed by the fat-tailedness of citations (certain articles get lots, most get few). Wouldn’t a more robust measure be given by the median?
- Journal rank is a weak to moderate predictor of scientific impact
- Journal rank is a moderate to strong predictor of both intentional and unintentional scientific unreliability
- Journal rank is expensive, delays science and frustrates researchers
- Journal rank as established by IF violates even the most basic scientific standards, but predicts subjective judgments of journal quality
- “IF generates an illusion of exclusivity and prestige based on an assumption that it will predict subsequent impact, which is not supported by empirical data.”
- “Systemic pressures on the author, rather than increased scrutiny on the part of the reader, inflate the unreliability of much scientific research. Without reform of our publication system, the incentives associated with increased pressure to publish in high-ranking journals will continue to encourage scientiststo be less cautious in their conclusions (or worse), in an attempt to market their research to the top journals.”
- “It is conceivable that, for the last few decades, research institutions world-wide may have been hiring and promoting scientists who excel at marketing their work to top journals, but who are not necessarily equally good at conducting their research. Conversely, these institutions may have purged excellent scientists from their ranks, whose marketing skills did not meet institutional requirements. If this interpretation of the data is correct, we now have a generation of excellent marketers (possibly, but not necessarily also excellent scientists) as the leading figures of the scientific enterprise, constituting another potentially major contributing factor to the rise in retractions. This generation is now in charge of training the next generation of scientists, with all the foreseeable consequences for the reliability of scientific publications in the future.
The authors suggest that we need a new kind of publishing platform. I wonder what they’d think of the Episciences Project.