Abstract
Background: A number of medical journals have developed policies for accelerated publication of articles judged by the authors, the editors or the peer reviewers to be of special importance. However, the validity of these judgements is unknown. We therefore compared the importance of articles published on a “fast track” with those published in the usual way.
Methods: We identified 12 “case” articles — 6 articles from the New England Journal of Medicine that were prereleased on the journal's Web site before publication in print and 6 “fast-tracked” articles from The Lancet. We then identified 12 “control” articles matched to the case articles according to journal, disease or procedure of focus, theme area and year of publication. Forty-two general internists rated the articles, using 10-point scales, on dimensions addressing the articles' importance, ease of applicability and impact on health outcomes.
Results: For each dimension, the mean score for the case articles was significantly higher than the mean score for the control articles: importance to clinical practice 7.6 v. 7.1 respectively (p = 0.001), importance from a public health perspective 6.5 v. 6.0 (p < 0.001), contribution to advancement of medical knowledge 6.2 v. 5.8 (p < 0.001), ease of applicability in practice 7.0 v. 6.5 (p < 0.001), potential impact on health outcomes 6.5 v. 5.9 (p < 0.001). Despite these general findings, in 5 of the 12 matched pairs of articles the control article had a higher mean score than the case article across all the dimensions.
Interpretation: The accelerated publication practices of 2 leading medical journals targeted articles that, on average, had slightly higher importance scores than similar articles published in the usual way. However, our finding of higher importance scores for control articles in 5 of the 12 matched pairs shows that current journal practices for selecting articles for expedited publication are inconsistent.
A number of medical journals have developed policies for accelerated publication of articles describing findings that are judged by the authors, the editors or the peer reviewers to be particularly important and deserving of rapid dissemination. In the case of the New England Journal of Medicine, accepted articles that have “immediate clinical implications”1,2 are occasionally prereleased on the journal's Web site (www.nejm.org) before their official publication date, and an early press release is issued to the media. A recent high-profile example of a prereleased article is that of the RALES study that assessed the efficacy of spironolactone for congestive heart failure.2,3
The Lancet,4,5,6 the British Medical Journal 7 and CMAJ 8 are other medical journals that have adopted mechanisms for occasionally accelerating the peer review and printing process for articles judged to present especially important research findings needing urgent dissemination. In these journals the expedited process is referred to as “fast-track” publication. A number of other high-profile journals, including the Journal of the American Medical Association,9 Science10 and Nature,11 have also adopted mechanisms for expedited publication, with Nature recently prereleasing on its Web site 2 articles on the molecular biology of anthrax infections.12,13
Despite the existence, and increasing profile, of these journal publication policies, no study has formally assessed the importance, methodological quality and general visibility of articles published in an accelerated manner relative to articles published in the usual manner. In this article we address these questions by asking a group of physicians to rate the importance, quality and visibility of a selection of prereleased articles from the New England Journal of Medicine and fast-tracked articles from The Lancet.
Methods
We used the general framework of a case–control study to address these research questions. We used “case” to refer to articles that were either prereleased on the journal's Web site or fast-tracked and “control” to refer to articles published in the usual manner.
Identification of articles
For case articles from the New England Journal of Medicine, we wrote to the journal's editorial office in late September 1999 to obtain a full list of articles that had been prereleased online since the journal established its Web site. Among these 11 articles were 2 editorials,14,15 which were excluded because we sought to study only original research articles. One research article16 was excluded because it was prereleased online only after a news agency broke the journal's news embargo. Of the remaining 8 articles, 3 on new treatments for cervical cancer17,18,19 were prereleased together. To avoid burdening study participants with 3 case articles on the same theme, we randomly selected only 1 of these17 for inclusion in our study. We were thus left with 6 case articles from the New England Journal of Medicine.
For case articles from The Lancet, we selected 6 fast-tracked articles published between 1997 and 1999 in order to have the same number of case articles from both journals. We randomly selected individual issues from a listing of weekly issues of The Lancet published since 1996 and then scanned each issue for fast-tracked articles. If the issue contained no fast-tracked articles, another issue was selected. If 1 fast-tracked article was published in the selected issue, the article was included in our study. When more than 1 fast-tracked article was published in an issue, we wrote the citations of each article on folded pieces of paper and randomly selected only 1 article per issue.
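The issue-by-issue selection described above can be sketched in a few lines of code. This is an illustration only: the function name select_fast_tracked and the data layout are hypothetical, and the sketch assumes that issues were drawn without replacement, which the description does not state.

```python
import random

def select_fast_tracked(issues, target, rng):
    """Pick up to `target` fast-tracked articles, at most one per issue.

    `issues` is a list of lists; each inner list holds the fast-tracked
    articles found in one weekly issue (empty if the issue had none).
    Assumption: issues are drawn at random without replacement.
    """
    order = list(issues)
    rng.shuffle(order)  # random order of issues to inspect
    selected = []
    for fast_tracked in order:
        if len(selected) == target:
            break
        if not fast_tracked:
            continue  # no fast-tracked article: move on to another issue
        # One article in the issue: take it. Several: draw one at random,
        # mirroring the folded-paper draw.
        selected.append(rng.choice(fast_tracked))
    return selected

if __name__ == "__main__":
    # Hypothetical listing of issues, for illustration only.
    demo_issues = [[], ["article A"], ["article B", "article C"], [], ["article D"]]
    print(select_fast_tracked(demo_issues, target=2, rng=random.Random(1)))
```

Passing an explicit random.Random instance makes the draw repeatable, which a paper-based draw is not.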
For control articles, one of us (W.A.G.) searched MEDLINE to identify 12 articles matched to the 12 case articles by journal of publication, condition or treatment of focus, main theme of the article (e.g., therapy, prevention, adverse effects) and, if possible, calendar year of publication. In most instances, only 1 candidate article would meet all of these criteria. When 2 or more potential control articles were identified, we randomly selected 1 for study. If no matching articles were found in the same year of publication, we broadened the fourth matching criterion to allow articles published within ±1 calendar year. Although there are statistical advantages to selecting more than 1 control per case in case–control studies, we limited the number of control articles to 1 per case because we wanted to limit the total number of articles to be reviewed and because we frequently found only 1 control article that matched the case article well.
Rating scale
We developed an importance rating instrument that incorporated many of the dimensions addressed by Lawrence and associates20 in their recent work assessing the importance of published articles. Our instrument asked study participants (described in the next section) to rate articles on a 10-point scale that addressed each of the following 6 dimensions: importance to their own clinical practice (i.e., relevance); importance to clinical practice in general (allowing for the possibility that some respondents would not care for patients with the condition in question); importance from a local, national or international public health perspective; importance to the general advancement of our collective medical knowledge (i.e., knowledge context); ease with which the new information described in the article can be applied in daily practice (i.e., ease of applicability); and the impact that the new information described in the article is likely to have on the health outcomes of those affected by, or at risk for, the disease or condition addressed by the article. To assess the articles' general “visibility,” the instrument also asked respondents to indicate if they either had heard of the article in question or had read it.
Potential responses for the first 4 dimensions ranged from “unimportant” (0) to “extremely important” (10). For the question on ease of applicability, potential responses ranged from “very difficult” (0) to “very easy” (10). For the question on impact on health outcomes, potential responses ranged from “no impact” (0) to “huge impact” (10).
Rating process
General internists affiliated with 4 academic institutions (University of Calgary, University of Alberta, Dalhousie University and the University of Lausanne) were invited to participate in this study as raters of article importance. We considered general internists to have a broad enough perspective to rate the importance of the wide variety of articles that we compiled. A total of 76 general internists were sent an email with a brief message inviting them to participate in a study “assessing the importance of articles recently published in the New England Journal of Medicine and The Lancet”; 42 agreed to participate. A priori, we projected a need for 43 participants to allow 90% power to detect a difference in scores of 1 point, assuming a standard deviation of 2 points. With 42 participants, we ended up having more than 90% power to detect 1-point differences because the standard deviations of scores were considerably less than 2 points.
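The projected sample size can be approximated with the usual normal-approximation formula for a paired comparison. The sketch below is illustrative and rests on assumptions not stated in the text: a two-sided alpha of 0.05, and a 2-point standard deviation that refers to the within-rater differences between case and control scores. The function name paired_sample_size is hypothetical.

```python
import math
from statistics import NormalDist

def paired_sample_size(delta, sd, alpha=0.05, power=0.90):
    """Approximate number of raters for a two-sided paired comparison.

    delta: smallest difference in mean scores worth detecting
    sd:    assumed standard deviation of the paired differences
    Uses the normal approximation n = ((z_alpha + z_beta) * sd / delta)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the power
    n = ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)  # round up to a whole participant

if __name__ == "__main__":
    print(paired_sample_size(delta=1.0, sd=2.0))
```

Under these assumptions the formula gives about 42.03, which rounds up to 43, in line with the projection reported above. The exact method used for the original projection is not described here.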
Questionnaires and articles (24 in total) were mailed to the general internists who volunteered to participate in the study. Pairs of articles were presented to respondents consecutively in a pile, with the order of case and control articles alternating between pairs (i.e., case before control in one pair, control before case in the next). Aside from being aware that this study assessed the importance of published articles, the participants were blinded to the study's focus. Successful blinding was verified when we collected completed questionnaires: none of the respondents questioned knew that the study was assessing the issue of expedited publication.
As an ancillary assessment of each article's importance, we used the Science Citation Index to determine citation counts for each of the 24 articles. These citation counts were performed in December 1999.
Methodological quality
Two methodological reviewers (F.A.M. and J.B.W.) blinded to the study's objectives rated the methodological quality of the selected articles using a quality scale described by Downs and Black.21 The scale has established reliability and criterion validity.21 We chose this quality scale because it can be applied to some types of observational studies as well as to clinical trials. As originally described, the scale yields scores ranging from 0 to 31. We modified one of the scale's items, a complex 5-point question on study power, to a simpler 1-point item. For our study, therefore, potential scores ranged from 0 to 27. We were able to apply the scale to 9 of the 12 pairs of articles; the scale could not be applied to 3 pairs because one or both of the articles were case reports or case series, study types to which the scale is not applicable.
Analysis
We used a paired t-test to calculate the statistical significance of differences between mean scores for case and control articles. Our analyses assessed the differences in mean scores for each dimension, averaged across articles, as well as the differences in mean scores for each article, averaged across dimensions. Parametric statistical tests were appropriate because scores tended to be normally distributed.
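For readers unfamiliar with the paired t-test, the following minimal sketch computes the test statistic from matched scores. The scores are invented for illustration and are not study data; the function name paired_t_statistic is hypothetical. The p value would then be read from a t distribution with the returned degrees of freedom, a step omitted here.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(case_scores, control_scores):
    """Return (t, degrees of freedom) for matched case and control scores."""
    if len(case_scores) != len(control_scores):
        raise ValueError("scores must be paired one to one")
    diffs = [case - control for case, control in zip(case_scores, control_scores)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least 2 pairs are needed")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1

if __name__ == "__main__":
    # Hypothetical ratings from 4 raters, for illustration only.
    case = [7, 8, 6, 9]
    control = [6, 6, 5, 7]
    t, df = paired_t_statistic(case, control)
    print(f"t = {t:.3f}, df = {df}")
```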
Results
Table 1 lists the titles and key matching elements of the pairs of articles.3,17,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43 In most instances the case and matched control articles addressed very similar topics. However, in 2 instances the absence of other articles on the theme addressed by the case article necessitated a broadening of search themes to identify a matching control article. In the first instance (pair B) the case article from The Lancet was a randomized controlled trial of ultrasonographic surveillance versus early surgery for abdominal aortic aneurysms.24 Because of the absence of other Lancet articles on aortic aneurysms or aortic surgery, the search theme was broadened to vascular surgery in general. With this broader search, we identified a control article describing a randomized controlled trial of a vascular surgical procedure, transjugular-intrahepatic-portosystemic shunt, versus endoscopy plus propranolol for the prevention of variceal bleeding.25 In the second instance (pair J) the case article from the New England Journal of Medicine was a randomized controlled trial of neostigmine for the treatment of colonic pseudo-obstruction.39 We were unable to find a matching article on either bowel obstruction or neostigmine, so we had to broaden the search theme to capture all colorectal diseases. This broader search allowed us to identify an article describing a randomized controlled trial of the treatment of chronic anal fissure.40
Table 2 presents the characteristics of the general internists who rated the articles. A majority of the respondents had served as manuscript reviewers in the year before the study, with 14% having done so at least 5 times. Aside from being general internists, 55% of the respondents declared focused interest or special expertise in a variety of clinical and academic areas, such as hypertension, clinical pharmacology, venous thromboembolic disease, clinical epidemiology and medical education.
Importance ratings
Table 3 presents the mean scores across all case and control articles for each of the dimensions assessed. For every dimension the case articles had a significantly higher mean score than did the control articles. The responding physicians often felt that the case and control articles were not of great importance to their personal practices (mean scores 3.3 and 2.9 respectively), but they did acknowledge the importance of the case and control articles to clinical practice in general (mean scores 7.6 and 7.1 respectively).
Table 4 presents the mean scores across all 6 dimensions for each of the case–control pairs of articles studied. This table therefore presents the mean rating for importance, applicability and impact on health outcomes combined for each article studied. Despite the fact that the mean scores were generally higher for the case articles than for the control articles in each dimension (Table 3), there were 5 instances (pairs A, C, D, H and K) in which the mean scores across dimensions were higher for the control article than for the case article (Table 4). The difference in favour of the control article was statistically significant in 1 instance (pair K, p = 0.004). Scores were higher for the case article in the remaining 7 pairs, and the difference in favour of the case article was statistically significant for 5 of these (pairs B, E, F, I and L, p < 0.05).
There was a greater tendency for the case articles than the control articles to have been previously heard of or read by the respondents. On average, the respondents indicated that they had previously heard of 4.4 (standard deviation [SD] 1.6) of the 12 case articles compared with 3.5 (SD 2.0) of the 12 control articles (p = 0.002). They indicated that they had previously read 2.9 (SD 1.5) of the case articles compared with 1.9 (SD 1.3) of the control articles (p < 0.001).
Citation counts
As of December 1999 the case and control articles had been cited in the medical literature a mean of 29.0 (SD 52.1) and 32.8 (SD 53.3) times respectively. However, the mean number of months from publication to the time of the citation count was not equivalent for the case and control articles (12.8 months v. 17.2 months). Taking time since publication into account, we calculated the mean number of citations per month (for each article and then averaged across articles) to be 1.8 for the case articles and 1.5 for the control articles (p = 0.44). For articles published in or before December 1998, the mean number of citations per month was 2.1 and 1.9 respectively (p = 0.84).
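The adjustment for time since publication can be illustrated as follows. The sketch computes a monthly citation rate for each article and then averages the rates, as described above. The numbers are invented for illustration and are not study data, and the function names are hypothetical.

```python
from statistics import mean

def mean_citations_per_month(articles):
    """Average of per-article monthly citation rates.

    `articles` is a list of (citation_count, months_since_publication).
    """
    rates = [citations / months for citations, months in articles]
    return mean(rates)

def pooled_citations_per_month(articles):
    """Total citations divided by total months (shown for contrast only)."""
    total_citations = sum(citations for citations, _ in articles)
    total_months = sum(months for _, months in articles)
    return total_citations / total_months

if __name__ == "__main__":
    # Hypothetical articles: (citations, months since publication).
    demo = [(24, 12), (10, 20)]
    print(mean_citations_per_month(demo))    # per-article rates, then averaged
    print(pooled_citations_per_month(demo))  # pooled rate, a different quantity
```

Averaging per-article rates gives each article equal weight, whereas the pooled rate gives more weight to articles that have been in print longer.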
Methodological quality
The studies' methodological quality scores were equivalent for the 9 pairs of articles to which the 27-point rating scale could be applied. Reviewer 1 (F.A.M.) assigned mean scores of 18.9 (SD 4.6) and 20.8 (SD 3.5) for the case and control articles respectively (p = 0.34). Reviewer 2 (J.B.W.) assigned mean scores of 20.2 (SD 6.1) and 20.2 (SD 5.3) respectively (p = 1.0). Despite demonstrated reliability of the scale developed by Downs and Black,21 and general agreement between the 2 reviewers in the global finding of no differences in methodological quality between the 2 groups of articles, we found that the quality scoring system was only modestly reproducible for individual articles, with a Pearson correlation coefficient between reviewers of only 0.43 (p = 0.13).
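The agreement between reviewers reported above is a Pearson correlation of their article-level scores. A minimal sketch of that calculation follows; the scores are invented for illustration and are not study data, and the function name pearson_r is hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)

if __name__ == "__main__":
    # Hypothetical quality scores from 2 reviewers for the same 5 articles.
    reviewer_1 = [18, 22, 15, 25, 20]
    reviewer_2 = [20, 21, 14, 19, 24]
    print(round(pearson_r(reviewer_1, reviewer_2), 2))
```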
Interpretation
Our findings indicate that the New England Journal of Medicine and The Lancet were successful in selecting articles for accelerated publication that were, on average, more important and applicable than the articles published in the usual manner. However, the differences were modest. Our results also demonstrated 5 instances in which the matched control article was considered to be of similar or greater importance than the expedited article.
An optimistic interpretation would be that current publication practices are supported by the modest differences in favour of articles given accelerated publication or dissemination. We suggest, in contrast, that our findings call into question current publication practices. Journals in some instances are not expediting the publication or release of important articles, and in other instances are selecting relatively less important articles for expedited publication. In this regard, general internists assigned relatively low mean scores across multiple dimensions (Table 4) for pairs C, D, E and K. Furthermore, the differences in mean scores for the case articles versus the control articles were less than 1 point for all dimensions (Table 3). Perhaps not surprisingly, we did not find any differences between the case and control articles in average study quality or citation counts, but we did find evidence of higher visibility for the prereleased and fast-tracked articles, which may or may not be causally linked to the expedited publication process.
What might explain our findings? The policies for expedited publication of the New England Journal of Medicine1,2 and The Lancet5,6 do not appear to involve the systematic screening of all articles as potential candidates. Rather, the process appears to be activated only when the authors, editors or peer reviewers specifically request that accelerated publication be considered on the basis of their subjective assessments of importance, which may or may not be explicitly defined. Perhaps recognizing this reality, the editors of this journal recently acknowledged, in an editorial introducing fast-tracking at CMAJ,8 that they were uncertain whether they could “reliably discern which [article] merits acceleration, or which is likely to be genuinely important in the long run.” When considered together with our findings, such expressions of uncertainty ought to incite journal editors to consider developing a unified approach to expedited publication, akin to the consensus positions that have recently been adopted on issues of authorship,44 reporting of clinical trials45 and industry sponsorship.46
One potential limitation of our study is that we asked general internists to rate the importance of the articles. Ratings might have been different had we asked subspecialists to rate the importance of articles in their fields (e.g., obstetricians for the articles on amniocentesis32 and chorionic villus sampling33). We decided to use internists because we recognized that it would be extremely difficult to involve different groups of subspecialists for each subject area represented by these articles. Furthermore, general internists have a broad enough perspective to rate the importance of the wide variety of articles compiled for this study. A second potential limitation of this study is that the clinimetric properties of the scale used by the physicians are not established, since the scale was developed specifically for this study. However, we found evidence for the scale's criterion validity in our data: there was a relatively strong positive correlation between articles' importance scores and their monthly citation frequencies (Pearson r = 0.55, p = 0.006). A third limitation is that there were 2 case articles for which we were unable to find control articles with close subject matches (pairs B and J). However, when we excluded these 2 pairs, we found that the results were essentially unchanged. Although the assessment of methodological quality was not the primary focus of our study, a fourth limitation is that the methodological quality scale that we used proved to be suboptimal in the reproducibility of quality scores assigned to individual articles. However, the 2 reviewers nonetheless agreed in their finding of no significant differences in methodological quality between case and control articles. Lastly, we acknowledge that the selection of the control articles was performed by only 1 of us (W.A.G.) and may thus not be replicable. However, we would argue that, even if this person had embarked on a targeted, nonrandom, and even biased, search for important control articles, our study would nonetheless prove an important point: articles published in the usual way are occasionally judged to be more important than prereleased or fast-tracked articles from the same journal.
Despite these acknowledged limitations, our study does generate useful information. Our results lead us to conclude that policies for expedited publication are, on average, targeting important articles and may be contributing to the visibility of research findings. However, journals now need to find ways to consistently rate the importance of every submitted article so that all important articles can be objectively considered for accelerated peer review and publication. We hope that this study will stimulate dialogue among the editors of journals that offer accelerated publication.
Footnotes
This article has been peer reviewed.
Contributors: Dr. Ghali designed the study, recruited subjects (Calgary), compiled and analyzed data, and wrote the manuscript. Dr. Cornuz contributed to the study design, reviewed study materials, recruited subjects (Lausanne), assisted in compilation of data and provided critical comments on the manuscript. Dr. McAlister conducted detailed methodological quality reviews of selected articles, recruited subjects (Edmonton) and provided critical comments on the manuscript. Dr. Wasserfallen conducted detailed methodological quality reviews of selected articles and provided critical comments on the manuscript. Dr. Devereaux arranged for compilation of article citation counts, recruited subjects (Halifax) and provided critical comments on the manuscript. Dr. Naylor contributed to the study design and provided critical comments on the manuscript.
Acknowledgements: We thank the 42 physicians who volunteered their time to rate the articles for this study: University of Calgary — M. Mintz, S. Ernst, J. Schaefer, J. Walsh, R. Dear, T. Pedersen, R. Hull, N. Khan, J. Mellor, N. Campbell and E. MacKay; University of Alberta — S. Majumdar, T.K. Lee, R. Padwal, G. Hrynchyshyn, B. Wirzba, N. Gibson, N. Kassam and B. Fisher; Dalhousie University — B. O'Brien, S. Workman, T. Dean and A. Borzecki; University of Lausanne — J.C. Luthi, L. Portmann, G. Pralong, F. Pralong, O. Lamy, J.B. Daeppen, A. Broccard, J. Ruiz, G. Waeber, P. Vollenweider, V. Mooser, M. Schapira, T. Buclin, D. Graf, J.M. Meier, A. Berger, P.A. Bart, P. Staeger and D. Beer.
Dr. Ghali was hosted for a 6-month sabbatical leave by the Department of Medicine and the Institute for Social and Preventive Medicine, University of Lausanne, Lausanne, Switzerland. Drs. Ghali and McAlister are supported by Population Health Investigator Awards from the Alberta Heritage Foundation for Medical Research. Dr. Ghali is also supported by a Government of Canada Research Chair. Dr. Devereaux is supported by a Heart and Stroke Foundation of Canada/Canadian Institutes of Health Research Fellowship Award.
Competing interests: None declared.