To the Editor,

Since the first report of a cluster of unexplained pneumonia cases in Wuhan, China, on December 31, 2019, there has been an avalanche of research publications on COVID-19 aimed at better understanding the virus and its impact. Using an automated approach, we sought to quantify the pace of publication and describe the characteristics of COVID-19 abstracts indexed in PubMed since the beginning of 2020.

We searched PubMed using R version 4.0.2 and the National Library of Medicine’s (NLM) E-utilities application programming interface (API) with the following query string: ((wuhan[All Fields] AND (“coronavirus”[MeSH Terms] OR “coronavirus”[All Fields])) AND 2019/12[PDAT]: 2030[PDAT]) OR 2019-nCoV[All Fields] OR 2019nCoV[All Fields] OR COVID-19[All Fields] OR SARS-CoV-2[All Fields]. All results from January 1, 2020 to November 2, 2020 were included. For each record, we extracted the date of publication, country, language, publication type, and journal name.
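As an illustration of the search step, the following is a minimal sketch of how such a query could be submitted to the E-utilities ESearch endpoint. It is written in Python rather than the R used for the analysis, and it is not the authors' code. The function name `build_esearch_url`, the use of the `mindate`/`maxdate` parameters to apply the study window, and the straightened quotation marks in the query are assumptions made for the example.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string from the letter, with typographic quotes straightened (assumption).
QUERY = (
    '((wuhan[All Fields] AND ("coronavirus"[MeSH Terms] OR "coronavirus"[All Fields])) '
    'AND 2019/12[PDAT] : 2030[PDAT]) '
    'OR 2019-nCoV[All Fields] OR 2019nCoV[All Fields] '
    'OR COVID-19[All Fields] OR SARS-CoV-2[All Fields]'
)


def build_esearch_url(term, mindate="2020/01/01", maxdate="2020/11/02", retmax=0):
    """Build an ESearch request URL (hypothetical helper, not from the letter).

    With retmax=0 the response carries the total hit count but no record IDs.
    Restricting by publication date via mindate/maxdate is an assumption about
    how the January 1 to November 2, 2020 window could be applied.
    """
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",  # filter on publication date
        "mindate": mindate,
        "maxdate": maxdate,
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_esearch_url(QUERY)
print(url)
```

The sketch only constructs the request URL; fetching it (and paging through results with `retstart`) would require a network call and adherence to NCBI's usage guidelines.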

A total of 57,263 articles were included in our analysis. Of these, 19,469 (34.0%) were ahead of print, 14,383 (25.1%) were e-published, and 23,411 (40.9%) were published in print at the time of data extraction. Over the 43-week period, a median of 1682 articles were published per week, with a peak of 2277 articles in the week of May 11 (Fig. 1). The United States accounted for the most publications (20,460 [35.7%]), followed by England (15,471 [27.0%]) and the Netherlands (4980 [8.7%]). Most publications were in English (56,114 [98.0%]), with small percentages in Spanish (379 [0.66%]), German (231 [0.40%]), and French (225 [0.39%]). The preprint servers medRxiv (609 [1.1%]) and bioRxiv (479 [0.8%]) were among the five sources with the most publications.
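The weekly figures depend on how publication dates are grouped into weeks. The following Python sketch shows one way to bin dates into Monday-to-Sunday weeks and derive a weekly median and peak. The helper names and the toy dates are hypothetical and are not the study data; the analysis itself was done in R.

```python
from collections import Counter
from datetime import date, timedelta
from statistics import median


def week_start(d):
    """Return the Monday that begins the Monday-to-Sunday week containing d."""
    return d - timedelta(days=d.weekday())  # weekday(): Monday == 0


def weekly_counts(pub_dates):
    """Count publications per week, keyed by the Monday starting each week.

    Weeks with no publications do not appear in the result.
    """
    return Counter(week_start(d) for d in pub_dates)


# Toy publication dates for illustration only (not the study data).
sample_dates = [
    date(2020, 5, 4),
    date(2020, 5, 11),
    date(2020, 5, 13),
    date(2020, 5, 17),
    date(2020, 5, 18),
]

counts = weekly_counts(sample_dates)
peak_week, peak_count = max(counts.items(), key=lambda kv: kv[1])
median_per_week = median(counts.values())

print(peak_week, peak_count, median_per_week)
```

Keying each week by its Monday keeps the bins aligned with calendar weeks regardless of the weekday on which the study period begins.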

Fig. 1
Weekly number of publications related to COVID-19 in PubMed from January 1, 2020 to November 2, 2020. Legend: The x-axis indicates the weeks from January 1, 2020 to November 2, 2020. Weeks are defined as Monday-to-Sunday periods, with x-axis tick marks indicating the date of the Monday starting each week.

The COVID-19 pandemic has been accompanied by an unprecedented rate of scientific publication that has overwhelmed frontline providers and the public health community. Our analysis found a total of 57,263 articles on COVID-19 in PubMed, compared with only 3386 articles on the influenza H1N1 pandemic during its initial 43-week period from April 20, 2009 to February 15, 2010 [1].

Using the NLM’s official search string in MEDLINE, which is updated daily, we were able to capture the latest articles and ahead-of-print articles in our analysis. However, limiting our search to a single database also restricted our analysis to mostly English-language articles, a majority of which were published in the US and a few European countries. Although we collected data on publication type, it was not characterized well enough to support meaningful interpretation. We also did not include all of the preprint servers in our search; however, PubMed now includes preprints from authors who are either affiliated with or supported by the National Institutes of Health as part of a new pilot program [2]. Preprints have become an increasingly popular avenue for researchers to share their findings before formal peer review and have emerged as significant drivers of discourse in the scientific community.

A number of free and easily downloadable article databases, notably from the Centers for Disease Control and Prevention and the NLM, have been created to help clinicians and researchers parse the plethora of information on COVID-19 [3]. However, these databases also contain thousands of articles, so it remains difficult to search efficiently for answers to specific queries. Because many questions regarding COVID-19 remain unanswered, additional research will be needed. At the same time, there is an urgent need to evaluate and prioritize the quality of the literature, which is being published at an extraordinary rate, to ease the burden on the consumers of this information who must make important medical and public health decisions in a constantly changing environment.