Keywords
COVID-19, Coronavirus, Communicable diseases, Bibliometric study
This article is included in the Research on Research, Policy & Culture gateway.
This article is included in the Emerging Diseases and Outbreaks gateway.
This article is included in the Coronavirus collection.
Coronaviruses are RNA viruses widely found among many mammal species, including human beings [1]. Although these viruses generally have low virulence, two epidemics caused by severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) are considered major public health events of the past two decades. The case fatality rates were 10% for SARS-CoV [2] and 37% for MERS-CoV [3]. In December 2019, a novel coronavirus emerged in Wuhan City, Hubei Province, China [4]. This outbreak was unique in terms of high pathogenicity and mortality compared to the earlier epidemics by coronaviruses [5]. Soon, cases of infection with the novel coronavirus were found outside Wuhan and eventually around the world. On January 30, 2020, the World Health Organization (WHO) declared the outbreak a public health emergency of international concern [6]. Later, on February 11, 2020, the WHO named the disease caused by the novel coronavirus “COVID-19”, a short form of “coronavirus disease 2019” [7]. With a growing number of new cases and increased mortality attributable to the COVID-19 pandemic [8], global health discourses among the scientific community, policymakers, and the general population are emphasizing what is known about this virus. Although it is known that COVID-19 is distinct from the diseases caused by SARS-CoV and MERS-CoV, the scientific knowledge on COVID-19 remains limited to the scope of recently published articles. It is essential to understand the evolution of emerging scientific knowledge on COVID-19 to inform further research as well as evidence-based policymaking.
Bibliometric analysis quantitatively examines the research progress of a topic and offers a comprehensive assessment of scientific research trends; it is widely used for mapping knowledge in different scientific disciplines [9–11]. As a research method, bibliometric study was used as early as 1917 by Cole and Eales, who studied the growth of publications in the field of comparative anatomy [12]. This approach was subsequently termed “bibliometrics” by the British scientist Alan Pritchard [13]. Over the years, bibliometric studies have been used for analyzing a topic or field emerging in the global knowledge landscape and evaluating the evolution of research over time [14–16]. More importantly, such studies provide critical insights on the most prolific authors, institutions, countries of affiliation, thematic change within the domain itself, co-citations, co-authorships, and holistic development of the field(s) of interest [9,13,16].
With a growing interest in COVID-19 related research across the globe, a bibliometric study may describe the current status of global research and provide meaningful insights for future research. An earlier bibliometric analysis evaluated the scientific literature on different coronaviruses [17]. However, no bibliometric analysis available to date focuses specifically on contemporary scientific development on COVID-19. This study aimed to address this knowledge gap by conducting a bibliometric analysis to evaluate the characteristics of the current body of literature on COVID-19, identify the prolific authors, institutions, and countries involved in COVID-19 research, and examine the evolution of key knowledge areas within COVID-19 related studies.
For this study, bibliometric data were collected from the Science Citation Index Expanded, Social Sciences Citation Index, and Emerging Sources Citation Index databases within the Web of Science (WoS) core collection. These databases are maintained by Clarivate Analytics, which offers the world’s leading scientific citation search and analytical information platform [18]. Collectively, the WoS collection provides enriched bibliometric data useful for citation analytics and for mapping the knowledge in a given domain of scientific research by examining its leading authors, institutions, and collaborating nations.
The following query was administered to retrieve COVID-19 related bibliometric data: “Novel coronavirus” OR “Novel coronavirus 2019” OR “2019 Novel coronavirus” OR “2019 nCoV” OR “COVID-19” OR “Wuhan coronavirus” OR “Wuhan pneumonia” OR “SARS nCoV” OR “SARS-CoV-2”. Considering the timing of the outbreak in late 2019, the search strategy was limited to 2019–2020 to retrieve publications on COVID-19 rather than earlier coronaviruses. All search fields, including topics, titles, and abstracts, were selected to ensure the sensitivity of the search strategy. The search was conducted on February 22, 2020, and updated for the last time on April 1, 2020. No restrictions on languages or publication types were applied at the search stage due to the low number of publications on this recent topic. The inclusion criteria applied during screening were as follows: a) journal articles published on the topic of COVID-19, b) publications in the English language, c) articles irrespective of their methodology, and d) studies published between January 1, 2019, and April 1, 2020. Articles were excluded if they conflicted with any of these inclusion criteria. The references of the retrieved articles were not evaluated; therefore, articles retrieved through the citation search are the only source of data in this bibliometric study.
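The search and screening logic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure run inside WoS: the record fields `language` and `published` are hypothetical names, and only criteria (b) and (d) are modeled, since criterion (a) is handled by the query itself and criterion (c) places no limit on methodology.

```python
from datetime import date

# Search terms exactly as listed in the query above.
SEARCH_TERMS = [
    "Novel coronavirus",
    "Novel coronavirus 2019",
    "2019 Novel coronavirus",
    "2019 nCoV",
    "COVID-19",
    "Wuhan coronavirus",
    "Wuhan pneumonia",
    "SARS nCoV",
    "SARS-CoV-2",
]


def build_query(terms):
    """Join quoted phrases with OR, mirroring the Boolean query in the text."""
    return " OR ".join(f'"{term}"' for term in terms)


def meets_inclusion_criteria(record):
    """Check language (criterion b) and publication window (criterion d).

    `record` is a hypothetical dict with keys 'language' and 'published'.
    """
    in_window = date(2019, 1, 1) <= record["published"] <= date(2020, 4, 1)
    return record["language"] == "English" and in_window


query = build_query(SEARCH_TERMS)

# Two made-up records used only to exercise the filter.
english_record = {"language": "English", "published": date(2020, 2, 15)}
french_record = {"language": "French", "published": date(2020, 2, 15)}

print(query)
print(meets_inclusion_criteria(english_record), meets_inclusion_criteria(french_record))
```

Keeping the terms in a list makes it straightforward to rerun the same query when the search is updated.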
After extracting bibliometric data from WoS, the citations were uploaded to RefWorks, a cloud-based citation management software (a freely available alternative is Mendeley). They were then screened against the criteria described earlier, and the included citations were imported into R (Version 3.6.1). Using this software, descriptive analyses were conducted to evaluate the characteristics and types of documents. The WoS metrics were also used to identify the ten most-cited articles in the literature, as well as the top ten authors and journals by number of published documents on COVID-19. In addition, co-authorship among all the authors in the bibliography was assessed, including how many authors were connected through documents they authored or co-authored.
Further, the affiliated institutions and countries of the respective authors were mapped using a network analysis approach. This set of analyses made it possible to evaluate the nature and magnitude of collaboration at the individual, institutional, and international levels and how such collaboration shaped the knowledge base on COVID-19. Keywords and texts in titles and abstracts within scientific documents were identified and evaluated using text-mining approaches with the shiny package in R (Version 3.6.1). At this stage, network analyses were conducted to assess the connectedness among those documents and related keywords. For the co-occurrence of authors, keywords, institutions, and countries, different thresholds were used to create visualizations of frequency distributions for each variable, whereas all entries within a variable were assessed against the same threshold to ensure equitable comparisons within the respective fields of analysis. The relational maps among the authors, institutions, countries, and common keywords were created using VOSviewer, a bibliometric tool for visualizing citation data. In this mapping process, networks were developed at different thresholds for authors (n = 3 documents per author), institutions (n = 7 documents per institution), countries (n = 1 document per country), and keywords (n = 3 co-occurring keywords). In addition, a multi-dimensional scaling approach was used in R, as stated earlier, to conduct a factorial analysis of the 50 most frequently occurring research terms in the bibliometric data. This allowed the construction of a conceptual structure map depicting hierarchical relationships among knowledge areas within the research landscape of COVID-19.
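The thresholding step behind such network maps can be illustrated with a short Python sketch. It assumes a simple full-counting scheme in which an edge weight is the number of documents two authors share. VOSviewer's own counting and normalization options are not reproduced here, and the author names are placeholders rather than study data.

```python
from collections import Counter
from itertools import combinations


def coauthorship_network(documents, min_docs):
    """Return (kept_authors, edge_weights) for authors with at least `min_docs` documents.

    `documents` is a list of author lists, one list per document.
    The weight of an edge is the number of documents two kept authors share.
    """
    # Count each author once per document.
    doc_counts = Counter(a for authors in documents for a in set(authors))
    kept = {a for a, n in doc_counts.items() if n >= min_docs}

    edges = Counter()
    for authors in documents:
        present = sorted(set(authors) & kept)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return kept, edges


# Toy records with hypothetical author names (not from the study data).
toy_documents = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author A", "Author B", "Author D"],
    ["Author C", "Author D"],
]

kept_authors, edge_weights = coauthorship_network(toy_documents, min_docs=3)
print(sorted(kept_authors))
print(dict(edge_weights))
```

The same function applies to institutions or countries by passing lists of affiliations instead of author names and changing `min_docs` to the threshold used for that variable.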
A total of 422 bibliometric records were included in this study, which were authored by 1652 authors, with 3.91 authors per document (Table 1). Most authors (n = 1581) contributed to multi-authored documents, and the mean number of citations received per document was 2.47. The largest proportions of the publications were articles (33.41%) and editorials (32.23%).
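The headline figures can be checked with simple arithmetic. The sketch below recomputes the authors-per-document ratio from the reported totals and back-calculates the document counts implied by the reported percentages. These implied counts are derived here for illustration and are not stated in the text.

```python
total_documents = 422
total_authors = 1652

# Ratio reported in the text as 3.91 authors per document.
authors_per_document = round(total_authors / total_documents, 2)


def implied_count(percentage, total):
    """Back-calculate the nearest whole count from a reported percentage."""
    return round(percentage / 100 * total)


# Derived values, not figures reported in the text.
implied_articles = implied_count(33.41, total_documents)
implied_editorials = implied_count(32.23, total_documents)

print(authors_per_document)
print(implied_articles, implied_editorials)
```

Under these inputs the ratio reproduces the reported 3.91, and the percentages correspond to roughly 141 articles and 136 editorials.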
In addition, the top ten articles based on the number of citations in WoS were identified (Table 2), which included genetic, epidemiological, and clinical studies on COVID-19. Among the authors, Mahase E. had the highest number of publications (n = 13), followed by Akhmetzhanov AR., Linton NM., Nishiura H., and Zhang W. with seven publications each. The top ten journals by number of published documents were also identified; these included the British Medical Journal (n = 47), followed by The Lancet (n = 37), Eurosurveillance (n = 22), Journal of Medical Virology (n = 22), and Intensive Care Medicine (n = 13).
A network map of co-authors who contributed to COVID-19 research was created at the threshold of 3 documents per author, which found 52 collaborating authors, as illustrated in Figure 1. Scattered zones show groups of collaborating authors, whereas connections between individuals and groups are plotted accordingly. Figure 2 shows the network of collaborating institutions that were affiliated with at least seven documents on COVID-19 research. This threshold identified 16 collaborating institutions, including Capital Medical University (number of documents, nd = 13; number of citations, nc = 237), Huazhong University of Science and Technology (nd = 18, nc = 167), and Wuhan University (nd = 15, nc = 154).
Further, the bibliometric records were analyzed for the contributing countries, and a network was developed using co-authorship among scholars from those nations. Any collaborating country with at least one publication was included in this map (Figure 3). Among all countries, China had the highest number of documents (n = 185), followed by the US (n = 68), the UK (n = 36), Italy (n = 23), and Canada (n = 23).
In this study, a visualization guided by quantitative evaluation of the co-occurrence of keywords was prepared, as depicted in Figure 4. A threshold of at least three co-occurrences per keyword was set to identify the most frequent research terms indexed in the literature, which revealed a total of 69 keywords. The most frequently co-occurring words included “coronavirus” (n = 69), “sars” (n = 47), “covid-19” (n = 44), “2019-ncov” (n = 43), “sars-cov-2” (n = 26), “pneumonia” (n = 25), “wuhan” (n = 18), and “outbreak” (n = 18).
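A frequency threshold of this kind can be sketched as follows. The sketch assumes that the threshold refers to the number of documents in which a keyword appears and that keywords are compared case-insensitively. Both are assumptions made for illustration, and the keyword lists are made up rather than taken from the study data.

```python
from collections import Counter


def frequent_keywords(keyword_lists, min_occurrences):
    """Rank keywords by the number of documents listing them, keeping those at or above the threshold."""
    counts = Counter()
    for keywords in keyword_lists:
        # Lowercase and deduplicate so a keyword counts once per document.
        counts.update({k.strip().lower() for k in keywords})
    return [(k, n) for k, n in counts.most_common() if n >= min_occurrences]


# Made-up keyword lists, one per document.
toy_keywords = [
    ["COVID-19", "Pneumonia"],
    ["covid-19", "Outbreak"],
    ["Covid-19", "pneumonia", "Wuhan"],
    ["pneumonia", "covid-19"],
]

ranked = frequent_keywords(toy_keywords, min_occurrences=3)
print(ranked)
```

Normalizing case before counting matters in practice, because indexed keywords such as “COVID-19” and “covid-19” would otherwise be tallied as separate terms.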
A factorial analysis was conducted among the leading 50 key terms in the bibliometric data using a multi-dimensional scaling approach (Figure 5). This analysis resulted in a dendrogram of repeatedly co-appearing keywords in hierarchical clustering, which highlights conceptual structures in the research field. The first cluster in this dendrogram (in blue) included research terms such as diversity, multiple sequence alignment, and sars-like coronaviruses. Another cluster (in red) comprised research terms related to the pathogenicity of the coronavirus outbreak, earlier outbreaks with other typologies, epidemiology, and diagnostic approaches. Both structures shared several common thematic areas, including zoonotic connections in COVID-19 epidemiology and genetic and molecular properties of interest in COVID-19 research.
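The grouping of co-appearing keywords into clusters can be illustrated with a small Python sketch. It is not the multi-dimensional scaling procedure used in this study: as a stand-in, it applies average-linkage agglomerative clustering to Jaccard distances between keywords, and the document ids assigned to each keyword are hypothetical.

```python
from itertools import combinations


def jaccard_distance(docs_a, docs_b):
    """Return 1 - |A & B| / |A | B| for two sets of document ids."""
    union = docs_a | docs_b
    return 1.0 - len(docs_a & docs_b) / len(union) if union else 0.0


def agglomerate(keyword_docs, n_clusters):
    """Average-linkage agglomerative clustering of keywords.

    `keyword_docs` maps each keyword to the set of document ids containing it.
    Merging stops when `n_clusters` clusters remain.
    """
    clusters = [frozenset([k]) for k in sorted(keyword_docs)]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(jaccard_distance(keyword_docs[a], keyword_docs[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


# Hypothetical mapping of keywords to the documents that list them.
toy_keyword_docs = {
    "genome": {1, 2, 3},
    "sequence alignment": {1, 2},
    "epidemiology": {4, 5, 6},
    "diagnosis": {5, 6},
}

toy_clusters = agglomerate(toy_keyword_docs, n_clusters=2)
print([sorted(c) for c in toy_clusters])
```

Keywords that tend to appear in the same documents merge first, which is the intuition behind reading a dendrogram branch as a thematic area.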
This bibliometric study and knowledge mapping identified contemporary scientific documents on COVID-19 from scholarly sources. The findings of this study reflect the recent scholarly growth of the global body of knowledge on COVID-19. Most documents had multiple authors from different collaborating institutions and nations, which highlights the productivity of scientific activities. Moreover, research keywords present in the bibliometric data reflect the complexity of the topic and the involvement of multiple disciplines such as virology, microbiology, infectious diseases, clinical medicine, public health, allied health sciences, social sciences, and other branches of knowledge. Such scholarly growth of the knowledge base may help in understanding the ontology and phenomenology of the new global health challenge imposed by COVID-19.
Notably, the collaboration among affiliated institutions and nations in COVID-19 research may illustrate the utility of global research collaboration, particularly for complex public health problems where multiple stakeholders from different institutions and contexts may offer diverse resources and competencies and address knowledge gaps more efficiently than individualistic approaches. Such collaboration can be profoundly challenging for low- and middle-income countries, such as nations in Africa and South Asia, which have suboptimal research capacities and a poor evidence base for making informed decisions on highly prevalent health problems [19].
Another challenge is a critical lack of technological infrastructure in such contexts, which reinforces the need to strengthen global collaborations for research and evidence synthesis. For example, resource-constrained contexts have less availability of and access to advanced technologies, which may limit their ability to conduct research requiring tools like deep learning or other computational approaches [20,21]. This may also restrict opportunities to substitute simulation for time-intensive lab-based research or to increase the speed and quality of research processes. This may be a reason for the lack of representation of studies from countries in South Asia, South America, and Africa. Future efforts should focus on strengthening research capacities in those contexts, which is essential to improve regional and global knowledge on persisting and emerging diseases affecting global populations.
It is also essential to acknowledge the need for global collaborations, as the magnitude of the problem necessitates a series of large-scale analyses, exchange of perspectives, knowledge synthesis, and translation of that knowledge to inform evidence-based policies and practices [22,23]. More importantly, increased collaboration in research is likely to facilitate trust and cooperation in developing scalable solutions globally, minimizing the cost and maximizing human benefits beyond borders [24,25]. Lessons learned from research collaboration can foster hope of reducing existing global health disparities, particularly in developing vaccines and other preventive solutions [26,27]. These aspects are critical for the overall development of COVID-19 related research and practice, as the current evidence on collaboration shows scattered growth of research groups, which may limit the true potential that collaborative efforts may offer in this scenario.
This study identified the top keywords that appeared in the scientific literature and demonstrated how they co-appeared across studies that draw their intellectual roots from earlier studies. Moreover, the conceptual structure derived from those keywords informs the current scenario of uncontrolled observations retrieved from global studies. Keywords are useful not only for retrieving studies from databases or topics within studies; they also reveal the scientometric themes underlying the information presented in a document [28,29]. In addition, these conceptual constructs may inform future scientific measures to define and distinguish sub-domains within the knowledge base on COVID-19. Furthermore, similar keywords appeared in multiple documents in this study, about one-third of which were original articles, which indicates the early stage of research. This early finding may offer critical directions for scientific development in this knowledge domain. This study also found more frequent appearances of epidemiological, genetic, and molecular biological keywords in COVID-19 studies, whereas some keywords indicated an emergence of zoonotic topics, including other animals related to the human food chain or ecology.
It is notable that neither the keyword co-occurrence analysis nor the conceptual structure mapping found a significant presence of social, economic, political, or cultural determinants of COVID-19 in the global research landscape. It is increasingly being recognized that neither disease nor health can happen in isolation from the complex web of those determinants of human lives [30–33]. The findings of this study highlight this gap, which necessitates further multi-sectoral research on how different determinants can be associated with higher or lower risks of COVID-19 among individuals or populations. In addition, programs and policies for addressing epidemic outbreaks may influence physical and psychosocial health outcomes in diverse population groups [34,35], which remains another potential area for future research. Moreover, there is a lack of research that may inform preventive measures like vaccinations, pharmacological interventions, clinical prognosis, and outcomes of COVID-19. Such studies may become available in the future, which would enrich future scientometric analyses and evidence mapping processes.
Another issue is that the existing literature mentions little about the psychosocial and economic consequences of COVID-19. A major public health crisis like COVID-19 can affect those aspects of life and create lasting problems among the affected populations [36,37]. Perhaps it is too early to estimate such impacts or get them published in indexed sources; doing so would need more research and timely communication across journals and other media. In addition to epidemiological and genetic studies, psychological, econometric, and social sciences research assessing those concurrent and future challenges should be prioritized to improve the knowledge base in those areas.
This study has several limitations that must be acknowledged in order to interpret the findings and to address those limitations through future research. First, this study used three databases from the WoS core collection, which may have included most studies in the domain but excluded studies that are indexed exclusively in other databases. This may affect the generalizability of the findings. Second, newly published studies may take some time to get indexed in WoS; such studies could have been found by searching individual journals that published articles related to COVID-19. A similar gap exists for preprints available on their respective servers, and insights from those articles are not reflected in this study. Research studies may also take time to get published as journal articles and to be indexed in the associated databases, which may further limit the extent to which the current literature reflects contemporary knowledge. Third, a bibliometric analysis provides an overview of the evolution of a knowledge domain and is methodologically different from the approaches used in clinical reviews. Such reviews may have different objectives and methods of synthesis, which were beyond the scope of this study. However, this study evaluated the knowledge evolution on COVID-19, which may have long-term impacts on the field of COVID-19 studies and future discourses on public health emergencies. The above-mentioned issues should be considered when using the findings of this study and when conducting future research and evidence synthesis on COVID-19 that addresses those challenges.
A public health emergency like the COVID-19 pandemic may affect different frontiers of human lives globally. To solve such problems, it is necessary to fully understand both the problem and the solutions that may address it. This need for knowledge is a fundamental force that keeps science alive and allows scientists to thrive in their research domains, bringing the best possible methods and materials to answer real-life questions. Solving a complex public health problem like COVID-19 needs robust knowledge generated through rigorous methods specific to each problem related to different dimensions of COVID-19 as well as the lives of millions of people around the world. This study provided a global bibliometric evaluation of COVID-19 related studies, which may facilitate ongoing and future research. Such academic and professional efforts in understanding and addressing COVID-19 will be informed by the knowledge base we have today, which will continue to evolve over time, enriching science and societies globally.
Open Science Framework: Bibliometric Analysis of COVID-19 Research, https://doi.org/10.17605/OSF.IO/SJB86 [38]. Registered on 11th May 2020, https://osf.io/65v2z
The author is thankful to Dr. Noor Al Quddus, Post-doctoral Researcher at the Mary Kay O'Connor Process Safety Center, Texas A&M University, for his valuable support and encouragement in conducting this study.
A previous version of this article is available on the preprint server SSRN: http://dx.doi.org/10.2139/ssrn.3547824.
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Emerging infectious diseases
Is the work clearly and accurately presented and does it cite the current literature?
Partly
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
References
1. Tran B, Ha G, Nguyen L, Vu G, et al.: Studies of Novel Coronavirus Disease 19 (COVID-19) Pandemic: A Global Analysis of Literature. International Journal of Environmental Research and Public Health. 2020; 17 (11).
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: COVID-19 research
Version 1 (18 May 2020): 2 invited reviewers.