Introduction

Clinical practice guidelines (CPGs) and position statements are widely adopted ways to disseminate medical knowledge and influence clinical practice. Methods for reliable guideline development are available.1,2 The Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach is one of the broadly accepted methodologies for minimizing the risk of bias.3-6 However, the risk of producing less reliable documents persists,7,8 which may translate into suboptimal management of patients. Such risk may be higher when guidelines are developed rapidly, as observed during the COVID-19 pandemic. To investigate this possibility, we decided: 1) to assess the methods used to formulate CPGs and position statements containing recommendations on the pharmacological treatment of SARS-CoV-2 infection, published during the early phase of the COVID-19 pandemic (before data from randomized controlled trials [RCTs] became available); 2) to identify recommendations for antiviral pharmacotherapy that were based on very low–quality evidence available in the early phase of the COVID-19 pandemic and to search for factors associated with the publication of such recommendations; 3) to provide suggestions that would facilitate the development of guidelines in the future when evidence of adequate quality is lacking.

Methods

We conducted a retrospective analysis of guidelines on COVID-19 management. Included documents were published before April 1, 2020; were written in English or had an English translation available; were published in journals or on the websites of scientific societies, regulatory bodies, or scientific institutions; considered antiviral therapies; and contained a recommendation (or recommendations) for clinicians. Recommendations were defined for the purpose of this study as: 1) a statement with recommendation, advice, suggestions, or tips; and / or 2) a statement that it is not possible to provide any recommendation because of the lack of data considered adequate. Adaptations of the existing guidelines were also included. Papers indexed as original articles or reviews were excluded.

We performed the search for relevant documents between May 15 and 22, 2020, using MEDLINE and Embase databases as well as Google search. An additional search was performed to detect if any update of the previously identified guidelines had been released within the 2 months after the publication of the first RCT on antiviral COVID-19 treatment.9 Details on the search strategy are presented in Supplementary material. Identified publications were screened based on the title and abstract (database search) or full-text documents (Google search). After duplicate removal, full-text documents were assessed for eligibility by 2 authors (FM and WL) independently (flowchart for study selection is presented in Supplementary material, Figure S1). Any discrepancies between evaluators during the study selection process were resolved by consensus.

The quality of guidelines was assessed using the Appraisal of Guidelines for Research and Evaluation II–Global Rating Scale (AGREE II–GRS) instrument.10 This scale includes 4 core items (process of development, presentation style, completeness of reporting, and clinical validity), each rated from 1 (lowest quality) to 7 (highest quality). Additionally, there is an overall assessment of guideline quality, similarly ranging from 1 (lowest quality) to 7 (highest quality), and 2 additional items based on users’ answers to questions whether they would recommend the guideline for use in practice and whether they would use a guideline of that quality to make their own professional decisions (both rated from 1 = strongly disagree to 7 = strongly agree). Each document was assessed by 2 appraisers and the mean scores were reported.10
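The averaging of the 2 appraisers' ratings described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `mean_appraiser_scores`, the short item labels, and the example ratings are all hypothetical.

```python
from statistics import mean

# Hypothetical short labels for the 4 core items plus the overall assessment.
ITEMS = ("process", "presentation", "reporting", "validity", "overall")

def mean_appraiser_scores(appraiser_a, appraiser_b):
    """Return the per-item mean of two appraisers' ratings (each rated 1 to 7)."""
    result = {}
    for item in ITEMS:
        pair = (appraiser_a[item], appraiser_b[item])
        for score in pair:
            if not 1 <= score <= 7:
                raise ValueError(f"{item}: score {score} is outside the 1-7 range")
        result[item] = mean(pair)
    return result

# Made-up ratings for a single document (not data from the study).
rater_1 = {"process": 2, "presentation": 3, "reporting": 2, "validity": 1, "overall": 2}
rater_2 = {"process": 1, "presentation": 3, "reporting": 2, "validity": 2, "overall": 3}
scores = mean_appraiser_scores(rater_1, rater_2)
print(scores)
```

Averaging 2 integer ratings yields values in steps of 0.5, which is consistent with half-point scores such as 1.5 or 3.5.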

In addition, to characterize the rigor of guideline development more precisely, we evaluated the included documents using a series of dichotomous criteria based on domain 3 of the AGREE II tool itself and World Health Organization (WHO) standards.2,11,12 Specifically, we verified if the assessed documents: used the existing methodology of guideline development; contained data on establishing the working group or searching for evidence; provided strength of recommendations and references to recommendations; included conflict of interest information; rated quality of evidence; provided updates when new evidence was available (which we defined as within 2 months from the publication of the first RCT,9 on lopinavir / ritonavir efficacy in COVID-19); and sought opinions of external reviewers. We classified documents as containing strong recommendations (using phrases such as “is recommended,” “should be used”), weak recommendations (suggestions, using phrases such as “may be used,” “consider the use,” etc), or no recommendation for use of antiviral therapy (recommendation not to use antivirals outside clinical trials or a statement that evidence is lacking). Detailed information on data extracted is provided in Supplementary material.
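The classification above was made by the appraisers reading each document. Purely as an illustration, the phrase cues quoted in the text could be turned into a first-pass screen like the one below. The function name, the cue lists, and the example sentences are assumptions made for this sketch, and any automated label would still need manual review (for example, "is recommended only within clinical trials" would be mislabelled by simple phrase matching).

```python
# Phrase cues quoted in the text; the lists are illustrative, not exhaustive.
STRONG_CUES = ("is recommended", "should be used")
WEAK_CUES = ("may be used", "consider the use")

def classify_recommendation(sentence):
    """Return 'strong', 'weak', or 'none' based on simple phrase matching."""
    text = sentence.lower()
    if any(cue in text for cue in STRONG_CUES):
        return "strong"
    if any(cue in text for cue in WEAK_CUES):
        return "weak"
    return "none"

# Invented example sentences (not quotations from any included document).
examples = [
    "Drug X should be used in patients with severe disease.",
    "Drug X may be used in selected patients.",
    "Evidence is lacking; antivirals are reserved for clinical trials.",
]
labels = [classify_recommendation(s) for s in examples]
print(labels)
```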

Two authors (FM and WL) extracted the data and assessed the quality of CPGs (data extracted are listed in Supplementary material). Any disagreements were resolved by consensus.

Statistical analysis

Statistical analysis was performed using the Statistica software, version 13.3 (Tibco Software, Inc., Palo Alto, California, United States). The assumption of normality was verified with the Shapiro–Wilk test. Between-group comparisons were conducted with the Kruskal–Wallis analysis of variance. Multivariable logistic regression analysis was used to identify variables independently associated with the recommendations for antiviral therapy for SARS-CoV-2 infection. The presence of recommendations for antiviral therapy for SARS-CoV-2 infection was regarded as the dependent variable, and the document type, mode of publication, and endorsing body were used as covariates in the first model, with the following variables added to subsequent models in a stepwise manner: using the existing methodology of guideline development, the presence of data on how the working group was established, description of search for evidence, the presence of conflict of interest information, and rated quality of evidence. Because of low numbers in total and in groups, the number of potential independent predictors was restricted to a maximum of 2 per model. A P value less than 0.05 was considered significant.
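For readers unfamiliar with the Kruskal–Wallis test named above, the sketch below computes its tie-corrected H statistic using only the Python standard library. It is an illustration of the textbook formula, not the Statistica routine the authors used; the function name and the example scores are hypothetical. The P value is normally obtained by comparing H with a chi-square distribution with k − 1 degrees of freedom (k being the number of groups), a step left to a statistics package here.

```python
from collections import Counter

def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for 2 or more groups."""
    counts = Counter(x for g in groups for x in g)
    n_total = sum(counts.values())
    # Assign each distinct value the mean of the ranks it occupies (ties share a rank).
    rank_of = {}
    position = 0
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = position + (t + 1) / 2
        position += t
    # Sum over groups of (rank sum squared) / (group size).
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    tie_correction = 1 - sum(t**3 - t for t in counts.values()) / (n_total**3 - n_total)
    if tie_correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / tie_correction

# Made-up quality scores for 3 document categories (not data from the study).
h_stat = kruskal_wallis_h([1.5, 2, 2.5], [2, 2, 3.5], [1, 1.5, 2])
print(round(h_stat, 3))
```

A rank-based test of this kind suits ordinal scores such as the 1 to 7 ratings, which is presumably why normality was checked first.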

Ethics

No ethics committee approval was required for this study.

Results

The final analysis included 40 publications, of which 17 were clearly labelled as CPGs. The flowchart for study selection and the list of included documents are presented in Supplementary material. The quality of most documents, as assessed with the AGREE II–GRS tool, was poor, except for a single document that scored maximum points (Surviving Sepsis Campaign Guidelines)13 and 2 other documents that were of adequate quality (produced by the WHO14 and the American Thoracic Society–led International Task Force15) (Figure 1; detailed scores for all included documents are presented in Supplementary material, Table S1). The AGREE II–GRS scores did not differ across most categories (ie, the type of the document, endorsing body, or mode of publication) (Table 1). Most documents did not fulfill the quality criteria for rigor of guideline development (Table 2; detailed assessment for all included documents is presented in Supplementary material, Table S2).

Figure 1. Distribution of the AGREE II–GRS scores for the identified documents (1 denotes the lowest quality, and 7, the highest quality)

Table 1. The AGREE II–GRS score of the identified documents

| Characteristics | N (%) | Overall quality assessment^a, median (IQR) |
| --- | --- | --- |
| Total | 40 (100) | 2 (1.5–2.5) |
| By the type of the document | | |
| Clinical practice guidelines | 17 (42.5) | 2 (1.5–2.5) |
| Guidance | 21 (52.5) | 2 (1.5–2.5) |
| Statement or similar opinion | 2 (5) | 1.5 (1–2) |
| By the endorsing body | | |
| International organization | 1 (2.5) | 3.5 |
| International scientific medical society | 5 (12.5) | 2 (2–2.5) |
| National governmental organization | 8 (20) | 1.5 (1.5–2) |
| National scientific medical society | 4 (10) | 2.3 (2–2.8) |
| Local scientific medical society | 1 (2.5) | 1.5 |
| International group of experts | 4 (10) | 2 (1.8–3.3) |
| National / local group of experts | 6 (15) | 1.75 (1–3.5) |
| Single institution | 11 (27.5) | 2 (1–2) |
| By the mode of publication | | |
| Published in a peer-reviewed journal | 12 (30) | 1.75 (1.5–2) |
| Not published in a peer-reviewed journal | 28 (70) | 2 (1.5–2.5) |

a Mean from the scores of 2 evaluators ranging from 1 (lowest) to 7 (highest)

Abbreviations: IQR, interquartile range
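The median (IQR) summaries in Table 1 can be reproduced in principle from per-document overall scores, as sketched below. The scores are invented, the helper name `median_iqr` is hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since statistical packages differ in how they interpolate quartiles.

```python
from statistics import median, quantiles

def median_iqr(scores):
    """Return (median, first quartile, third quartile) for a list of scores."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q1, q3

# Invented overall scores (mean of 2 evaluators) for 5 documents.
example_scores = [1.5, 2, 2, 2.5, 3.5]
med, q1, q3 = median_iqr(example_scores)
print(f"{med} ({q1}-{q3})")
```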

Table 2. Proportion of documents (out of 40) that fulfilled the quality criteria for guideline development (based on the AGREE II tool and World Health Organization standards2,11)

| Quality criterion | N (%) |
| --- | --- |
| Establishment of the working group described | 5 (12.5) |
| Search strategy for evidence presented | 4 (10) |
| Existing methodology for guideline development used | 3 (7.5) |
| Information on the strength of recommendations provided | 4 (10) |
| Quality of evidence rated | 4 (10) |
| External reviewers included | 2 (5) |
| Conflict of interest information provided | 12 (30) |
| Document updated within the next 2 months | 12 (30) |

Overall, 75% of documents (n = 30) provided recommendations for the use of antivirals: 12.5% of all documents (n = 5) provided strong recommendations, and 62.5% (n = 25), weak recommendations. There were no significant differences in the proportion of documents that contained recommendations for the use of antivirals across the type of documents, mode of publishing (peer-reviewed journal vs website), or the type of endorsing body. Documents that contained recommendations supporting antiviral drug use tended to be of lower quality (P = 0.11; Figure 2) than those without such recommendations, and the presence of a strong recommendation was associated with the lowest quality. In the logistic regression analysis, no variables consistently associated with recommendations for the use of pharmacological treatment were identified. Of the included documents, 25% were updated within the 2 months following the publication of the first RCT on COVID-19 antiviral treatment.9
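The percentages for recommendation strength are all expressed against the full set of 40 documents, which the short check below makes explicit. The counts are taken from the paragraph above; the helper name `percent` is hypothetical.

```python
def percent(count, total):
    """Return count as a percentage of total, rounded to 1 decimal place."""
    return round(100 * count / total, 1)

TOTAL_DOCUMENTS = 40
strong, weak = 5, 25          # documents with strong / weak recommendations
any_recommendation = strong + weak

shares = {
    "any": percent(any_recommendation, TOTAL_DOCUMENTS),
    "strong": percent(strong, TOTAL_DOCUMENTS),
    "weak": percent(weak, TOTAL_DOCUMENTS),
}
print(shares)
```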

Figure 2. AGREE II–GRS total quality score by the type of recommendation for use of antiviral drugs. Dots represent median values, and whiskers, interquartile ranges.

Discussion

We found that most documents providing information on COVID-19 treatment that were published within about 3 weeks after the WHO declared a pandemic16 were of poor quality and were developed without the use of widely accepted methods.

During the outbreak of the COVID-19 pandemic, clinicians sought treatment options. Guidelines and similar documents constituted crucial sources of information on management. Our findings demonstrate that most of those documents published in the early phase of the pandemic were of low quality and thus may have misguided clinical practice. The potential bias might have been related to various factors: 1) readiness to issue recommendations in the setting of inadequate evidence; 2) failure to update once evidence becomes available; 3) enthusiastic interpretation of evidence (eg, overly strong recommendations in the setting of low quality of evidence); 4) inappropriate interpretation of evidence (eg, due to conflict of interest). Moreover, the quality was poor irrespective of the institution that had published the document: in most instances, neither governmental organizations nor professional societies were able to ensure that the basic quality criteria were met. The same was found for peer-reviewed journals. It could be argued that the poor quality of the evaluated documents was due to time pressure or resource constraints; even so, some organizations did provide guidelines of higher quality under the same conditions. In addition, some of the reviewed documents were developed before the countries involved were actually struck by the pandemic, so we would not consider time constraints to be an explanation for the poor quality.

In the first months of the COVID-19 outbreak, the only available data on the effectiveness of antiviral drugs for SARS-CoV-2 infection were derived from in vitro studies and, indirectly, from the Middle East respiratory syndrome and severe acute respiratory syndrome epidemics.17-21 Based on those data, some drugs were claimed to be potentially effective.22-24 However, the claims were not supported by robust evidence, and the results of multiple RCTs emerged only a few months later.25 In our opinion, no data justified their routine use, and in March 2020, the WHO stated: “Use of investigational anti-COVID-19 therapeutics should be done under ethically approved, randomized, controlled trials.”14 It seems likely that unjustified recommendations presented in many of the analyzed documents contributed to divergent clinical behaviors, with frequent use of unproven therapies (eg, hydroxychloroquine)26 whose significant adverse effects were confirmed later.27 Of note, the most prestigious international panels, including the WHO and the Surviving Sepsis Campaign, were prepared to refrain from issuing recommendations and urged rapid research instead.13,14

Admittedly, our analysis had several limitations. First, we decided to include a broad range of document types including guidance documents and statements. However, when the analysis was limited to documents labelled by authors as CPGs, the results remained unchanged. We deliberately included various types of documents to identify as many documents on which physicians base their decisions as possible. Another potential limitation of our study was the use of the AGREE II–GRS tool to measure guideline quality, which, as a simplified version of the AGREE II tool, is less precise and less widely used.28 However, in our opinion, the use of a more precise tool would not affect the main conclusion, because the quality of most documents was very low, with the majority of them not even fulfilling basic quality criteria.

We strongly believe that even during the pandemic all guidelines and guidance documents should be developed using validated methodology (the use of GRADE is highly recommended) and regularly updated, preferably immediately after new evidence becomes available. The latter is especially important during the COVID-19 pandemic, when the number of publications is growing rapidly29 and physicians’ opinions on optimal pharmacotherapy are still varied.30 Recommendations for ineffective antiviral drug use in numerous official documents might not only have resulted in patients receiving unnecessary, or even harmful, treatment, but might also have been one of the factors limiting recruitment to RCTs in the first months of the pandemic.

Conclusions

Our findings indicate that in the initial stages of the pandemic, practice advice and / or recommendations were of generally poor quality while including recommendations (frequently strong) for antiviral therapy. This observation should be of help to those advocating new therapies in the current scenario, and possibly in future clinical situations, and those following such advice in their own practice.