Keywords
COVID-19, SARS, 2019-nCoV, 3C-like protease, drug repurpose, antiviral, coronavirus, virtual screening, molecular modelling, ledipasvir, velpatasvir, Hepatitis C virus, HCV
This article is included in the Emerging Diseases and Outbreaks gateway.
This article is included in the Coronavirus collection.
On 7 January 2020, a new coronavirus, 2019-nCoV (now officially named SARS-CoV-2), was implicated in an alarming outbreak of a pneumonia-like illness, COVID-19, originating from Wuhan City, Hubei, China. Human-to-human transmission was first confirmed in Guangdong, China1. The World Health Organisation has declared this a global public health emergency: as of 15 February 2020, more than 65,000 confirmed cases have been reported, and the death toll is over 1500. At the height of the crisis, this virus is spreading at a rate and scale far worse than previous coronaviral epidemics.
It was immediately evident from its genome that the coronavirus is evolutionarily related (80% identity) to the beta-coronavirus implicated in the severe acute respiratory syndrome (SARS), which originated in bats and caused a global outbreak in 2003. Research on antiviral agents against SARS-CoV continued after the epidemic subsided. Despite this, no SARS treatment has yet come to fruition; however, knowledge acquired from the extensive research and development efforts may help to inform the current therapeutic options.
The viral genome encodes more than 20 proteins, among which are two proteases (PLpro and 3CLpro) that are vital to virus replication; they cleave the two translated polyproteins (PP1A and PP1AB) into individual functional components. The 3-chymotrypsin-like protease (3CLpro, aka main protease, Mpro) is considered to be a promising drug target. Tremendous effort has been spent on studying this protein in order to identify therapeutics against the SARS-CoV in particular and other pathogenic coronaviruses (e.g. MERS-CoV, the Middle East respiratory syndrome coronavirus) in general because they share similar active sites and enzymatic mechanisms. The purpose of this study is to build a molecular model of the 3CLpro of the SARS-CoV-2 and to carry out virtual screening to identify readily usable therapeutics. It was not our intention, however, to comment on other structure-based drug design research as these will not be timely for the current epidemic.
The translated polyprotein (PP1AB) sequence was obtained from the annotation of the GenBank entry of the SARS-CoV-2 genome (accession number MN908947). By comparing this sequence with the SARS-CoV PP1AB sequence (accession number ABI96956), the protease cleavage sites and all mature protein sequences were obtained. Sequence comparison and alignment were performed with BLASTp.
The high-resolution apo-enzyme structure of SARS-CoV 3CLpro (PDBID: 2DUC)2 was employed as the template. The variant residues were “mutated” in silico by SCWRL43, followed by manual adjustment to ensure that the best side-chain rotamer was employed (Table 2). The rebuilt model was subjected to steepest descent energy minimisation by Gromacs 2018.4 using the Gromos 54A7 forcefield, with a restraint force constant of 1000 kJ mol⁻¹ nm⁻² applied to all backbone atoms and all atoms of the vital residues (Table 1). Accessible surface areas of residues were calculated with areaimol of the CCP4 suite v7.0.
Function | Residue Number | Reference |
---|---|---|
Catalytic | 41, 145 | 8 |
Substrate binding | 41, 49, 143–144, 163–167, 187–192 | 2,9 |
Dimerisation | 10, 11, 14, 28, 139, 140, 147, 298 | 10–13 |
SARS-CoV-2 variants | 35, 46, 65, 86, 88, 94, 134, 180, 202, 267, 285, 286 | This work |
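To give a feel for what a restraint force constant of 1000 kJ mol⁻¹ nm⁻² means during minimisation, the sketch below evaluates the energy penalty for a small atomic displacement. It assumes the usual harmonic form for a position restraint, E = ½ k |r − r₀|²; the function name `restraint_energy` is hypothetical and this is not the Gromacs implementation.

```python
def restraint_energy(position, reference, k=1000.0):
    """Harmonic position-restraint energy in kJ/mol (assumed form: 0.5 * k * d^2).

    position, reference: (x, y, z) coordinates in nm.
    k: force constant in kJ mol^-1 nm^-2 (1000 is the value quoted in the text).
    """
    dist_sq = sum((p - r) ** 2 for p, r in zip(position, reference))
    return 0.5 * k * dist_sq


# A restrained atom displaced by 0.01 nm (0.1 Å) along x ...
e_small = restraint_energy((1.01, 2.0, 3.0), (1.0, 2.0, 3.0))
# ... and by 0.1 nm (1 Å) along x.
e_large = restraint_energy((1.10, 2.0, 3.0), (1.0, 2.0, 3.0))
print(f"{e_small:.3f} kJ/mol, {e_large:.3f} kJ/mol")
```

Under this assumed form the penalty grows quadratically, so restrained backbone and functional-residue atoms stay close to the template while unrestrained side chains are free to relax.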
The MTiOpenScreen web service4 was used for screening against its library of 7173 purchasable drugs (Drugs-lib), with the binding site grid specified by the active-site residues. The active sites on chain A and chain B were screened independently with AutoDock Vina5. When the crystal structure was released, it was stripped of its inhibitor and screened in the same way. A list of 4,500 target:ligand docking combinations ranked by binding energy was produced for each screen. The top 10 or 11 hits (selected using a binding energy cut-off) for chains A and B were examined visually in PyMOL (version 1.7.X)6.
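The selection of top hits by a binding-energy cut-off can be sketched as follows. This is only an illustration: the ligand names, the energies and the cut-off value are hypothetical placeholders, not values from the screens, and `select_hits` is not part of MTiOpenScreen or AutoDock Vina.

```python
def select_hits(hits, cutoff):
    """Keep docking hits at or below an energy cut-off, best (most negative) first.

    hits: list of (ligand_name, binding_energy_kcal_per_mol) tuples.
    cutoff: energy threshold in kcal/mol; more negative means stronger predicted binding.
    """
    kept = [hit for hit in hits if hit[1] <= cutoff]
    return sorted(kept, key=lambda hit: hit[1])


# Hypothetical docking results, for illustration only.
example_hits = [
    ("ligand_A", -8.1),
    ("ligand_B", -10.3),
    ("ligand_C", -9.6),
    ("ligand_D", -7.4),
]

top_hits = select_hits(example_hits, cutoff=-9.0)
for name, energy in top_hits:
    print(name, energy)
```

Using a cut-off rather than a fixed count explains why the number of hits examined can differ slightly between screens (10 for one chain, 11 for the other).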
An earlier version of this article can be found on ChemRxiv (DOI: 10.26434/chemrxiv.11831103.v2).
The first available genome was GenBank MN908947, now NCBI Reference Sequence NC_045512. From it, the PP1AB sequence of SARS-CoV-2 was extracted and aligned with that of SARS-CoV. The overall amino-acid sequence identity is very high (86%). The conservation is noticeable at the polyprotein cleavage sites. All 11 3CLpro sites2 are highly conserved or identical (Extended data7, Table S1), implying that their respective proteases have very similar specificities. The 3CLpro sequence of SARS-CoV-2 differs from that of SARS-CoV at only 12 out of 306 residues (identity = 96%).
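The 96% figure follows directly from the residue counts quoted above. The sketch below reproduces that arithmetic and shows one simple way to compute percent identity from two pre-aligned sequences of equal length. The helper `percent_identity` and the short test sequences are hypothetical; BLASTp reports identity over its own alignment length, so its numbers can differ slightly from this simplified definition.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Gap characters ('-') never count as matches; the denominator is the
    full alignment length (a simplifying assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# Arithmetic from the text: 12 of 306 residues differ.
identity_3clpro = 100.0 * (306 - 12) / 306
print(f"{identity_3clpro:.1f}%")

# Toy aligned fragments (hypothetical), differing at one of four positions.
toy_identity = percent_identity("MKTA", "MKSA")
```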
We compared the polyprotein PP1AB and the 3CLpro sequences among all 11 SARS-CoV-2 genomes (GenBank MN908947, MN938384, MN975262, MN985325, MN988668, MN988669, MN988713, MN994467, MN994468, MN996527 and MN996528) that were available on 1 February 2020. With reference to MN908947 (NC_045512), among the 7096 residues, there is only one variable residue in each of MN975262 (in NSP-4), MN994467 (in NSP-2), MN994468 (in NSP-13), MN996527 (in NSP-16); and two in MN988713 (in NSP-1 and NSP-3). The remaining five have no difference. To summarise, all SARS-CoV-2 3CLpro sequences and all their cleavage junctions on their polyproteins are 100% conserved.
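The per-genome comparison against the reference can be sketched as a position-by-position scan. The sequences and genome labels below are short hypothetical stand-ins, not the real 7096-residue polyproteins, and the sketch assumes equal-length sequences with no insertions or deletions.

```python
def variable_positions(reference, other):
    """Return 1-based positions where `other` differs from `reference`.

    Assumes both sequences have the same length (no indels).
    """
    if len(reference) != len(other):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (r, o) in enumerate(zip(reference, other)) if r != o]


# Hypothetical toy sequences standing in for PP1AB of each genome.
reference_seq = "MESLVPGFNE"
genomes = {
    "genome_1": "MESLVPGFNE",  # identical to the reference
    "genome_2": "MESLVAGFNE",  # one substitution
    "genome_3": "MDSLVPGFNK",  # two substitutions
}

differences = {name: variable_positions(reference_seq, seq)
               for name, seq in genomes.items()}
for name, positions in differences.items():
    print(name, positions)
```

Mapping each reported position onto the mature protein boundaries (NSP-1, NSP-2 and so on) then shows whether any change falls inside 3CLpro or at a cleavage junction.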
The amino acids that are known to be important for the enzyme’s functions are listed in Table 1. Not unexpectedly, none of the 12 variant positions is involved in a major role. We were therefore confident in preparing a structural model of the SARS-CoV-2 3CLpro by molecular modelling (Extended data7, Figure S1), which will be immediately useful for in silico development of targeted treatment. After we submitted the first draft of this study, the crystal structure of SARS-CoV-2 3CLpro was solved and released (PDB ID 6LU7), which confirms that the predicted model agrees with it to within experimental error (Extended data7, Figure S2).
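The statement that no variant position coincides with a functionally important residue can be checked with a set intersection over the residue numbers in Table 1. The variable names are illustrative; the residue numbers are copied from the table.

```python
# Residue numbers taken from Table 1.
catalytic = {41, 145}
substrate_binding = ({41, 49}
                     | set(range(143, 145))   # 143-144
                     | set(range(163, 168))   # 163-167
                     | set(range(187, 193)))  # 187-192
dimerisation = {10, 11, 14, 28, 139, 140, 147, 298}
variants = {35, 46, 65, 86, 88, 94, 134, 180, 202, 267, 285, 286}

# Union of all functionally important positions.
functional = catalytic | substrate_binding | dimerisation

# Variant positions that fall on a functional residue.
overlap = variants & functional
print(sorted(overlap))  # → []
```

The intersection is empty, which is the basis for treating the SARS-CoV structure as a suitable template.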
When examined in molecular graphics6, all solutions were found to fit into their respective active sites convincingly. The binding energies of chain A complexes were generally higher than those of chain B by approximately 1.4 kcal mol⁻¹ (Table 3). This presumably reflects the intrinsic conformational variability between the A- and B-chain active sites in the crystal structure (the average root-mean-square deviation (rmsd) in Cα atomic positions of active-site residues is 0.83 Å). In each screen, the differences in binding energies are small, suggesting that the ranking does not discriminate strongly between hits, and all top scorers should be examined. We combined the two screens and found 16 candidates that give promising binding models (etoposide and its phosphate counted as one) (Table 3).
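For readers unfamiliar with the rmsd measure quoted above, the following minimal sketch computes it for two matched lists of Cα coordinates. It assumes the two chains have already been superposed, so no fitting step is included, and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists (in Å).

    coords_a, coords_b: lists of (x, y, z) tuples in the same atom order.
    Assumes the structures are already superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical Cα positions: every atom in chain B is shifted by 1 Å along x.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
example_rmsd = rmsd(chain_a, chain_b)
print(f"{example_rmsd:.2f} Å")
```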
We checked the actions, targets and side effects of the 16 candidates. Among these, we first noticed velpatasvir (Figure 1A, D) and ledipasvir, which are inhibitors of the NS5A protein of the hepatitis C virus (HCV). Both are marketed as approved drugs in combination with sofosbuvir, which is a prodrug nucleotide analogue inhibitor of RNA-dependent RNA polymerase (RdRp, or NS5B). Interestingly, sofosbuvir has recently been proposed as an antiviral for SARS-CoV-2 based on the similarity between the replication mechanisms of HCV and the coronaviruses14. Our results strengthen the case that these dual-component HCV drugs, Epclusa (velpatasvir/sofosbuvir) and Harvoni (ledipasvir/sofosbuvir), may be attractive candidates to repurpose because they may inhibit two coronaviral enzymes. A drug that can target two viral proteins substantially reduces the ability of the virus to develop resistance. These direct-acting antiviral drugs are also associated with minimal side effects and are conveniently administered orally (Table 4).
Drug | Possible side effects (adverse reactions) | Admin. |
---|---|---|
Diosmin (a, b) | Mild gastrointestinal disorders; skin irritations; nausea; heart arrhythmias | Topical; oral |
Hesperidin (a, d) | Stomach pain and upset; diarrhea; headache | Oral |
MK-3207 (c) | No information | Oral |
Venetoclax (a, b) | Neutropenia; nausea; anaemia; diarrhea; upper respiratory tract infection | Oral |
Dihydroergocristine (a) | No information | Oral |
Bolazine (b) | No information | Intramuscular |
R428 (b) | No information | Oral |
Ditercalinium | No information | No information |
Etoposide (a, b) | Alopecia; constipation; diarrhea; nausea; vomiting; secondary malignancies | Intravenous |
Teniposide (a, b) | Gastrointestinal toxicity; hypersensitivity reactions; reversible alopecia | Intravenous |
UK-432097 (c) | No information | Inhaled |
Irinotecan (a, b) | Gastrointestinal complication | Intravenous |
Lumacaftor (a) | Dyspnea; nasopharyngitis; nausea; diarrhea; upper respiratory tract infection | Oral |
Velpatasvir (a, b) | Headache; fatigue; nausea | Oral |
Eluxadoline (a, b) | Constipation; nausea; fatigue; bronchitis; viral gastroenteritis; pancreatitis | Oral |
Ledipasvir (a) | Fatigue; headache | Oral |
The flavonoid glycosides diosmin (Figure 1B) and hesperidin (Figure 1E), obtained from citrus fruits, fit very well into and block the substrate binding site. These compounds cause only mild adverse reactions (Table 4). Hesperidin hits showed up multiple times, suggesting it has many modes of binding (Figure 1A). Teniposide and etoposide (and its phosphate) are chemically related and turned up in multiple hits with good binding models (Figure 1F). However, these chemotherapy drugs have many strong side effects and need intravenous administration (Table 4). The approved drug venetoclax (Figure 1C) and the investigational drugs MK-3207 and R428 scored well in both screens. Venetoclax is another chemotherapy drug that is burdened by side effects, including upper respiratory tract infection (Table 4). Not much has been disclosed about MK-3207 and R428.
We subjected the crystal structure to the same virtual screening procedures. A very similar list of candidates showed up consistently (Extended data7, Table S2) with high scores although ledipasvir was not found.
We noticed that most of the compounds on the list have molecular weights (MW) over 500, except lumacaftor (MW=452). The largest is ledipasvir (MW=889). This is because the size of the peptide substrate and the deeply buried protease active site demand a large molecule with many rotatable bonds to fit into it.
We identified five trials on ClinicalTrials.gov involving antiviral and immunomodulatory drug treatments for SARS (Table 5), all without reported results; i.e., at present, there are no safe and effective drug candidates against SARS-CoV. This is because once an epidemic is over, there are no patients to recruit for clinical trials. Only the study with streptokinase completed phase 3. It is disappointing that little progress in SARS drug development has been made in the past 17 years. Since the 2003 outbreak, numerous inhibitors of the 3CLpro enzyme have been proposed16,17, yet no new drug candidate has succeeded in entering phase 1 clinical trials.
Drug | Condition | Phase | Status | From | To | Location |
---|---|---|---|---|---|---|
Lopinavir / Ritonavir + Ribavirin | SARS | Unknown | Unknown | | | Hong Kong |
Alferon LDO | SARS | Phase 2 | Completed | Nov 04 | Apr 06 | Hong Kong |
Poly-ICLC | Respiratory viruses (a) | Phase 1 | Completed | Mar 08 | Dec 09 | USA |
Streptokinase | SARS, ARDS | Phase 3 | Completed | Feb 16 | Jan 18 | |
Glucocorticoid (methylprednisolone) therapy | Coronavirus infections (b) | Phase 2, Phase 3 | Unknown | Jan 20 | Dec 20 (Est.) | China |
One record which receives a lot of attention amid the current outbreak is the lopinavir/ritonavir combination18. They are protease inhibitors originally developed against HIV. During the 2003 SARS outbreak, despite lacking a clinical trial, they were tried as an emergency measure and found to offer improved clinical outcome18. However, some scientists did express scepticism19. By analogy, these compounds were speculated to act on SARS-CoV 3CLpro specifically, but there is as yet no crystal structure to support that, although docking studies were carried out to propose various binding modes20–23. The IC50 value of lopinavir is 50 μM (Ki = 14 μM) and that for ritonavir cannot be established24. Although this is far from a cure, based on our results that the two CoV 3CLpro enzymes are identical as far as protein sequences and substrate specificities are concerned, we are of the opinion that this is still one of the recommended routes for immediate treatment at the time of writing (early February 2020).
If we look beyond the 3CLpro, an earlier screen produced 27 candidates that could be repurposed against both SARS-CoV and MERS-CoV25. In addition, the other coronaviral proteins could be targeted for screening. Treatment of COVID-19 with remdesivir (a repurposed drug in development that targets the RdRp) has just been reported to improve clinical outcome, and a clinical trial is now underway26.
We consider this work part of the global efforts responding in a timely fashion to fight this deadly communicable disease. We are aware that there are similar modelling, screening and repurposing exercises targeting 3CLpro reported or announced20,27–33. Our methods did not overlap, and we share no common results with these studies.
The 11 SARS-CoV-2 polyprotein PP1AB and 3CLpro sequences used in this study were obtained from NCBI GenBank, accession numbers MN908947, MN938384, MN975262, MN985325, MN988668, MN988669, MN988713, MN994467, MN994468, MN996527 and MN996528, available on 1 February 2020.
The SARS-CoV PP1AB sequence was obtained from NCBI Protein, accession number ABI96956.
The two coronavirus protease structures used were obtained from Protein Data Bank, ID 2DUC and 6LU7.
Open Science Framework: SARS-CoV-2 (2019-nCoV) 3CLpro Model and Screening. https://doi.org/10.17605/OSF.IO/HCU8X (reference 7).
The “Virtual Screening” folder contains the following extended data:
2019-nCoV-3CLpro.pdb. (3D model of the 3CLpro: A and B chains.)
A-screen4500.pdbqt, B-screen4500.pdbqt, X-screen4500.pdbqt. (Virtual screening 3D results for the model A chain, the model B chain and the crystal structure (A chain) in PDBQT format, which can be read in any text editor. Use PyMOL to view these files in 3D. Each result file contains 4500 drug-to-protein docking hits ranked by AutoDock Vina binding energies in kcal mol⁻¹.)
A-screen1500.table.csv, B-screen1500.table.csv, X-screen1500.table.csv. (Virtual screening results (names only) for the model A chain, the model B chain and the crystal structure (A chain) in CSV format, which can be opened in Excel or any text editor. Each file summarises the top 1500 drug-to-protein docking hits ranked by AutoDock Vina binding energies in kcal mol⁻¹.)
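Because PDBQT files are plain text, binding energies can be pulled out with a few lines of code. The sketch below assumes the result files carry the standard AutoDock Vina header line `REMARK VINA RESULT:` followed by the energy in kcal mol⁻¹; whether the deposited files keep this exact header is an assumption, and the sample text and its energy values are hypothetical.

```python
def vina_energies(pdbqt_text):
    """Extract binding energies (kcal/mol) from AutoDock Vina result remarks.

    Looks for lines of the form 'REMARK VINA RESULT: <energy> <rmsd_lb> <rmsd_ub>'
    and returns the energies in file order.
    """
    energies = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, energy, rmsd l.b., rmsd u.b.
            energies.append(float(line.split()[3]))
    return energies


# Hypothetical excerpt in Vina output style, for illustration only.
sample_text = """MODEL 1
REMARK VINA RESULT:     -10.2      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -9.7      1.532      2.108
ENDMDL
"""

sample_energies = vina_energies(sample_text)
print(sample_energies)
```

To apply this to a downloaded file, read its contents with `open(path).read()` and pass the text to `vina_energies`.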
The “Extended Results” folder contains the following extended data:
Tab S1.docx (Sequence homology of the 3CLpro cleavage junctions of PP1AB between SARS-CoV-2 and SARS-CoV).
Tab S2.docx (The results of virtual screening of drugs on the active site of SARS-CoV-2 3CLpro crystal structure).
Fig S1.pptx (The structural model of the SARS-CoV-2 3CLpro protease).
Compare Crystal.docx (A comparison, with Figure S2, of the active sites of model chains A, B and the crystal structure).
Data are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Partly
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Molecular genetics, Clinical haematology, Investigational New Drug, Clinical Trial.
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Medicinal Chemistry, Drug Discovery, Chemical Biology
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Partly
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Molecular Dynamics Simulation; Computer-aided Drug Design
Version 2 (revision): 09 Apr 20
Version 1: 21 Feb 20
Lin et al. “Anti-SARS coronavirus 3C-like protease effects of Isatis indigotica root and plant-derived phenolic compounds”, 2005
Also, glad to see that well-known coronavirus researcher Erik De Clercq had himself noted, back in 2006, the efficacy of hesperidin in the same 2005 China Medical University (Taichung) in vitro study.
E. De Clercq, "Potential antivirals and antiviral strategies against SARS coronavirus infections", 2006