Skip to main content

ORIGINAL RESEARCH article

Front. Microbiol., 28 February 2022
Sec. Systems Microbiology

Discovery of Potential Therapeutic Drugs for COVID-19 Through Logistic Matrix Factorization With Kernel Diffusion

\r\nXiongfei TianXiongfei Tian1Ling ShenLing Shen1Pengfei GaoPengfei Gao2Li Huang,Li Huang3,4Guangyi LiuGuangyi Liu1Liqian Zhou*Liqian Zhou1*Lihong Peng,*Lihong Peng1,2*
  • 1School of Computer Science, Hunan University of Technology, Zhuzhou, China
  • 2College of Life Sciences and Chemistry, Hunan University of Technology, Zhuzhou, China
  • 3Academy of Arts and Design, Tsinghua University, Beijing, China
  • 4The Future Laboratory, Tsinghua University, Beijing, China

Coronavirus disease 2019 (COVID-19) is rapidly spreading. Researchers around the world are dedicated to finding the treatment clues for COVID-19. Drug repositioning, as a rapid and cost-effective way for finding therapeutic options from available FDA-approved drugs, has been applied to drug discovery for COVID-19. In this study, we develop a novel drug repositioning method (VDA-KLMF) to prioritize possible anti-SARS-CoV-2 drugs integrating virus sequences, drug chemical structures, known Virus-Drug Associations, and Logistic Matrix Factorization with Kernel diffusion. First, Gaussian kernels of viruses and drugs are built based on known VDAs and nearest neighbors. Second, sequence similarity kernel of viruses and chemical structure similarity kernel of drugs are constructed based on biological features and an identity matrix. Third, Gaussian kernel and similarity kernel are diffused. Forth, a logistic matrix factorization model with kernel diffusion is proposed to identify potential anti-SARS-CoV-2 drugs. Finally, molecular dockings between the inferred antiviral drugs and the junction of SARS-CoV-2 spike protein-ACE2 interface are implemented to investigate the binding abilities between them. VDA-KLMF is compared with two state-of-the-art VDA prediction models (VDA-KATZ and VDA-RWR) and three classical association prediction methods (NGRHMDA, LRLSHMDA, and NRLMF) based on 5-fold cross validations on viruses, drugs, and VDAs on three datasets. It obtains the best recalls, AUCs, and AUPRs, significantly outperforming other five methods under the three different cross validations. We observe that four chemical agents coming together on any two datasets, that is, remdesivir, ribavirin, nitazoxanide, and emetine, may be the clues of treatment for COVID-19. The docking results suggest that the key residues K353 and G496 may affect the binding energies and dynamics between the inferred anti-SARS-CoV-2 chemical agents and the junction of the spike protein-ACE2 interface. Integrating various biological data, Gaussian kernel, similarity kernel, and logistic matrix factorization with kernel diffusion, this work demonstrates that a few chemical agents may assist in drug discovery for COVID-19.

Introduction

A novel coronavirus disease named COVID-19, caused by coronavirus SARS-CoV-2, is spreading around the globe. As of 3 December 2021, more than 263 million confirmed cases of SARS-CoV-2 infection and 5,232 thousand confirmed cases of SARS-CoV-2-caused death have been reported (WHO, 2021). The rapid transmission of SARS-CoV-2 has become a severe threat to public health worldwide (Baker et al., 2020; Hopman et al., 2020; Khan M. T. et al., 2021). Although its vaccines have been studied (Li et al., 2021), it is an immediate urgency to find promising antiviral drugs for COVID-19 therapies (Mahdian et al., 2020; Saxena, 2020).

However, under such an urgent situation, it is almost impossible to research and develop a new drug for patients with the COVID-19 infections since designing a new drug may spend more than 10 years (Liu et al., 2020; Yang et al., 2020). It might be an effective alternative to find possible therapeutic clues from Food and Drug Administration (FDA)-approved drugs, that is, drug repurposing (Liu et al., 2016; Yang et al., 2016; Chu et al., 2020; Masoudi-Sobhanzadeh, 2020; Zhang et al., 2020, 2021). Now, researchers worldwide have focused on repositioning the FDA-approved drugs for COVID-19. Since these drugs have been tested for the efficacy, safety, and toxicity in the clinical trials, they can be fast applied as clinically available drugs against COVID-19 (Wu et al., 2020). Multiple examples of repositioned drugs, such as antiviral drugs and host-targeting treatment, are or have been clinical trials for COVID-19 (Tang et al., 2020). Computational methods for identifying potential options against COVID-19 can be categorized into structure-based virtual screening methods (Khan M. T. et al., 2021) and network-based methods (Dotolo et al., 2020).

To capture possible antiviral drugs against SARS-CoV-2, a vast amount of structure-based virtual screening methods are carried out. The type of methods uses molecular docking and dynamics simulation techniques to measure binding capabilities between potential anti-COVID-19 drugs and targets. For example, Elfiky (2020) and Muralidharan et al. (2020) combined molecular docking and molecular dynamics simulation. Islam et al. (2021) integrated docking with two approaches, molecular dynamics simulation, and in silico absorption, distribution, metabolism, excretion, and toxicity (ADMET) profile. Kandeel et al. (2020) applied molecular docking, molecular dynamics simulation of top 10 hits, and free energy calculation. Khan et al. (2020a) designed an integrated computational framework for key residue identification via an alanine scanning strategy and an extensive simulation, a cryo-EM structure for novel drug identification based on computational virtual screening and molecular docking (Khan et al., 2020b), a multi-step drug screening method to shortlist potential drugs (Khan S. et al., 2021), and a structural and biomolecular simulation technique for revealing the impact of specific mutations in the B.1.617 variant (Khan A. et al., 2021). Wang et al. (2020) detected inhibition affect of human defensin-5 against SARS-CoV-2 invasion combining molecular dynamics simulation and statistical analysis. Elmezayen et al. (2020) used molecular docking for top-ranked compounds, molecular dynamics simulations, ADMET profile prediction, and free energy computation. Wang C. et al. (2021) found a versatile antimicrobial peptide that can be used as an inhibitor of SARS-CoV-2 attachment based on dual mechanisms.

Network-based methodologies are widely applied to drug repositioning by integrating multiple data sources. In these methods, nodes denote drugs, diseases, or targets, while edges denote interactions or associations between nodes. Network-based methods contain network-based clustering methods and network-based propagation methods (Messina et al., 2020; Sadegh et al., 2020). Network-based clustering methods have been developed to find novel drug-target interactions or drug-disease associations by finding biological modules (for example, drug-target, drug-disease, drug-drug) using clustering algorithms. Network-based propagation methods used network proximity and network propagation algorithms to model associations between drugs, targets, and COVID-19-related diseases. For example, Peng et al. (2020) and Zhou et al. (2020) separately used bipartite local model and the KATZ measurement to find potentially suitable drugs against COVID-19 and validated the predicted results by molecular docking and recent publications. Meng et al. (2021) proposed a similarity constrained probabilistic matrix factorization method to find new Virus-Drug Associations (VDAs). Fiscon et al. (2021) developed a searching off-label drug and network method to uncover interactions between targets and disease-specific proteins. Based on the above studies, (Peng et al., 2021) developed a random walk with restart-based VDA prediction model to discover possible anti-SARS-CoV-2 drugs on the constructed three VDA datasets. These methods effectively discovered possible antiviral drugs for the treatment of COVID-19.

In this study, we develop a novel VDA prediction method, VDA-KLMF, to find potential chemical agents for COVID-19. VDA-KLMF integrates virus sequences, drug chemical structures, known VDAs, Gaussian kernel, similarity kernel, and Logistic Matrix Factorization with Kernel diffusion. VDA-KLMF is compared with two state-of-the-art VDA prediction models [VDA-KATZ (Zhou et al., 2020) and VDA-RWR (Peng et al., 2021)] and three classical association identification models [NGRHMDA (Huang et al., 2017), LRLSHMDA (Wang et al., 2017), and NRLMF (Liu et al., 2016)] based on fivefold cross validations on viruses, drugs, and VDAs on three VDA datasets. Experimental results show that VDA-KLMF computes the optimal recalls, AUCs, and AUPRs, significantly improving VDA identification performance. Four chemical agents (remdesivir, ribavirin, nitazoxanide, and emetine) coming together on any two VDA datasets are inferred to be underlying anti-COVID-19 drugs.

Molecular docking is an important drug discovery tool applied to find the best appropriate intermolecular binding between a chemical agent and a target or two proteins. It can effectively elucidate fundamental biochemical processes and characterize activity of ligands binding target proteins (McConkey et al., 2002). In this manuscript, a molecule docking software, AutoDock (Morris et al., 2009), is used to measure the molecular activities of the predicted four antiviral small molecules at the junction of the SARS-Cov-2 Spike (S) protein-angiotensin-converting enzyme 2 (ACE2) interface. The dockings show that the four drugs have higher binding energies with two key residues (K353 and G496).

Materials and Methods

Materials

Three VDA datasets were provided by Peng et al. (2021). Dataset 1 contains 96 VDAs from 11 viruses and 78 drugs. Dataset 2 contains 770 VDAs from 69 viruses to 128 drugs. Dataset 3 contains 407 VDAs from 34 viruses and 203 drugs. The virus sequences and drug chemical structures can be downloaded from the NCBI (Sayers et al., 2021) and DrugBank (Wishart et al., 2018) databases, respectively. Virus sequence similarity matrix Sv and drug chemical structure similarity matrix Sd can be computed by MAFFT (Katoh et al., 2019) and RDKit (Landrum, 2021), respectively. The details are shown in Table 1.

TABLE 1
www.frontiersin.org

Table 1. Statistics for three VDA networks.

All virus-drug pairs in a dataset can be characterized as a matrix Y:

Yij={1ifviassociateswithdj0otherwise(1)

where vi and dj represent the ith virus and jth drug, respectively.

Problem Formalization

Given virus similarity matrix Sv, drug similarity matrix Sd, and VDA matrix Y, our task is to quantify the interplays between viruses and drugs, which can be divided into four scenarios: (1) known virus-known drug association, that is, a virus associates with no less than one drug and a drug associates with no less than one virus; (2) known virus-new drug association, that is, a virus interacts with at least one drug and a new drug does not interact with any virus; (3) new virus-known drug association, that is, a new virus does not associate with any drug and a drug interacts with at least one virus; (4) new virus-new drug association, that is, both virus and drug have no any association information. Our goal is to exploit a novel model to boost the VDA prediction performance. In particular, the model assigns an association probability to a virus-drug pair to measure the likelihood of interplay between the virus and the drug. The higher the probability is, the more likely the virus and the drug are associated with each other. Figure 1 illustrates the flowchart of the VDA-KLMF model.

FIGURE 1
www.frontiersin.org

Figure 1. The flowchart of the VDA-KLMF framework.

Gaussian Kernel Construction

SARS-CoV-2 is a new single strand RNA virus and has no any associated drug. That is, there may exist the scenario of new virus (for example, SARS-CoV-2) and new drug when a VDA dataset is split during cross validations. The nearest neighbor information of a virus/drug contributes to prioritizing VDAs related to the virus/drug. To find interacting drugs for a virus vi, its Gaussian kernel is constructed as follows.

First, its association profile is computed based on its nearest neighbor information by Eq. (2):

Yva(i,j)=kϵniY(k,j)×Sv(i,k)kϵni(2)

where ni represents nearest neighbors of vi, and k is a hyper-parameter and denotes the number of nearest neighbors of vi.

Second, the computed association profile is normalized by Eq. (3):

Yvn(i,j)=Yva(i,j)kϵniSv(i,k)kϵni(3)

Finally, Gaussian kernel KvGIP of viruses is calculated via the normalized association profiles by Eq. (4):

KvGIP(vi,vj)=exp(-||Yvn(i,k)-Yvn(i,j)||2σ)(4)

where σ is the kernel bandwidth. Similarly, Gaussian kernel KdGIP of drugs can be computed.

Similarity Kernel Construction

Sequence information of viruses and chemical structure information of drugs help VDA candidate screening. To comprehensively consider these data, original two similarity matrices are transformed into two kernel matrices (Kvsym and Kdsym). First, original virus similarity matrix is converted to a symmetric matrix Svsym by Eq. (5):

Svsym=12(Sv+SvT)(5)

Then, the symmetrized matrix Svsym is transformed into a positive semi-definite matrix Kvsym by Eq. (6):

Kvsym=Svsym+εI(6)

where I is an identity matrix, and ε is a parameter. Similarly, Kdsym can be calculated.

Similarity Diffusion

For a virus, its Gaussian kernel only depicts the similarities between the virus and its k nearest neighbors, the remaining information is discarded. To characterize virus features, inspired by a kernel technique proposed by Hao et al. (2016), we diffused two different types of virus similarity into a final kernel matrix.

First, the local virus similarity matrices are built based on Gaussian kernel and similarity kernel by Eqs (7) and (8), respectively:

LvGIP(i,j)={KvGIP(i,j)kϵniKvGIP(i,k)jϵni0otherwise(7)
Lvseq(i,j)={Kvsym(i,j)kϵniKvsym(i,k)jϵni0otherwise(8)

Second, the global virus similarity matrices GvGIP and Gvseqare produced by iteratively updating by Eqs. (9) and (10).

GvGIP(h+1)=LvGIPGvseq(h)(LvGIP)T(9)
Gvseq(h+1)=LvseqGvGIP(h)(Lvseq)T(10)

where GvGIP(h+1) and Gvseq(h+1) represent global Gaussian kernel and similarity kernel matrices generated at h-th iteration, respectively. And GvGIP(1)=Kvsym and Gvseq(1)=KvGIP.

Finally, virus similarity matrix Mv is integrated by Eq. (11):

Mv=12(GvGIP(h+1)+Gvseq(h+1))(11)

Similarly, drug similarity matrix Md can be computed.

Methods

After the diffused virus similarity matrix Mv and drug similarity matrix Md are computed, a Logistic Matrix Factorization model (VDA-KLMF) with Kernel diffusion (Liu et al., 2020) is then designed for VDA discovery. Viruses and drugs are first randomly mapped into two latent vector spaces AϵRm×r and BϵRn×r with the dimension of r. And association probability for each virus-drug pair can be calculated by Eq. (12):

P=exp(αABT+βMvABT+γABTMd)1+exp(αABT+βMvABT+γABTMd)(12)

where α, β, and γ are smoothing coefficients with the summation of 1, BT denotes the transpose of B. Inspired by the method provided by Liu et al. (2016), under the assumption that all samples are independent, interplays between viruses and drugs can be rewritten by assigning each known VDA as a confident value of c by Eq. (13):

p(Y|A,B)=i=1mj=1nPijcYij(1-Pij)1-Yij(13)

where Pij denotes association probability between the i-th virus and the j-th drug. Known VDAs are validated by wet experiments and more reliable, therefore, c is assigned as a higher value. Assume that the two vectors follow the zero-mean spherical Gaussian distribution defined by Eqs. (14) and (15):

p(A|σv2)=i=1mN(ai|0,σv2I)(14)
p(B|σd2)=j=1nN(bj|0,σd2I)(15)

where σv2 and σd2 are two parameters used to control the variances of Gaussian distribution, ai and bj refer to potential variables for the i-th virus and the j-th drug, respectively. I is an identity matrix. We can obtain the following distribution based on the Bayesian inference by Eq. (16):

p(A,B|Y,σv2,σd2)p(Y|A,B)p(A|σv2)p(B|σd2)(16)

The log formula of the posterior distribution can be represented as Eq. (17):

lnp(A,B|Y,σv2,σd2)=i,j(cYο(αABT+βMVABT+γABTMD)-(1+cY-Y)οln[1+exp(αABT+βMVABT+γABTMD)])-λA2i=1m||ai||22-λB2j=1n||bj||22+C(17)

where λA=1σv2, λB=1σd2, ||||22 represents the spectral norm, and ο denotes the Hadamard product. Thus the latent variable virus matrix A and drug matrix B can be generated by maximizing an objective function defined by Eq. (18):

maxA,Bi,j(cYο(αABT+βMVABT+γABTMD)-(1+cY-Y)οln[1+exp((αABT+βMVABT+γABTMD))]-λA2||A||F2-λB2||B||F2(18)

where ||||F2 represents the Frobenius norm.

According to the work provided by Liu et al. (2016), A and B can be solved by Eqs. (19) and (20):

LLA=c(αI+βMvT)YB+γ(cY-R)MdTB-(αI+βMvT)RB-λAA(19)
LLB=c(αI+γMd)YTA+β(cYT-RT)MvA-(αI+γMd)RTA-λAB(20)

where R=(1+cY-Y)ο11+exp(-(αABT+βMvABT+γABTMd)), A and B can be updated based on the AdaGrad algorithm (Duchi et al., 2011).

Molecular Docking

Molecular docking is utilized to measure dynamics and binding energies between the predicted antiviral compounds against SARS-CoV-2 and the junction of the S protein-ACE2 interface. Similar to the molecular docking process provided by Peng et al. (2021), we first downloaded structures of the S protein and ACE2 and chemical structures of drugs from the RCSB Protein Data Bank (Rose et al., 2016) and the DrugBank databases (Wishart et al., 2018), respectively. Second, solvent and organic compounds were removed and the receptor proteins were preprocessed based on PyMOL (Schrodinger, 2010). Third, atoms from receptors were set to the AD4 type. Finally, AutoDock was applied to implement molecular docking. During docking, the predicted anti-COVID-19 drugs was used as ligands and the junction of the S protein-ACE2 interface was taken as receptor. Binding pocket was set via AutoGrid4, the grid size was 126 × 126 × 126, and Lamarckian genetic algorithm was selected as the search method. The detailed processes were set the same as ones provided by Peng et al. (2021).

Results

Experimental Settings and Evaluation Metrics

We perform experiments to evaluate the performance of the VDA-KLMF method. Given a VDA matrix Yn×m between n viruses and m drugs, inspired by Cross Validation (CV) provided by Peng et al. (Peng et al., 2021), three different 5-fold CVs, CV on viruses (CV1), CV on drugs (CV2), and CV on virus-drug pairs (CV3), are implemented. Under CV1, in each round, 80% viruses are used to train VDA prediction models and the remaining 20% of viruses are used to test the performance of these models. Under CV2, in each round, 80% drugs are used to train VDA prediction models and the remaining 20% of drugs are used to test their performance. Under CV3, in each round, 80% VDAs are used to train VDA prediction models and the remaining 20% of VDAs are used to test their performance. The three CVs correspond to VDA prediction for a new virus, a new drug, or based on known VDA data.

The number of iterations h is set as 100. The confident level c of known VDA, the number of neighbors k, weights λA, and λB are set in the range of [3, 10], [1, 10], [1, 10], and [1, 10], respectively. We repeatedly implemented experiments for 100 times and used random search approach to select the optimal parameters. The optimal parameter combinations of VDA-KLMF and other five VDA prediction methods (NGRHMDA, LRLSHMDA, NRLMF, VDA-KATZ, and VDA-RWR) are shown in Table 2.

TABLE 2
www.frontiersin.org

Table 2. The optimal parameter combinations of six VDA prediction methods.

Recall (sensitivity), specificity, precision, F1 score, AUC, and AUPR are used to assess the performance of six VDA prediction approaches (VDA-KLMF, NGRHMDA, LRLSHMDA, NRLMF, VDA-KATZ, and VDA-RWR). Recall (sensitivity) indicates the ratio of correctly predicted positive VDAs to all known positive VDAs. Precision represents the ratio of correctly predicted VDAs to all predicted positive VDAs. Specificity denotes the ratio of correctly predicted negative VDAs to all known negative VDAs. F1 Score is the harmonic mean of recall and precision. The four evaluation metrics are defined as follows:

Recall=TPTP+FN(21)
Specificity=TNTN+FP(22)
Precision=TPTP+FP(23)
F1score=2TP2TP+FP+FN(24)

where TP, FP, TN and FN denote true positive, false positive, true negative and false negative, respectively. AUC is the average area under the Receiver Operating Characteristics (ROC) curve. The ROC curve is the plot of true positive ratio as a function of false positive ratio when the threshold to capture VDAs from the ranking varies. AUPR is the area under the Precision-Recall (PR) curve. The PR curve is the plot of true positive ratios among all predicted positive VDAs for each given recall value. AUPR provides a quantitative measurement of how well, on average, inferred association probabilities of positive VDAs are separated from the probabilities of negative VDAs. Higher recall, specificity, precision, F1 score, AUC and AUPR illustrate better performance. AUC and AUPR are two more important evaluation criterions compared to other four metrics.

Performance Evaluation Under Three Five-Fold Cross Validations

VDA-KLMF is compared with NGRHMDA (Huang et al., 2017), LRLSHMDA (Wang et al., 2017), NRLMF (Liu et al., 2016), VDA-KATZ (Zhou et al., 2020), and VDA-RWR (Peng et al., 2021). The former three methods are representative association prediction approaches. NGRHMDA fused collaborative filtering and graph-based scoring. LRLSHMDA utilized a Laplacian regularized least square classifier. NRLMF used a neighborhood regularized Logistic matrix factorization model. The remaining two methods are state-of-the-art VDA prediction models. The two methods used the KATZ measurement and random walk with restart to prioritize anti-SARS-CoV-2 drugs, respectively. The experiments are repeated for 20 times and the final performance is averaged for 20 times. The results are shown in Tables 35. The best performance obtained from the six VDA prediction methods in each dataset is denoted in bold in each column.

TABLE 3
www.frontiersin.org

Table 3. The performance of six VDA prediction methods on three datasets under CV1.

Table 3 lists the performance of six VDA identification models under CV1. It can be observed that VDA-KLMF computes the best recall, AUC, and AUPR, significantly outperforming NGRHMDA, LRLSHMDA, NRLMF, VDA-KATZ, and VDA-RWR on datasets 2 and 3. On dataset 1, VDA-KLMF calculates slightly lower recall, specificity, AUC, and AUPR than NGRHMDA and VDA-RWR. However, on dataset 2 and 3, VDA-KLMF obtains much better performance than the two approaches. It may be resulted in by small sample feature of dataset 1. The results demonstrate that abundant data can boost the prediction performance of VDA inference algorithms.

More importantly, the performance achieved by six VDA prediction models under CV1 is relatively lower than those of CV2 and CV3. The reason may be that there is a completely unknown virus in the three datasets, SARS-CoV-2, which does not show any associated drugs and thus decreases the prediction ability of these algorithms. Under the situation that few of any unlabeled drug for a new virus exists, VDA-KLMF can calculate the best AUCs of 0.8149 and 0.8224 and the best AUPRs of 0.3487 and 0.3431 on datasets 2 and 3, respectively. The result suggests that VDA-KLMF can be effectively applied to prioritize potential small molecules for a new virus, especially SARS-CoV-2.

Table 4 gives the performance of six VDA identification algorithms on the three VDA datasets under CV2. VDA-KLMF computes the best recall, F1 score, AUC and AUPR on all three datasets, much better than other five VDA techniques. For example, AUCs computed by VDA-KLMF are better 13.18, 13.98, 12.38, 12.28, and 4.65% than NGRHMDA, LRLSHMDA, NRLMF, VDA-KATZ, and VDA-RWR on dataset 1, respectively. It is better 7.23, 14.06, 8.92, 18.54, and 7.15% on dataset 2 and 24.13, 17.17, 13.38, 23.45, and 10.17% on dataset 3. AUPRs achieved from VDA-KLMF outperform 72.54, 48.01, 24.18, 40.32, and 63.52% compared to NGRHMDA, LRLSHMDA, NRLMF, VDA-KATZ, and VDA-RWR on dataset 1, respectively. Its performance outperforms 46.07, 41.00, 22.58, 40.88, and 47.14 on dataset 2 and 49.03, 46.31, 22.65, 42.90, and 42.52% on dataset 3. The comparative results demonstrate the superior prediction ability of VDA-KLMF for identifying possible viruses associated with a new drug.

TABLE 4
www.frontiersin.org

Table 4. The performance of six VDA prediction methods on three datasets under CV2.

Table 5 shows recall, specificity, precision, F1 score, AUC, AUPR computed by six VDA prediction models on the three datasets under CV3. It can be seen that VDA-KLMF still obtains the best performance in terms of recall and AUC on the three datasets. Under CV3, NRLMF computes the best precision and F1 score on all datasets and is the second-best method. In particular, compared to NRLMF, recall obtained by VDA-KLMF is better 24.42, 26.90, and 27.41% on datasets 1–3, respectively. AUCs calculated by VDA-KLMF are better 6.14, 4.22, and 2.89%, respectively. AUPRs achieved from VDA-KLMF are better 11.20, and 4.31% on datasets 1–2, respectively. The results suggest that VDA-KLMF can effectively improve VDA prediction performance based on known VDAs.

TABLE 5
www.frontiersin.org

Table 5. The performance of six VDA prediction methods on three datasets under CV3.

Under CV1, NGRHMDA calculates AUCs of 0.7026, 0.4301, and 0.4058 on three datasets, respectively. Under CV2, it computes AUCs of 0.8329, 0.8017, and 0.6772, respectively. Under CV3, it calculates AUCs of 0.6459, 0.3011, and 0.2554, respectively. Under CV1 and CV3, NGRHMDA computes AUCs smaller than 0.5 on datasets 2 and 3. In contrast, if we re-draw the ROC curve, it will obtain AUCs larger than 0.5 on the two datasets under CV1 and CV3. However, its computed AUCs will be smaller than 0.5 under CV2. Similarly, LRLSHMDA and VDA-KATZ compute AUCs smaller than 0.5 on three datasets under CV1, and ones larger than 0.5 under CV2 and CV3. In contrast, if we re-graph the ROC curve, the two methods will compute AUCs larger than 0.5 under CV1 and ones smaller than 0.5 under CV2 and CV3. It may be caused by their poor generalization ability.

In addition, VDA-KLMF computes the slightly smaller specificity. However, specificity indicates the ratio of correctly predicted negative VDAs to all known negative VDAs. For anti-COVID-19 drug screening, it is possible anti-COVID-19 drugs that we need to capture. Therefore, it is more significant to find correctly predicted positive VDAs than correctly predicted negative VDAs. That is, sensitivity (recall) and precision are much more important than specificity. More importantly, under majority of situations, VDA-KLMF computes better AUCs and AUPRs, demonstrating relatively strong VDA prediction performance of VDA-KLMF. Figures 2, 3 depict the AUC and AUPR values calculated by six VDA prediction algorithms on three datasets under three different CVs, respectively.

FIGURE 2
www.frontiersin.org

Figure 2. The AUC values predicted by six VDA prediction methods (D denotes dataset, Dl denotes dataset 1, D2 denotes dataset 2, D3 denotes dataset 3).

FIGURE 3
www.frontiersin.org

Figure 3. The AUPR values predicted by six VDA prediction methods (D denotes dataset, DI denotes dataset 1, D2 denotes dataset 2, D3 denotes dataset 3).

Effect of Gaussian Kernel on Virus-Drug Association Prediction Performance

In the VDA-KLMF model, logistic matrix factorization model with kernel diffusion integrates Gaussian kernel and biological similarity kernel including sequence similarity of viruses and chemical structure similarity of drugs. Gaussian kernel fully utilizes the nearest neighbor information of viruses and drugs. We investigated VDA prediction performance of logistic matrix factorization model considering kernel diffusion with Gaussian kernel and biological similarity kernel (VDA-KLMF) and only considering biological similarity (VDA-LMFB). The results are shown in Figure 4. From Figure 4, we can find that kernel diffusion contributes to improving VDA identification ability.

FIGURE 4
www.frontiersin.org

Figure 4. Effect of Gaussian kernel on virus-drug association prediction performance.

Effect of Different r-Values on the Performance of Virus-Drug Associations-Logistic Matrix Factorization Model

In the VDA-KLMF model, viruses and drugs are randomly mapped into two latent vector spaces AϵRm×r and BϵRn×r with the dimension of r. To evaluate the effect of different r-values on the prediction performance, we compared the performance of VDA-KLMF under different settings. Table 6 illustrates the comparison results of VDA-KLMF on three datasets under CV3. On dataset 1, we set r in the range of [2, 30] with the interval of 1. The results show that VDA-KLMF obtains the best prediction ability when r is set to 6. On datasets 2 and 3, we set r in the range of [5, 100] with the interval of 5. The results suggest that VDA-KLMF computes the best performance when r is set to 45 and 35, respectively. Therefore, the dimension r is set to 6, 45, and 35 on the three datasets, respectively.

TABLE 6
www.frontiersin.org

Table 6. The effect of different r on VDA-KLMF on three datasets under CV3.

Case Study

We wanted to identify potential chemical agents for preventing COVID-19 after confirming the powerful prediction ability of VDA-KLMF. We prioritized the top 10 compounds associated with SARS-CoV-2 on the three datasets. The results are shown in Tables 79, respectively. Among the top 10 small molecules with the highest association rankings with SARS-CoV-2, the majority of anti-SARS-CoV-2 drugs have been validated by current literatures. The results in Tables 79 show that there are seven available anti-SARS-CoV-2 compounds coming together on any two datasets, that is, remdesivir, ribavirin, nitazoxanide, favipiravir, emetine, chloroquine, and mycophenolic acid.

TABLE 7
www.frontiersin.org

Table 7. The predicted top 10 drugs associated with SARS-CoV-2 on dataset 1.

TABLE 8
www.frontiersin.org

Table 8. The predicted top 10 drugs associated with SARS-CoV-2 on dataset 2.

TABLE 9
www.frontiersin.org

Table 9. The predicted top 10 drugs associated with SARS-CoV-2 on dataset 3.

Remdesivir is an adenosine triphosphate analogue. It has broad-spectrum antiviral activity and thus can be applied to the treatment of various diseases resulted in by the Arenaviridae, Flaviviridae, Filoviridae, Paramyxoviridae, Pneumoviridae, and Coronaviridae viral families (Malin et al., 2020). Remdesivir’s action against the Coronaviridae family makes it as a potential therapeutic strategy for COVID-19 (Gordon et al., 2020). On 19 November 2020, the drug in combination with baricitinib has been authorized to the treatment of COVID-19 (Eastman et al., 2020; FDA, 2021).

Ribavirin is a synthetic guanosine nucleoside (American et al., 2017). The small molecule can generate broad activity against a few RNA and DNA viruses by inhibiting the synthesis of viral mRNAs. It is widely applied to the treatment of hepatitis C and viral hemorrhagic fevers and might be effective in the early steps of viral hemorrhagic fevers (Myers et al., 2015; Wishart et al., 2018).

Nitazoxanide is a broad anti-infective compound. The drug can markedly modulate the survival, growth, and proliferation of various intracellular and extracellular protozoa, helminths, viruses, anaerobic and microaerophilic bacteria (Shakya et al., 2017). It can inhibit the replication of a few RNA and DNA viruses and has been investigated as a broad antiviral compound (Wishart et al., 2018).

Molecular Docking

We conducted molecular dockings for the predicted antiviral drugs and the junction of the S protein-ACE2 interface. The binding energies between the predicted top 10 antiviral drugs on three datasets and the junction are shown in Table 10. From Table 10, we can observe that the identified top small molecules show higher binding energies with the junction, where nitazoxanide, mycophenolic acid, and zanamivir have the highest binding abilities. In addition, the key residues between the predicted seven compounds coming together on any two datasets and the junction are K68 and Q493 for remdesivir, R403, Q493, K353, and G496 for ribavirin, Q493 and S494 for nitazoxanide, K353 and G496 for favipiravir, T500 for emetine, H34 for chloroquine, and H34, K353, F390, and G496 for mycophenolic acid, respectively. The results suggest that K353 and G496 are possible key residues between anti-SARS-CoV-2 drugs and the junction of the S protein-ACE2 interface.

TABLE 10
www.frontiersin.org

Table 10. Binding energy between the predicted antiviral drugs and the junction of the S protein-ACE2 interface.

Molecular dockings between the predicted four possible antiviral drugs against COVID-19 (remdesivir, ribavirin, nitazoxanide, and emetine) and the junction are illustrated in Figure 5, where two docking graphs [(a) remdesivir and (b) ribavirin] were provided by Peng et al. (2020). The subfigure in each circle denotes the residues at the junction and their corresponding orientations. Green denotes the structure of ACE2 and cyan denotes the SARS-CoV-2 S protein.

FIGURE 5
www.frontiersin.org

Figure 5. Molecular dockings between the predicted four possible antiviral drugs against COV1D-19 (remdesivir, ribavinn, nitazoxanide, and emetine) and the junction of the S protein-ACE2 interface. (A) remdesivir (Peng et al., 2021; Wang J. et al., 2021; Shen et al., 2022), (B) ribavirin (Peng et al., 2021; Wang J. et al., 2021; Shen et al., 2022), (C) nitazoxanide, and (D) emetine.

Discussion

Since the outbreak of COVID-19, we conducted several works for initially screening possible drugs applied to this highly contagious disease based on virus sequences, drug chemical structures, and observed VDAs from existing data resources. These works include VDA-RLSBN (Peng et al., 2020), VDA-RWR (Peng et al., 2021), and the proposed VDA-KLMF methods. VDA-RLSBN and VDA-RWR first utilized complete genomic sequences of viruses and chemical structures of drugs. Second, they developed computational models to detect underlying associations between SARS-CoV-2 and small molecules. Finally, they conducted molecular dockings between the predicted anti-COVID-19 drugs and two target proteins including the S protein and ACE2 to measure their binding ability. The two methods effectively captured possible antiviral drugs against COVID-19.

In particular, VDA-KLMF integrates drug chemical structures, virus sequences, known VDAs, Gaussian kernel, similarity kernel, and logistic matrix factorization with kernel diffusion. It is compared with two state-of-the-art VDA prediction models and three classical association inference methods. The experimental results illustrate that the proposed VDA-KLMF method obtains powerful prediction performance.

SARS-CoV-2 is a new virus, that is, an orphan node in a VDA network. It has no association with available drugs. To capture underlying FDA-approved drugs against SARS-CoV-2, VDA-KLMF computes sequence similarity between the virus and other viruses and obtains a similarity matrix with the elements in the range of (0,1). Based on sequence similarity kernel and Gaussian kernel, VDA-KLMF can predict association information for SARS-CoV-2 combining matrix factorization model with kernel diffusion. The results show that four small molecules, remdesivir, ribavirin, nitazoxanide, and emetine, have higher binding energies with the junction of the S protein-ACE2 interface.

VDA-KLMF computes superior prediction performance. It has the following three characteristics. First, it effectively integrates various biological information including global and local similarities of viruses and drugs. Second, logistic matrix factorization model with kernel diffusion more accurately quantifies the interplays between viruses and drugs. Finally, two key residues (K353 and G496) are found and need further medical validation.

Compared to VDA-KLMF, VDA-RLSBN remains the following four limitations: (i) Its prediction ability was only validated on one dataset comprised of 96 VDAs between 12 viruses and 78 drugs, which may possibly result in the overfitting problem. (ii) It was only evaluated under CV3 and failed to measure the performance under CVs on viruses and drugs, thereby failures to investigate its generalization ability. (iii) It found 10 potential small molecules against COVID-19 from 78 FDA-approved drugs on the constructed small dataset. Drugs that may be applied to screen the clues of treatment for patients with the infection of COVID-19 are relatively few ones. (iv) It implemented molecular dockings between the identified small molecules and the target proteins including the S protein and ACE2, respectively. In comparison, our proposed VDA-KLMF method use three datasets and is evaluated under CVs on viruses, drugs and VDAs. In this context, VDA-KLMF obtains better performance, thereby demonstrating its powerful generalization ability. Moreover, VDA-KLMF screens possible anti-COVID-19 drugs coming together in any two datasets and the inferred results may be more reliable than those from unique dataset. Finally, VDA-KLMF conducts molecular dockings between the screened drugs and the junction of the S protein-ACE2 interface, which can more reasonably measure their binding abilities.

Similar to VDA-KLMF, VDA-RWR was also measured under three CVs on three datasets. AUC and AUPR are two more important evaluation metrics compared to recall, precision, specificity, and F1 score. VDA-KLMF significantly outperforms VDA-RWR under the above situations. The results illustrate that VDA-KLMF can more precisely screen potential drugs against COVID-19, while further accurately prioritizing possible small molecules during the initial drug screening is vital to the treatment of COVID-19. More importantly, VDA-KLMF captures two candidate drugs (nitazoxanide and emetine) except remdesivir and ribavirin and provides more choices to initially screen available compounds against COVID-19.

To better uncover potential therapeutic clues for COVID-19 and similar diseases produced by evolving SARS-CoV-2, in the future, first, we will build a bigger and SARS-CoV-2-related database comprised of drugs, disease, and targets. Second, abundant biological data related to single strand RNA viruses should be integrated to more accurately depict biological features of viruses and drugs. Finally, a more robust model, for example, deep learning model, should be built to boost VDA identification performance. We anticipate that this work can contribute to the initial drug screening for therapy of patients with the infection of COVID-19.

Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

Author Contributions

XT, LP, and LZ: conceptualization. XT and LP: methodology and writing—review and editing. XT and LH: software. XT, LS, PG, and GL: validation. LP and LZ: investigation, supervision, project administration, and funding acquisition. XT, LS, and GL: data curation. LP: writing—original draft preparation. LS: visualization. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant Nos. 62072172, 61803151, and 62172158), the Natural Science Foundation of Hunan province (Grant No. 2021JJ30219), the Scientific Research Project of Hunan Provincial Department of Education (Grant No. 20C0636), and the Scientific Research and Innovation Foundation of Hunan University of Technology (Grant No. CX2031).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

We really appreciate reviewers for the valuable comments. We would like to thank all authors of the cited references.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2022.740382/full#supplementary-material

References

American Association for the Study of Liver Diseases, and Infectious Diseases Society of America (2017). HCV Guidance. Available online at: http://hcvguidelines.org. (accessed June 12, 2017)

Google Scholar

Baker, P., White, A., and Morgan, R. (2020). Men’s health: COVID-19 pandemic highlights need for overdue policy action. Lancet 395, 1886–1888. doi: 10.1016/S0140-6736(20)31303-9

CrossRef Full Text | Google Scholar

Chu, Y. Y., Shan, X. Q., Chen, T. H., Jiang, M. M., Wang, Y. J., Wang, Q. K., et al. (2020). DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method. Br. Bioinform. 22:bbaa205. doi: 10.1093/bib/bbaa205

PubMed Abstract | CrossRef Full Text | Google Scholar

Dotolo, S., Marabotti, A., Facchiano, A., and Roberto, T. (2020). A review on drug repurposing applicable to COVID-19. Br. Bioinform. 22, 726–741. doi: 10.1093/bib/bbaa288

PubMed Abstract | CrossRef Full Text | Google Scholar

Duchi, J., Hazan, E., and Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121–2159.

Google Scholar

Eastman, R. T., Roth, J. S., Brimacombe, K. R., Simeonov, A., Shen, M., Patnaik, S., et al. (2020). Remdesivir: a review of its discovery and development leading to emergency use authorization for treatment of COVID-19. ACS Central Sci. 6, 672–683. doi: 10.1021/acscentsci.0c00489

PubMed Abstract | CrossRef Full Text | Google Scholar

Elfiky, A. A. (2020). Anti-HCV, nucleotide inhibitors, repurposing against COVID-19. Life Sci. 248:117477. doi: 10.1016/j.lfs.2020.117477

PubMed Abstract | CrossRef Full Text | Google Scholar

Elmezayen, A. D., Al-Obaidi, A., Şahin, A. T., and Yelekci, K. (2020). Drug repurposing for coronavirus (COVID-19): in silico screening of known drugs against coronavirus 3CL hydrolase and protease enzymes. J. Biomol. Struct. Dyna. 39, 2980–2992. doi: 10.1080/07391102.2020.1758791

PubMed Abstract | CrossRef Full Text | Google Scholar

FDA (2021). Available online at: https://www.fda.gov/media/143822/download. (accessed July 28, 2021)

Google Scholar

Fiscon, G., Conte, F., Farina, L., and Paci, P. (2021). SAveRUNNER: a network-based algorithm for drug repurposing and its application to COVID-19. PLoS Comput. Biol. 17:e1008686. doi: 10.1371/journal.pcbi.1008686

PubMed Abstract | CrossRef Full Text | Google Scholar

Gordon, C. J., Tchesnokov, E. P., Woolner, E., Perry, J. K., Feng, J. Y., Porter, D. P., et al. (2020). Remdesivir is a direct-acting antiviral that inhibits RNA-dependent RNA polymerase from severe acute respiratory syndrome coronavirus 2 with high potency. J. Biol. Chem. 295, 6785–6797. doi: 10.1074/jbc.RA120.013679

PubMed Abstract | CrossRef Full Text | Google Scholar

Hao, M., Wang, Y., and Bryant, S. H. (2016). Improved prediction of drug-target interactions using regularized least squares integrating with kernel fusion technique. Anal. Chim. Acta 909, 41–50. doi: 10.1016/j.aca.2016.01.014

PubMed Abstract | CrossRef Full Text | Google Scholar

Hopman, J., Allegranzi, B., and Mehtar, S. (2020). Managing COVID-19 in low-and middle-income countries. JAMA 323, 1549–1550. doi: 10.1001/jama.2020.4169

PubMed Abstract | CrossRef Full Text | Google Scholar

Huang, Y. A., You, Z. H., Chen, X., Huang, Z. A., Zhang, S. W., and Yan, G. Y. (2017). Prediction of microbe-disease association from the integration of neighbor and graph with collaborative recommendation model. J. Trans. Med. 15:209. doi: 10.1186/s12967-017-1304-7

PubMed Abstract | CrossRef Full Text | Google Scholar

Islam, R., Parves, M. R., Paul, A. S., Uddin, N., Rahman, M. S., Mamun, A. A., et al. (2021). A molecular modeling approach to identify effective antiviral phytochemicals against the main protease of SARS-CoV-2. J. Biomol. Struct. Dyna. 39, 3213–3224. doi: 10.1080/07391102.2020.1761883

PubMed Abstract | CrossRef Full Text | Google Scholar

Kandeel, M., Abdelrahman, A. H. M., Oh-Hashi, K., Ibrahim, A., Venugopala, K. N., Morsy, M. A., et al. (2020). Repurposing of FDA-approved antivirals, antibiotics, anthelmintics, antioxidants, and cell protectives against SARS-CoV-2 papain-like protease. J. Biomol. Struct. Dyna. 2020:1784291. doi: 10.1080/07391102.2020.1784291

PubMed Abstract | CrossRef Full Text | Google Scholar

Katoh, K., Rozewicki, J., and Yamada, K. D. (2019). MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Br. Bioinform. 20, 1160–1166. doi: 10.1093/bib/bbx108

PubMed Abstract | CrossRef Full Text | Google Scholar

Khan, A., Khan, M., Saleem, S., Junaid, M., Ali, A., Ali, S. S., et al. (2020a). Structural insights into the mechanism of RNA recognition by the N-terminal RNA-binding domain of the SARS-CoV-2 nucleocapsid phosphoprotein. Comput. Struct. Biotechnol. J. 18, 2174–2184. doi: 10.1016/j.csbj.2020.08.006

PubMed Abstract | CrossRef Full Text | Google Scholar

Khan, A., Khan, M. T., Saleem, S., Babar, Z., Ali, A., Khan, A. A., et al. (2020b). Phylogenetic analysis and structural perspectives of RNA-dependent RNA-polymerase inhibition from SARs-CoV-2 with natural products. Interdisciplinary Sci. Comput. Life Sci. 12, 335–348. doi: 10.1007/s12539-020-00381-9

PubMed Abstract | CrossRef Full Text | Google Scholar

Khan, A., Wei, D. Q., and Kousar, K. (2021). Preliminary structural data revealed that the SARS-CoV-2 B. 1.617 Variant’s RBD binds to ACE2 receptor stronger than the wild type to enhance the infectivity. ChemBioChem 22:2641. doi: 10.1002/cbic.202100191

PubMed Abstract | CrossRef Full Text | Google Scholar

Khan, M. T., Irfan, M., Ahsan, H., Ahmed, A., Kaushik, A. C., Khan, A. S., et al. (2021). Structures of SARS-CoV-2 RNA-binding proteins and therapeutic targets. Intervirology 64, 55–68. doi: 10.1159/000513686

PubMed Abstract | CrossRef Full Text | Google Scholar

Khan, S., Hussain, Z., Safdar, M., Khan, A., and Wei, D. Q. (2021). Targeting the N-terminal domain of the RNA-binding protein of the SARS-CoV-2 with high affinity natural compounds to abrogate the protein-RNA interaction: a molecular dynamics study. J. Biomol. Struct. Dyna. 2021:1882337. doi: 10.1080/07391102.2021.1882337

PubMed Abstract | CrossRef Full Text | Google Scholar

Landrum, G. (2021). RDKit: Open-Source Cheminformatics Software. Available online at: https://www.rdkit.org. (accessed March 01, 2021)

Google Scholar

Li, T. B., Huang, T., Guo, C., Wang, A. L., Shi, X. L., Mo, X. F., et al. (2021). Genomic variation, origin tracing, and vaccine development of SARS-CoV-2: A systematic review. Innovation 2:100116. doi: 10.1016/j.xinn.2021.100116

PubMed Abstract | CrossRef Full Text | Google Scholar

Liu, H., Ren, G., and Chen, H. (2020). Predicting lncRNA–miRNA interactions based on logistic matrix factorization with neighborhood regularized. Know. Based Syst. 191:105261.

Google Scholar

Liu, X. M., Yang, J. S., Zhang, Y., Fang, Y., Wang, F. Y., Wang, J., et al. (2016). A systematic study on drug-response associated genes using baseline gene expressions of the cancer cell line encyclopedia. Sci. Rep. 6:22811. doi: 10.1038/srep22811

PubMed Abstract | CrossRef Full Text | Google Scholar

Liu, Y., Wu, M., Miao, C. Y., Zhao, P. L., and Li, X. L. (2016). Neighborhood regularized logistic matrix factorization for drug-target interaction prediction. PLoS Comput. Biol. 12:e1004760. doi: 10.1371/journal.pcbi.1004760

PubMed Abstract | CrossRef Full Text | Google Scholar

Mahdian, S., Ebrahim-Habibi, A., and Zarrabi, M. (2020). Drug repurposing using computational methods to identify therapeutic options for COVID-19. J. Diab. Metab. Dis. 19, 691–699. doi: 10.1007/s40200-020-00546-9

PubMed Abstract | CrossRef Full Text | Google Scholar

Malin, J. J., Suarez, I., Priesner, V., Fatkenheuer, G., and Rybniker, J. (2020). Remdesivir against COVID-19 and other viral diseases. Clin. Microbiol. Rev. 34:e00162-20. doi: 10.1128/CMR.00162-20

PubMed Abstract | CrossRef Full Text | Google Scholar

Masoudi-Sobhanzadeh, Y. (2020). Computational-based drug repurposing methods in COVID-19. BioImpacts 10:205. doi: 10.34172/bi.2020.25

PubMed Abstract | CrossRef Full Text | Google Scholar

McConkey, B. J., Sobolev, V., and Edelman, M. (2002). The performance of current methods in ligand-protein docking. Curr. Sci. 83, 845–856.

Google Scholar

Meng, Y. J., Jin, M., Tang, X., and Xu, J. L. (2021). Drug repositioning based on similarity constrained probabilistic matrix factorization: COVID-19 as a case study. Appl. Soft Comput. 103:107135. doi: 10.1016/j.asoc.2021.107135

PubMed Abstract | CrossRef Full Text | Google Scholar

Messina, F., Giombini, E., Agrati, C., Vairo, F., Bartoli, T. A., Moghazi, S. A., et al. (2020). COVID-19: viral-host interactome analyzed by network based-approach model to study pathogenesis of SARS-CoV-2 infection. J. Transl. Med. 18:233. doi: 10.1186/s12967-020-02405-w

PubMed Abstract | CrossRef Full Text | Google Scholar

Morris, G. M., Huey, R., Lindstrom, W., Sanner, M. F., Belew, R. K., Goodsell, D. S., et al. (2009). AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. 30, 2785–2791. doi: 10.1002/jcc.21256

PubMed Abstract | CrossRef Full Text | Google Scholar

Muralidharan, N., Sakthivel, R., Velmurugan, D., and Gromiha, M. M. (2020). Computational studies of drug repurposing and synergism of lopinavir, oseltamivir and ritonavir binding with SARS-CoV-2 Protease against COVID-19. J. Biomol. Struct. Dyna. 39, 2673–2678. doi: 10.1080/07391102.2020.1752802

PubMed Abstract | CrossRef Full Text | Google Scholar

Myers, R. P., Shah, H., Burak, K. W., Cooper, C., and Feld, J. J. (2015). An update on the management of chronic hepatitis C: 2015 consensus guidelines from the canadian association for the study of the liver. Can. J. Gastroent. Hepatol 29, 19–34. doi: 10.3969/j.issn.1001-5256.2015.04.003

CrossRef Full Text | Google Scholar

Peng, L. H., Shen, L., Xu, J. L., Tian, X. F., Liu, F. X., Wang, J. J., et al. (2021). Prioritizing antiviral drugs against SARS-CoV-2 by integrating viral complete genome sequences and drug chemical structures. Sci. Rep. 11:6248. doi: 10.1038/s41598-021-83737-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Peng, L. H., Tian, X. F., Shen, L., Kuang, M., Li, T. B., Tian, G., et al. (2020). Identifying effective antiviral drugs against SARS-CoV-2 by drug repositioning through virus-drug association prediction. Front. Genet. 11:577387. doi: 10.3389/fgene.2020.577387

PubMed Abstract | CrossRef Full Text | Google Scholar

Rose, P. W., Prliæ, A., Altunkaya, A., Bi, C. X., Bradley, A. R., Christie, C. H., et al. (2016). The RCSB protein data bank: integrative view of protein, gene and 3D structural information. Nucleic Acids Res. 45, D271–D281. doi: 10.1093/nar/gkw1000

PubMed Abstract | CrossRef Full Text | Google Scholar

Sadegh, S., Matschinske, J., Blumenthal, D. B., Galindez, G., Kacprowski, T., List, M., et al. (2020). Exploring the SARS-CoV-2 virus-host-drug interactome for drug repurposing. Nat. Commun. 11:3518. doi: 10.1038/s41467-020-17189-2

PubMed Abstract | CrossRef Full Text | Google Scholar

Saxena, A. (2020). Drug targets for COVID-19 therapeutics: ongoing global efforts. J. Biosci. 45, 1–24. doi: 10.1007/s12038-020-00067-w

PubMed Abstract | CrossRef Full Text | Google Scholar

Sayers, E. W., Cavanaugh, M., Clark, K., Pruitt, K. D., Schoch, C. L., Sherry, S. T., et al. (2021). GenBank. Nucleic Acids Res. 49, D92–D96. doi: 10.1093/nar/gkaa1023

PubMed Abstract | CrossRef Full Text | Google Scholar

Schrodinger, L. L. C. (2010). The PyMOL molecular graphics system. Version 1:5.

Google Scholar

Shakya, A., Bhat, H. R., and Ghosh, S. K. (2017). Update on nitazoxanide: a multifunctional chemotherapeutic agent. Curr. Drug Discov. Technol. 15, 201–213. doi: 10.2174/1570163814666170727130003

PubMed Abstract | CrossRef Full Text | Google Scholar

Shen, L., Liu, F., Huang, L., Liu, G., Zhou, L., and Peng, L. (2022). VDA-RWLRLS: an anti-SARS-CoV-2 drug prioritizing framework combining an unbalanced bi-random walk and Laplacian regularized least squares. Comput. Biol. Med. 140:105119. doi: 10.1016/j.compbiomed.2021.105119

PubMed Abstract | CrossRef Full Text | Google Scholar

Tang, X. F., Cai, L. J., Meng, Y. J., Xu, J. L., Lu, C. C., and Yang, J. L. (2020). Indicator regularized non-negative matrix factorization method-based drug repurposing for COVID-19. Front. Immunol. 11:603615. doi: 10.3389/fimmu.2020.603615

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, C., Wang, S., and Li, D. (2020). Human intestinal defensin 5 inhibits SARS-CoV-2 invasion by cloaking ACE2. Gastroenterology 159:1145. doi: 10.1053/j.gastro.2020.05.015

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, C., Wang, S., Li, D., Chen, P., Han, S., Hao, G., et al. (2021). Human cathelicidin inhibits SARS-CoV-2 infection: killing two birds with one stone. ACS Infect. Dis. 7, 1545–1554. doi: 10.1021/acsinfecdis.1c00096

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, F., Huang, Z. A., Chen, X., Zhu, Z. X., Wen, Z. K., Zhao, J. Y., et al. (2017). LRLSHMDA: laplacian regularized least squares for human microbe-disease association prediction. Sci. Rep. 7:7601. doi: 10.1038/s41598-017-08127-2

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, J., Wang, C., Shen, L., Zhou, L., and Peng, L. (2021). Screening potential drugs for COVID-19 based on bound nuclear norm regularization. Front. Genet. 12:749256. doi: 10.3389/fgene.2021.749256

PubMed Abstract | CrossRef Full Text | Google Scholar

WHO (2021). Available online at: https://covid19.who.int (accessed December 5, 2021).

Google Scholar

Wishart, D. S., Feunang, Y. D., Guo, A. C., Lo, E. J., Marcu, A., Grant, J. R., et al. (2018). DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 46, D1074–D1082. doi: 10.1093/nar/gkx1037

PubMed Abstract | CrossRef Full Text | Google Scholar

Wu, C. R., Liu, Y., Yang, Y. Y., Zhang, P., Zhong, W., Wang, Y. L., et al. (2020). Analysis of therapeutic targets for SARS-CoV-2 and discovery of potential drugs by computational methods. Acta Pharm. Sin. 10, 766–788. doi: 10.1016/j.apsb.2020.02.008

PubMed Abstract | CrossRef Full Text | Google Scholar

Yang, J. L., Huang, T., Song, W. M., Petralia, F., Mobbs, C. V., Zhang, B., et al. (2016). Discover the network underlying the connections between aging and age-related diseases. Sci. Rep. 6:32566. doi: 10.1038/srep32566

PubMed Abstract | CrossRef Full Text | Google Scholar

Yang, J. L., Peng, S. N., Zhang, B., Houten, S., Schadt, E., Zhu, J., et al. (2020). Human geroprotector discovery by targeting the converging subnetworks of aging and age-related diseases. Geroscience 42, 353–372. doi: 10.1007/s11357-019-00106-x

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, L., Yang, P., Feng, H., Zhao, Q., and Liu, H. (2021). Using network distance analysis to predict lncRNA-miRNA interactions. Interdiscip. Sci. 13, 535–545. doi: 10.1007/s12539-021-00458-z

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, Y. F., Wang, X. G., Kaushik, A. C., Chu, Y. Y., Shan, X. Q., Zhao, M. Z., et al. (2020). SPVec: a Word2vec-inspired feature representation method for drug-target interaction prediction. Front. Chem. 7:895. doi: 10.3389/fchem.2019

CrossRef Full Text | Google Scholar

Zhou, L. Q., Wang, J. J., Liu, G. Y., Lu, Q. Q., Dong, R. Y., Tian, G., et al. (2020). Probing antiviral drugs against SARS-CoV-2 through virus-drug association prediction based on the KATZ method. Genomic 112, 4427–4434. doi: 10.1016/j.ygeno.2020.07.044

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: anti-SARS-CoV-2 drug, virus-drug association, logistic matrix factorization, kernel diffusion, molecular docking

Citation: Tian X, Shen L, Gao P, Huang L, Liu G, Zhou L and Peng L (2022) Discovery of Potential Therapeutic Drugs for COVID-19 Through Logistic Matrix Factorization With Kernel Diffusion. Front. Microbiol. 13:740382. doi: 10.3389/fmicb.2022.740382

Received: 13 July 2021; Accepted: 01 February 2022;
Published: 28 February 2022.

Edited by:

Anastasios Chanalaris, Kromek, United Kingdom

Reviewed by:

Bence Bolgár, Budapest University of Technology and Economics, Hungary
Wen Zhang, Huazhong Agricultural University, China
XianFang Tang, Hunan University, China
Dongqing Wei, Shanghai Jiao Tong University, China
Bing Wang, Anhui University of Technology, China

Copyright © 2022 Tian, Shen, Gao, Huang, Liu, Zhou and Peng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Liqian Zhou, zhoulq11@163.com; Lihong Peng, plhhnu@163.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.