Introduction

The outbreak of the COVID-19 pandemic caused by SARS-CoV-2 at the beginning of 2020 has shocked the population worldwide. A year and a half later (August 2021), about 200 million confirmed cases of COVID-19 have been reported by WHO included more than 4.2 million deaths (https://covid19.who.int/). As expected, in such a mankind threatening situation, the scientific community put in place a great effort to help countering the spread of the virus, as evidenced among the other things by the huge number of papers dealing with various aspects of the disease appeared in the literature. For instance, the LitCovid literature hub1 has collected around 160,000 articles as of August 2021 covering arguments categorized as general, mechanism, transmission, diagnosis, treatment, prevention, case report and forecasting.

As regards the COVID-19 treatment, the race to the vaccine against SARS-CoV-2 started immediately after the isolation of the viral genome2 and gave the first results as soon as December 2020. Moreover, despite the exploration of different approaches like, e.g., the infusion of plasma from human survivors3, the pharmacological option, namely small molecule drugs and antibodies, is being actively pursued. However, the route to a new drug is long and costly, and the classical drug discovery pipeline is not compatible with the need of rapid intervention on a population of millions of patients. At the moment, a viable alternative seems to be the repurposing of known drugs4, i.e., the use for the treatment of COVID-19 of drugs currently on the market for different therapeutic purposes.

Known drugs that are currently in clinical or pre-clinical study for the treatment of COVID-19 are aimed either at inhibiting viral or human targets involved in some of the processes of viral entry and replication, or at treating inflammation and tissue injury consequent to the viral infection5,6. Even though it might seem that a direct antiviral approach could lead to a straightforward solution, only few of the existing antivirals have performed well in the clinic so far. On the other hand, a number of drugs used for the most disparate therapeutic indications and entered into clinical trials even with an uncertain rationale7 are showing preliminary promising results. However, as it has been observed8, a real “repurposing tsunami” has invested the biomedical community, so much so that today it is difficult not only to keep track of the results of the trials, but also to follow the new proposals.

With the aim of helping researchers navigate the sea of outcomes and reports coming from the studies on COVID-19, some institutions and companies have developed online platforms that collect and organize both literature and data, eventually providing free access to the latter. For example, the already mentioned LitCovid hub1 (https://www.ncbi.nlm.nih.gov/research/coronavirus/) is a daily updated source of relevant articles retrieved from PubMed. Other platforms dealing with data on drugs and chemicals, like, e.g., CHEMBL9 (https://www.ebi.ac.uk/chembl/), PubChem10 (https://pubchem.ncbi.nlm.nih.gov/), or DrugBank11 (https://www.drugbank.ca/), have introduced special sections dedicated to COVID-19-related information. In addition, more specialized resources have appeared on the web to help accessing and analyzing COVID-19 data, mainly in the fields of epidemiology, genomics, interactomics, and, to a lesser extent, pharmacology. In this class of web tools, it is worth mentioning CORDITE (CORona Drug InTEractions database)12, a web interface that provides a database of potential drugs, targets, interactions, and relative publications obtained from a manually curated selection of literature sources. With the same purpose of facilitating the data analysis, the COVID-19 Drug and Gene Set Library was built as an online collection of COVID-19 related drugs and genes13. A comprehensive critical review on this kind of web tools has recently been published by Mercatelli et al.14.

Considering the great amount of valuable scientific information that has already been produced and published, and that will be presumably produced for some time more on COVID-19 related topics, it could be useful to look at the whole scenario of results, to foster the acquisition of that knowledge that can only emerge from consideration of both the totality and the complexity of data. In other words, and limiting ourselves to the pharmacological treatment issue, one might think of presenting and analyzing the information on proposed drugs in a way that takes into account not only the different types of data (chemical, biological, genomic, etc.), but also the relationships among them, that is on a network basis. The context is that of network medicine15. An attempt in this direction has recently been proposed by Korn et al.16, who developed a knowledgebase and an online platform (COVID-KOP) to integrate the existing biomedical information with the newly acquired knowledge on COVID-19. By means of this web tool, one can easily produce an aggregate graph connecting, e.g., COVID-19 phenotypic features to a drug studied for treating the disease, through the genes known to be linked to both. Still in the context of network medicine, CoVex is another platform that offers the user the possibility to explore the SARS-CoV-2 virus–host–drug interactome for drug repurposing aims17. In addition, we want to mention CovMulNet1918 that at present looks like the most thorough network-based tool allowing to integrate the available genotypic and phenotypic information on COVID-19, like, SARS-CoV-2 proteins, their human partners, as well as symptoms, diseases, and drugs. Finally, Coronavirus canSAR19 is a freely available resource that offers druggable interactomes of SARS-CoV-2 proteins and human proteins, as well as reports about 3D structures, drugs, and clinical trials.

In a specifically drug-focused context, the network medicine approach assumes the overcoming of the old “one drug, one target, one disease” concept in favor of a more outright “multi-drug, multi-target, multi-disease” approach20. The exploration of a such complex system of interactions can be aided by the construction of a drug-target network21. In reference to the COVID-19 case, drug-target networks based on host–virus protein–protein interactions (PPIs) have already been built and examined22,23,24 with the aim of repurposing already approved drugs.

Here, we present the COVID-19 Drugs Networker (COVIDrugNet: http://compmedchem.unibo.it/covidrugnet), a web application that offers a different point of view on anti-COVID-19 drugs by allowing a network-based analysis of the DrugBank dataset of potential repurposed drugs currently in clinical trial. The freely accessible application automatically retrieves the data from DrugBank, builds the drug-target network, and allows the user to carry out some basic network analysis. Moreover, we show how, using COVIDrugNet, some peculiar aspects of the proposed pharmacological options against COVID-19, in terms of substances, targets, and their interrelationships can be revealed. Although what is reported here is an instant analysis based on current data, the continuous updating of COVIDrugNet will allow us to follow the future development of the drugs proposed for the treatment of the disease, thus providing an always updated view of the COVID-19 system pharmacology.

Results and discussion

COVID-19 Drugs Networker

The COVID-19 Drugs Networker (COVIDrugNet, Fig. 1) is a web tool designed for the exploration of the landscape of the drugs currently in clinical trials to combat the SARS-CoV-2 infection. The web app is based on a network approach that supports both visualization and analysis of the complex scenario of repurposed drugs for the COVID-19 and related conditions. The core of the web tool are the interactive graphs and the additional features that allow one to explore drug and target data, as well as networks properties. The main graph represents a bipartite Drug-Target network (DT network, Fig. 2a), where the nodes are drugs and targets that are connected if a relation between them is reported in DrugBank. Since bipartite networks are usually investigated by compressing their information into two monopartite networks called projections25, COVIDrugNet provides two of such networks only having drugs or targets as nodes: in the following, we refer to them as Drug and Target projections (DP and TP, respectively; Fig. 2b, c).

Figure 1
figure 1

The COVIDrugNet web tool. A screenshot of the main block of the Drug-Target Network page. It displays the fundamental features accessible in the web tool that allow the user to inspect the network and its properties.

Figure 2
figure 2

COVIDrugNet Networks. The three networks generated and available for inspection in COVIDrugNet. (a) Drug-Target Bipartite Network. It is the main network, and it is built connecting drugs currently in clinical trial present in the COVID-19 Dashboard of DrugBank11 and their reported targets. The red nodes are drugs, and the light blue ones are targets. (b) Drug Projection. It is built from the Drug-Target network and contains only drugs. The nodes are color coded on the basis of their first level ATC codes (retrieved from DrugBank11). (c) Target Projection. It is built from the Drug-Target network and contains only targets. The nodes are color coded according to their protein class (retrieved from ChEMBL9). The networks were generated by means of the Python package NetworkX26.

As regards the user interface, it is basically divided into the main and the Advanced Tools blocks. The first one allows users to immediately access the main body of information, capturing the holistic view of the current drug repurposing status for COVID-19. However, a more in-depth examination of the data is possible, by taking advantage of some more specialized graph analysis tools provided in Advanced Tools.

In detail, the main block includes the graph, and the Charts and Plots and Graph Properties sections (Fig. 1). As mentioned before, the heart of each web page is the interactive graph with its related information box (Node Info) that provides a summary documentation of single drug/target nodes hovered over or individually selected. The box contains links to some databases providing the available information related to individual properties of both drugs and targets. In addition, a multiple node selection brings up the Inspected Data hidden section that displays detailed information of the selected nodes in a tabular format. By the way, networks and tables can be downloaded in different formats to allow an external analysis of the data.

Node coloring options are provided, useful to visualize some node attributes related to therapeutic, biological, or network-based features. For instance, in the DP graph (Fig. 2b) the user can decide to color the nodes according to the Anatomical Therapeutic Chemical (ATC) code or the clinical trial phase, while in the TP graph (Fig. 2c) the color coding allows one to spot protein family, protein class or cellular location. Moreover, in the DT network and in both projections, it is possible to color the nodes based on some network attributes—i.e., degree, centrality measures or node grouping—considering the entire graph or the major component. To examine all these properties at a glance, the web tool also provides the Charts and Plots section, in which the pie charts—or bar chart in the case of the ATC code coloring option—are updated accordingly to the node coloring option to show the relative proportions between the values of that property. In this area of the projection web pages, the web tool also provides the plot of the nodes degree distribution. Among the graph interactive features, the Highlight a node dropdown menu is useful to find nodes by name, and the button HIGHLIGHT BY PROPERTY allows a customized filtering on node properties to highlight and/or download a specific nodes selection. In the Graph Properties section, some centrality measures useful to analyze the network topology are displayed in a downloadable table. A short explanation of each computed property is provided in a Glossary in the Help page.

Regarding the Advanced Tools block, it contains three sections: Clustering, Advanced Degree Distribution and Current Virus–Host–Drug Interactome. The Clustering section is dedicated to the node grouping analysis carried out through different methods (see Nodes Grouping section in “Methods” section). In particular, we thought it could be of interest to examine the grouping of the nodes in the projection graphs, as, e.g., in perspective it might reveal possible trends in the selection of drugs to be repurposed or privileged areas of intervention in the biology of the infected cells. To this aim, the web app allows for three different techniques of investigation of the networks partitioning: spectral analysis combined with K-means clustering27, Girvan–Newman28 and greedy modularity community detection29 methods. The plot in this Clustering section reports either the eigenvalues distribution used in the application of the spectral clustering method, or the modularity trend in the Girvan–Newman community detection method. Both plots are interactive and allow the user to choose the level (number) of grouping.

The Advanced Degree Distribution section presents an interactive chart of the degree distribution and some of its possible distribution fittings compared to those of an Erdős–Rényi equivalent graph (see Degree Distribution Fitting section in “Methods” section).

Finally, the Current Virus–Host–Drug Interactome section displays a bipartite network built on the basis of experimental studies and checked for protein targets present in the DT network (see below for details). As mentioned before, the network table is downloadable, to provide interested users with the possibility of rebuilding and manipulating the graph.

Graphs analysis

In Fig. 2, the graphs representing the networks generated by COVIDrugNet are displayed. The DT network is a disconnected network with a large connected component accounting for 85.1% of nodes (1248 out of 1466). This structure reminds that of the general drug-target network reported elsewhere21, where most drugs have more than one target and several drugs can share the same target(s). However, from inspection of the graph, it immediately appears that there are two drug nodes that heavily affect the network topology by showing an exceedingly high degree compared to all other nodes: Fostamatinib and Artenimol, having 305 and 186 direct neighbors, respectively. For both drugs, this reflects a number of reported targets that is considerably higher than the average (< 7), being 6.9 and 4.2 times higher, respectively, than that of Cannabidiol that, with 44 targets, is the third in rank for the highest number of neighbors in the DT network. Indeed, these two drugs show a peculiar behavior strongly affecting the network structure not only in the DT, but consequently also in the TP graph where they cause the formation of two highly intra-connected clumps of nodes. To take this aspect under consideration and possibly clarify its role in respect to the topology of both the whole drug-target network and the projections, in the following, we compared the results of the network analyses carried out on the entire networks and on the graphs containing all nodes except Artenimol, Fostamatinib and their exclusive direct neighbors.

As a first step in the analysis, we tried to assess the character of the monopartite projection networks DP (290 nodes) and TP (1176 nodes), i.e., whether they belong to the random network category or are scale-free. Scale-free networks have a characteristic organization, in which there is a limited number of nodes with a high number of neighbors (called hubs) and an abundance of nodes having a low degree30. This arrangement can be found in plenty of real-world networks, from the World Wide Web to citations in science, from social interactions to metabolic maps30,31. Both DP and TP show a significant difference from an equivalent (same number of nodes and probability of edge creation) Erdős–Rényi graph32 (Figure S1). To further investigate on the scale-freeness of the networks, we considered three properties for each graph: the degree distribution, the relationship between clustering coefficient and degree, and the ability to withstand targeted attacks compared to random failures.

In order to address the scale-free character of both networks by evaluating the fitness of the degree distribution to a power-law, we employed the approach reported by Broido et al.33, which applied a previously defined rigorous method34. This analysis was carried out on both the entire DP and TP networks and also in cases where Artenimol and Fostamatinib as well as their exclusive direct neighbors were removed.

In the DP network, the degree distribution could be described by a power-law, suggesting that these networks are plausibly scale-free (Figure S2a, b). However, other heavy-tailed distributions cannot be ruled out. The situation for the TP network is less clear-cut (Figure S2c, d), at least in the case of the entire network. To advance an explanation for these results, we observe that, these networks are small, such that they would probably not provide enough data for clearly electing a distribution form. Still, they are unequivocally dissimilar to random networks.

The inspection of both the clustering coefficient and the robustness evaluation is best illustrated considering the two projections one at a time.

Looking at the DP network and specifically at its clustering coefficient, it shows a tendency to decrease as the degree increases (Figure S3a, b), implicating the existence of a few hubs connecting peripheral nodes of high degree. Also, there is an evident distinction between the response to a targeted attack and to a random failure35 (Figure S4a). In the first case, nodes with the highest degree are progressively removed from the network, causing it to break apart quickly. On the other hand, if the nodes to be dismissed are chosen randomly, the connectedness of the network is almost unaffected. Notably, these findings are strengthened by the fact that carrying out the same investigation on a network from which Artenimol and Fostamatinib are excluded, leads to almost identical results (Figure S4b).

The same examination carried out on the TP network does not yield equally unambiguous conclusions. As stated above, the targets linked to Artenimol and Fostamatinib compose two almost-clique aggregations, which distort the morphology of the network. The relationship between clustering coefficient and degree is strongly dependent on the presence of these two exceptionally connected drugs (Figure S3c, d). When they are not taken into account, the inverse proportionality is fairly visible. Nevertheless, if they are considered, the scatterplot displaying this relationship is warped, due to the formation of two separate but remarkably dense groups representing the targets connected to Artenimol and Fostamatinib. The check of the robustness of the network by comparing the responses to targeted attacks or random failures gives a result that agrees with that obtained from the DP network. The communities related to the two “super-spreaders” simply introduce a delay in the fragmentation of the network, since they are made of a multitude of nodes with equally high degree (Figure S4c, d). Anyhow, this shift does not alter the network robustness to random failures and the susceptibility to targeted attacks.

As a final remark on the networks organization, we stress that all results and conclusions presented here are just a snapshot of the continuously evolving COVID-19 drug repurposing scene, and that it will be worthwhile to follow the time progression of this system. For instance, in the future, the growth of the network could smooth out or even hide the effects of Artenimol and Fostamatinib that now we observe so evidently. In respect of this, we recognize a different response of the DP and TP networks to the influence of these nodes. The former is less affected, since the vast majority of the targets related to both drugs are not shared by others, such that the information related to these proteins vanishes in the projection process. On the contrary, the latter suffers a huge impact, showing a situation that is antithetical to the previous one. Here, the proteins amass together constituting two highly intra-connected jumbles, which are poorly linked to the rest of the network. A continuous growth and the ability of self-organizing are two key features of scale-free networks, which frequently describe real complex systems30. These characteristics are shown by both projections, and indeed their scale-freeness is supported by their degree distribution, the relation of clustering coefficient to degree, and their robustness. Mainly due to the influence of Artenimol and Fostamatinib, these properties are manifest in the DP network, but not so neat in the TP one.

Applications to COVID-19 repurposed drugs: network-based inferences

To illustrate the capabilities of COVIDrugNet, in the following we report some example considerations that can be derived from the analysis of the DT network, and of the projection graphs relating to drugs (DP) and targets (TP).

Drugs

Examining the DP network with nodes colored by ATC code (https://www.whocc.no/atc/structure_and_principles/) (Fig. 2b) can reveal at a glance which therapeutic areas are mostly covered by the repurposed drugs presently in clinical trials. In the Charts and Plots section of the COVIDrugNet Drug Projection page, the nodes categories distribution is shown, from which it appears that all the 14 main anatomical/pharmacological groups (1st level codes) are represented, even though with different numbers of drugs. Not taking into consideration the 50 substances for which an ATC code is not yet reported, the remaining 240 drugs are distributed in three top ranked groups: C (Cardiovascular system), A (Alimentary tract and metabolism), and J (Antiinfectives for systemic use) comprising 49, 39, and 38 active substances, respectively (Fig. 3). Then, two other highly populated ATC groups follow: B (Blood and blood forming organs), and L (Antineoplastic and immunomodulating agents) both counting 31 drugs. By considering the composition of the bars that reports the distribution of drugs in the 3rd level groups for each 1st level ATC code (visible in the web tool), one can have a more detailed picture of the actual pharmacological approaches to COVID-19 treatment. First, it is worth noting that the drugs belonging to the J group are located mostly out of the main connected component of the graph, accordingly to the fact that they share a target with a very small number of other drugs. Conversely, substances of the A and C groups mostly populate the main connected component, indicating a high level of promiscuity among them as regards the targets. Also, we observe that most drugs classified in the C, A, J, B, L, N, and P groups show just one ATC code, while drugs in D, G, R, S, M, and H belong to more than one ATC group.

Figure 3
figure 3

First Level ATC Code Distribution. A bar chart displaying the count of nodes for every first level ATC code (anatomical/pharmacological main group). The total count is higher than the number of nodes in the DP, because more than one ATC code can be assigned to a single drug.

Even though the ATC system is not aimed at providing direct therapeutic indications and considering also that more than one code can be assigned to individual medicines, the landscape of pharmacological interventions against the SARS-CoV-2 infection emerging from the DP network appears rather intricate. Overall, it mostly confirms that the drugs in clinical trials are aimed at contrasting both the viral infection process (antivirals in J group, agents acting on the renin-angiotensin system in C group), and its pathological consequences at systemic level (substances in A, B, L, and other groups). These approaches are in line with evidence recognizing that, as the severity of the COVID-19 increases—apparently in consequence of a dysregulated host immune response—various pathophysiological mechanisms are activated leading to hematological (mainly thromboembolic) manifestations and, eventually, multi-organ dysfunctions36,37. In addition, bacterial superinfections have been reported in COVID-19 patients, and even though the issue is still debated38, antibiotics belonging to the J group are actually in the current treatment guidelines39.

Indeed, even this brief analysis of the ATC codes distribution among the substances currently in clinical trials highlights a complex and multifaceted drug repurposing scenario consequent to the fact that the COVID-19 is a multi-systemic disease requiring a well-equipped therapeutic armamentarium and possibly a combined poly-pharmacological intervention40.

To provide an example of using COVIDrugNet focused on a group of drugs, we could take into consideration the inhibition of the virus attachment and entry into the host cell. It is believed that SARS-CoV-2 enters the target cell mainly through an endocytic pathway that exploits the ability of its Spike (S) protein to bind the human Angiotensin-converting enzyme 2 (ACE2) receptor. Subsequently, S is cleaved by the Transmembrane protease serine 2 (TMPRSS2) to provide the S2 subunit necessary for the membrane fusion41,42. The drug repurposing activity aimed at preventing this step of the viral infection points to blocking the protein targets ACE2 and TMPRSS2, or to raising the endosomal pH in order to prevent the S processing6. To retrieve information on drugs in clinical trial for this purpose, we can highlight the target nodes Angiotensin-converting enzyme 2 and Transmembrane protease serine 2 in the DT network of COVIDrugNet and check the “Inspected targets” table below for drugs reported to bind those targets. Here, we can find among others Chloroquine (CQ), Hydroxychloroquine (HCQ), and Bromhexine that are reported as ACE2 binders, and Camostat and Bromhexine reported as TMPRSS2 inhibitors. Notably, it is known that CQ and HCQ are also able to raise the endo-lysosomal pH thus inhibiting the protease activities and preventing the cleavage of S protein43. In addition, recent evidence suggests the combined use of Camostat and CQ (together with another drug, arbidol, an inhibitor of the virus–host cell membrane fusion with no known targets) to contrast the entry routes of SARS-CoV-244. Finally, in the DP graph, one can select all the mentioned drugs and check the status of the clinical trials in which they are involved in the “Node Info” box on the right.

Targets

The TP network of Fig. 2c is a targetome that shows the relationships among the known targets of the proposed COVID-19 drugs. Here, two nodes (proteins) are linked if they are reported as targets of at least one of the drugs in the DrugBank COVID-19 database, and in this sense it is different from a typical interactome based on PPIs. The network is made by 1176 nodes and 70,873 edges and shows a main connected component comprising 1037 nodes (88.2%). Human targets are 1008 (909 in the main connected component).

Looking at this graph provides another point of view on the pharmacological approaches taken to contrast the COVID-19. The network of the targets involved in the action of the drugs in clinical trials helps one to obtain a comprehensive view of the biological processes affected by the action of drugs. Actually, from the analysis of the target proteins and their interactions it could be possible to trace the cellular pathways influenced by drugs. A study in this regard is currently underway. Instead, starting from the TP network, we carried out a different analysis that took into consideration both the data here presented on repurposed drugs now in clinical trials (a top-down view), and the molecular data on SARS-CoV-2 infection obtained from recent experimental studies and exploited to propose drugs to be repurposed (a bottom-up view). As regards the latter, we refer to the human-virus interactomes developed by Gordon et al.23 and more recently by Chen et al.45. These interactomes are PPI networks that show which human proteins are bound directly by SARS-CoV-2 proteins to allow the virus to enter into the human cells, replicate, assemble and be released. Both research groups followed an experimental approach to identify the human proteins, using affinity purification (AP), and AP together with proximity labeling-based techniques, respectively, coupled with mass spectrometry. Merging the Gordon and Chen results, we obtained an extended list of 732 human proteins experimentally identified as interactors of the 29 viral proteins. Comparing this list with that of the human drug targets of the TP network (1008), we found that only 45 out of the 732 human proteins able to bind the viral ones are present in the TP as reported targets of drugs in clinical trials. In Fig. 4, we show the integrated host–virus interactome (also available in the Advanced Tools block of COVIDrugNet), where the 45 proteins common to both lists are highlighted (yellow circles). We also checked the DT network of Fig. 2a for drugs associated with these 45 targets and found 29 substances acting on them (Table 1) shown in the interactome of Fig. 4 (green squares) linked to their targets. This is an example of how the information provided by the COVIDrugNet DT interactome can complement the one contained in human-virus PPI networks like those of Gordon and Chen. Note that the 29 substances hit direct neighbors of the viral proteins, thus interfering with the related viral processes. We see from Fig. 4 that Artenimol and Fostamatinib, seemingly by virtue of their high target promiscuity, are able to hit simultaneously several targets, thus affecting various viral processes and allowing to foresee a better therapeutic efficacy. If confirmed by clinical results, these would be clear cases of poly-pharmacological multi-target actions exerted by single substances, a nice fit into the paradigm of network pharmacology.

Figure 4
figure 4

Virus–Host–Drug Interactome. The Virus–Host–Drug Interactome built on the basis of the merged datasets from Gordon et al.23 and Chen et al.45. Proteins (circles) are displayed in red if viral and in blue if human. The human proteins present in the TP network are shown as yellow circles, and the corresponding drugs currently in clinical trials against COVID-19 as green squares. The human proteins binding more than one viral target are highlighted as blue circles with pink contour. The network visualization was generated through Cytoscape 3.8.246.

Table 1 Protein-drug associations for common targets between the virus–host interactome and the drug-target network.

Another interesting aspect emerging from inspection of the interactome of Fig. 4 is that 20 human proteins (blue circles with pink contour) bind to two viral targets, thus acting as bridges between two node communities and playing a key role in the formation of the large connected component of the graph (Table 2). From a drug discovery perspective, such proteins would be ideal targets to fight the virus, as neutralizing them would help to disrupt the network of PPIs necessary to carry on the viral infection and replication processes. Unfortunately, none of these proteins appear in the TP network, implying that there is no substance targeting them among those listed in the DrugBank database of repurposed drugs presently in clinical trial. However, we browsed some databases (DrugBank11, DrugCentral47 and ChEMBL9) in the search for bioactive substances reported to bind these 20 proteins and found that 4 of them are reported as targets of known drugs (Table 3). As can be seen from Table 3, many of the drugs listed therein have not yet been considered for therapy, while some of them (bold in the table) are already in clinical trial for COVID-19 treatment even though their action on the proteins in the interactome is not reported in DrugBank. The former ones could be further possible candidates for COVID-19 drug repurposing in light of their ability to interfere with more than one process critical for the virus.

Table 2 Human proteins that interact with more than one viral protein in the virus–host interactome.
Table 3 Known drugs targeting human proteins that interact with more than one viral protein in the virus–host interactome.

Limitations

Our study is not exempt from some drawbacks that are common in data analysis, and regard mainly the data availability and quality. We based COVIDrugNet on the DrugBank Dashboard dedicated to COVID-19 pandemic, and although this public and free resource is known for the high reliability of the datasets, missing data or delayed updating can occur. This is evident for some drugs under clinical trial shown in Table 3 that have known targets yet not reported in their DrugBank file. Moreover, not all the drugs or proteins investigated here are completely characterized and classified, and this adds some uncertainty and noise to our results. Also, some bias could be incorporated in the knowledge we started from. For instance, the number of targets associated to a specific drug could considerably depend on the amount of research carried out on that medicine rather than on the actual biological interactions it has. This issue could be partially mitigated by a more extensive integration of data from a wider variety of databases. Very similar considerations can be drawn on the other databases that we exploited to retrieve auxiliary data: STRING48, DisGeNet49, SWISS-MODEL50, RCSB-PDB51, UniProt52 and ChEMBL9. Furthermore, as mentioned in the “Methods” section, the drug-target network was built considering only protein targets, hence nucleic acid targets were not included. However, biomolecular targets other than proteins are a minority53 and this led us to not integrate them.

Despite the massive efforts of the scientific community, SARS-CoV-2 and COVID-19 continue to be largely puzzling. Experimental assays are the solid ground on which we all start to build our hypotheses, yet also these investigations may have bias and a moderate amount of uncertainty. We have to keep this into consideration, when examining the merged interactomes by Gordon et al.23 and Chen et al.45 given their considerable difference. Additionally, the identification of a PPI in vitro unfortunately does not guarantee that the same interaction occurs also in vivo.

Conclusions

The COVID-19 pandemic poses a huge problem of public health that requires the implementation of all available approaches to contrast it, and drugs are one of them. In this context, we observed an unmet need of depicting the continuous evolving scenario of the ongoing drug clinical trials through an easy-to-use freely accessible online tool. Starting from this consideration, we developed COVIDrugNet, a web app that allows one to watch and keep up to date on how the drug research is responding with its arsenal of known repurposed drugs to the health threat represented by the SARS-CoV-2 infection. We have shown some examples of how one could explore the whole landscape of medicines currently in clinical trial and try to probe the consistency of actual treatments with the biological evidence being accumulated on the virus infection and its systemic pathological consequences in humans. The complex network of protein targets affected by the repurposed drugs can be confronted with the host–virus interactome, and this may offer new hints on drugs currently in use or to be proposed for clinical investigation. From this comparison, we have been able to single out some human proteins that contact two viral counterparts, and that might be possible new targets for anti-COVID-19 drugs. Finally, given that, as already noticed by others7, several treatments proposed for COVID-19 are still lacking a known mechanism of viral inhibition or even a pharmacological rationale, careful analyses of the drug-target data as those reported in the present work might help to understand the molecular implications of these pharmacological options, and eventually improve the search for more effective therapies.

Methods

Data acquisition

The set of drugs in clinical trial for the treatment of COVID-19 (731 on August 11, 2021) was retrieved from the dedicated web page of DrugBank (https://www.drugbank.ca/covid-19). Both experimental unapproved substances, and drugs in clinical trials were considered, and duplicates were removed (more than one trial is going on for some drugs). The set was also filtered for both the number of heavy atoms (to exclude inorganic compounds), and the availability of data (a drug was not added if it was not present in the PubChem database). This cleaning step reduced the number of drugs considered to 397. From the same site and from PubChem, we gathered some features related to structure, as well as pharmacological classification, pharmacodynamics, and pharmacokinetics of each drug (Table 4). Drugs for which no targets were reported in DrugBank were discarded (290 remaining). As regards the drug targets, they were retrieved from DrugBank, and in this case we collected some information on classification, biology, and pharmacology of each protein. A detailed description of the features and the data sources are reported in Table 5.

Table 4 Drugs features.
Table 5 Targets features.

Networks construction

We chose to inspect the data in the form of a graph. All networks presented in the web app and in the paper were built by means of the NetworkX software26.

Network analysis

Node attributes

Some suitable node attributes (Degree, Closeness Centrality, Betweenness Centrality, Eigenvector Centrality, Clustering Coefficient, VoteRank) were calculated through NetworkX. The only property we tweaked was the result of the VoteRank because its algorithm draws up a ranking of nodes based on an iterative voting system54 without assigning a specific value to each one of them. Thus, we translated this ranking into a score for each node on the basis of its position and the total number of nodes with the following method:

figure a

Nodes grouping

Dividing a network into groups, clusters, or communities could be useful to unveil non-trivial patterns of interaction. It is accomplished by splitting the network into subgroups that have the fewest possible number of connections between them55. In this work, we took advantage of (and provide access to in the web tool) three of the most common algorithms for this purpose: spectral clustering27, Girvan–Newman community detection28, and greedy modularity community detection29.

The first one makes use of the spectrum of the graph Laplacian to convey the information about the graph partition27. The division is then carried out on this data by a k-means clustering algorithm (see the Supplementary Information for more detail). In the second case, communities are recognized employing the Girvan–Newman method28. It is a hierarchical method based on the progressive removal of the edges with the highest betweenness centrality from the graph, causing it to break into sets of smaller constituents. The partition with the best modularity is shown, but the user can manually choose an arbitrary number of communities in the web tool. The greedy modularity community detection method29 pursues the graph division through a bottom-up approach (opposite to the previous one), by exploiting a “greedy” algorithm that progressively associates the nodes into groups that maximize the modularity. It starts with all nodes separated into single communities and recursively merges the couple of them that brings to the highest modularity increasing, until the point that joining two communities would lead to a modularity reduction.

These tasks were accomplished through in-house Python scripts, mainly making use of the packages NetworkX and Scikit-learn56.

Degree distribution fitting

A network is commonly considered to be scale-free if the degree distribution of its nodes follows a power-law30, which has the form:

$$p\left( x \right) \propto x^{ - \alpha }$$

where the scaling exponent α is higher than 1 (usually between 2 and 3) and the degree value x is equal or greater than xmin (which is always higher than 1). To the best of our knowledge, the most severe scale-freeness test is presented by Broido et al.33 that take advantage of a rigorous mathematical procedure34 to assess the validity of a power-law distribution to describe the investigated degrees. Here, we followed their approach probing the fitting of a power-law to the degree distributions of both projected networks DP and TP (with and without the Artenimol and Fostamatinib nodes). As a first step, the parameters of the best fitting power-law are determined (xmin with a standard Kolmogorov–Smirnov minimization approach, and then α with a discrete maximum likelihood estimation) employing the Python package Powerlaw57. Then, the fitting is evaluated considering the p-value of the Kolmogorov–Smirnov distance (computed with a semi-parametric bootstrap), and of the xmin and α (bootstrap). If p ≥ 0.1, the degree distribution is considered plausibly scale-free. Lastly, the chosen power-law distribution is compared to four non-scale-free alternatives (using loglikelihood ratio tests), to evaluate if it is favored over the others. Such alternatives are the exponentially truncated power-law, the exponential, the stretched exponential (Weibull) and the lognormal. This entire procedure was carried out using an in-house Python script, with a large employment of the Python package Powerlaw. A more thorough explanation and method validation are provided in the Supplementary Information.

Robustness

Scale-free networks (contrary to random Erdős–Rényi graphs) have an exceptional tolerance against random failures, but at the same time they are very vulnerable to targeted attacks35. We investigated the robustness of these networks evaluating their diameter (as a measure of interconnectivity) throughout a process of node removal. We took into account both targeted attacks and random failures and compared the results. In the first case, at every iteration the node with the highest degree was chosen and removed. In the other case, a node was selected randomly and eliminated. In this latter condition, the average of multiple 100 runs was considered in order to avoid misinterpretations induced by a single random choice. This procedure was carried out through an in-house Python script.

COVIDrugNet implementation and deployment

COVIDrugNet is mainly composed by the collector and the web tool itself. Both are written in Python, but the purpose of the former is to collect the data from web databases, build the graphs, compute some properties, and store everything in pickle format. The latter, instead, retrieves the data from the created database and sets up the front-end part of COVIDrugNet with Python Dash58. The web tool deployment was carried out with Apache59 through the mod_wsgi interface in an Ubuntu server.