Research article Special Issues

A deep bidirectional recurrent neural network for identification of SARS-CoV-2 from viral genome sequences


  • Received: 13 August 2021 Accepted: 22 September 2021 Published: 15 October 2021
  • In this work, deep Bidirectional Recurrent Neural Network (BRNN) models based on Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) cells were implemented to distinguish the genome sequences of SARS-CoV-2 from those of other coronavirus strains, such as SARS-CoV and MERS-CoV, as well as common cold and other Acute Respiratory Infection (ARI) viruses. The hyper-parameters, including the optimizer type and the number of unit cells, were also investigated to attain the best performance of the BRNN models. Results showed that the GRU BRNN model discriminated between SARS-CoV-2 and the other classes of viruses with an overall classification accuracy of 96.8%, compared with 95.8% for the LSTM BRNN model. For both models, the best performance was obtained with the SGD optimizer and 80 unit cells. This study shows that the proposed GRU BRNN model has a better classification ability for SARS-CoV-2, thus providing an efficient tool to help contain the disease and support better clinical decisions with high precision.
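
    To make the idea of a bidirectional GRU classifier concrete, the following is a minimal pure-Python sketch of the forward pass only. It is not the authors' implementation: the one-hot nucleotide encoding, the five output classes, the omission of bias terms, the random untrained weights, and all names (`one_hot`, `GRUCell`, `BiGRUClassifier`) are assumptions made for illustration. Only the 80 unit cells per direction is taken from the abstract, and training with SGD is not shown.

```python
import math
import random

NUCLEOTIDES = "ACGT"

def one_hot(seq):
    """Assumed encoding: one 4-element vector per base; unknown bases become all zeros."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq.upper()]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

class GRUCell:
    """A GRU cell without bias terms (simplification for this sketch)."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to the concatenation [x, h].
        self.Wz = rand_matrix(n_hidden, n_in + n_hidden, rng)
        self.Wr = rand_matrix(n_hidden, n_in + n_hidden, rng)
        self.Wh = rand_matrix(n_hidden, n_in + n_hidden, rng)

    def step(self, x, h):
        z = [sigmoid(a) for a in matvec(self.Wz, x + h)]  # update gate
        r = [sigmoid(a) for a in matvec(self.Wr, x + h)]  # reset gate
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.Wh, x + rh)]  # candidate state
        # One common convention: z weights the candidate, (1 - z) the old state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, xs):
        h = [0.0] * self.n_hidden
        for x in xs:
            h = self.step(x, h)
        return h  # final hidden state

class BiGRUClassifier:
    """Reads the sequence in both directions and classifies from both final states."""

    def __init__(self, n_hidden=80, n_classes=5, seed=0):
        rng = random.Random(seed)
        self.fwd = GRUCell(4, n_hidden, rng)
        self.bwd = GRUCell(4, n_hidden, rng)
        self.W_out = rand_matrix(n_classes, 2 * n_hidden, rng)

    def predict_proba(self, seq):
        xs = one_hot(seq)
        # Concatenate the forward-pass and backward-pass final states.
        h = self.fwd.run(xs) + self.bwd.run(xs[::-1])
        return softmax(matvec(self.W_out, h))

# Toy fragment, not a real genome; weights are untrained, so the output is arbitrary.
probs = BiGRUClassifier().predict_proba("ATGGCGTACGTTAGC")
```

    The backward cell sees the same sequence reversed, so the concatenated state summarises context from both ends of the genome fragment. Swapping `GRUCell` for an LSTM cell would give the LSTM variant that the abstract compares against.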

    Citation: Mohanad A. Deif, Ahmed A. A. Solyman, Mehrdad Ahmadi Kamarposhti, Shahab S. Band, Rania E. Hammam. A deep bidirectional recurrent neural network for identification of SARS-CoV-2 from viral genome sequences[J]. Mathematical Biosciences and Engineering, 2021, 18(6): 8933-8950. doi: 10.3934/mbe.2021440



  • © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)