Identifying COVID-19 Severity-Related SARS-CoV-2 Mutation Using a Machine Learning Method
Abstract
1. Introduction
2. Materials and Methods
2.1. Data and Preprocessing
2.2. Boruta Feature Filtering
2.3. Feature Ranking Algorithms
2.3.1. Max-Relevance and Min-Redundancy
2.3.2. Monte Carlo Feature Selection
2.3.3. Light Gradient Boosting Machine
2.3.4. Least Absolute Shrinkage and Selection Operator
2.4. Incremental Feature Selection
2.5. Synthetic Minority Oversampling Technique
2.6. Classification Algorithm
2.6.1. Decision Tree
2.6.2. k-Nearest Neighbor
2.6.3. Random Forest
2.6.4. Support Vector Machine
2.7. Performance Evaluation
3. Results
3.1. Results of Boruta and Feature Ranking Algorithms
3.2. Results of IFS Method on Four Feature Lists
3.2.1. Results of IFS Method on LASSO Feature List
3.2.2. Results of IFS Method on LightGBM Feature List
3.2.3. Results of IFS Method on MCFS Feature List
3.2.4. Results of IFS Method on mRMR Feature List
3.3. Results of Intersection of Optimal Features on Different Feature Lists
3.4. Classification Rules
4. Discussion
4.1. Features Associated with COVID-19 Severity
4.2. Decision Rules Related to the Severity of COVID-19 Infections
4.3. Comparison with the Previous Study
- (1) The mutation features analyzed in our study were generated by a different platform. Because platforms differ in their advantages and disadvantages, analyzing mutation features obtained from different platforms can uncover novel mutations that are highly correlated with the clinical status of patients with COVID-19;
- (2) More machine learning algorithms were used in this study than in Nagy et al.’s study. We used five feature analysis methods (Boruta, LASSO, LightGBM, MCFS and mRMR) and four classification algorithms (DT, KNN, RF and SVM). With these algorithms, each mutation feature was thoroughly evaluated. The final mutation features were selected by multiple feature analysis methods, which increases the reliability of the results;
- (3) In this study, we not only discovered mutation features related to the clinical status of patients with COVID-19 but also established rules that capture more complicated mutation patterns associated with that status. These patterns always included multiple mutation features, suggesting relationships between combinations of mutation features and the clinical status of patients with COVID-19. Such rules can be regarded as an extension of single-mutation biomarkers and were not considered in Nagy et al.’s study;
- (4) Biological analysis was performed in our study, which increases the reliability of the results. In Nagy et al.’s study, the important mutation features were only listed and were not analyzed.
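
Points (2) and (3) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the feature names (`mut_A`, `mut_B`, ...), the helper functions `consensus_features` and `rule_matches`, and the example rule are hypothetical placeholders introduced only for illustration. The first helper keeps the features chosen by several feature analysis methods; the second treats a rule as a conjunction of conditions on the presence or absence of mutations.

```python
from collections import Counter


def consensus_features(selected_by_method, min_methods=None):
    """Return features chosen by at least `min_methods` methods, sorted by name.

    By default a feature must be chosen by every method (strict intersection).
    """
    if min_methods is None:
        min_methods = len(selected_by_method)
    counts = Counter()
    for features in selected_by_method.values():
        counts.update(set(features))  # count each feature once per method
    return sorted(f for f, c in counts.items() if c >= min_methods)


def rule_matches(rule, sample):
    """Check a conjunctive rule: every listed feature must have the required value.

    `rule` maps feature name -> required value (1 = mutation present, 0 = absent).
    Features missing from `sample` are treated as absent (0).
    """
    return all(sample.get(feature, 0) == value for feature, value in rule.items())


# Hypothetical optimal feature sets from four ranking methods (placeholder names).
selected = {
    "LASSO": ["mut_A", "mut_B", "mut_C"],
    "LightGBM": ["mut_A", "mut_C", "mut_D"],
    "MCFS": ["mut_A", "mut_B", "mut_C", "mut_E"],
    "mRMR": ["mut_C", "mut_A"],
}
strict = consensus_features(selected)                  # chosen by all four methods
relaxed = consensus_features(selected, min_methods=2)  # chosen by at least two

# Hypothetical rule combining two mutation features.
example_rule = {"mut_A": 1, "mut_C": 0}
hit = rule_matches(example_rule, {"mut_A": 1})
miss = rule_matches(example_rule, {"mut_A": 1, "mut_C": 1})
```

A strict intersection favors reliability over coverage; lowering `min_methods` trades some of that reliability for a larger candidate set.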
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- CSG International. The species severe acute respiratory syndrome-related coronavirus: Classifying 2019-nCoV and naming it SARS-CoV-2. Nat. Microbiol. 2020, 5, 536.
- Zhou, B.; Thao, T.T.N.; Hoffmann, D.; Taddeo, A.; Ebert, N.; Labroussaa, F.; Pohlmann, A.; King, J.; Steiner, S.; Kelly, J.N. SARS-CoV-2 spike D614G change enhances replication and transmission. Nature 2021, 592, 122–127.
- Hou, Y.J.; Chiba, S.; Halfmann, P.; Ehre, C.; Kuroda, M.; Dinnon, K.H.; Leist, S.R.; Schäfer, A.; Nakajima, N.; Takahashi, K. SARS-CoV-2 D614G variant exhibits efficient replication ex vivo and transmission in vivo. Science 2020, 370, 1464–1468.
- Pachetti, M.; Marini, B.; Benedetti, F.; Giudici, F.; Mauro, E.; Storici, P.; Masciovecchio, C.; Angeletti, S.; Ciccozzi, M.; Gallo, R.C. Emerging SARS-CoV-2 mutation hot spots include a novel RNA-dependent-RNA polymerase variant. J. Transl. Med. 2020, 18, 179.
- Cui, J.; Li, F.; Shi, Z.-L. Origin and evolution of pathogenic coronaviruses. Nat. Rev. Microbiol. 2019, 17, 181–192.
- Marra, M.A.; Jones, S.J.; Astell, C.R.; Holt, R.A.; Brooks-Wilson, A.; Butterfield, Y.S.; Khattra, J.; Asano, J.K.; Barber, S.A.; Chan, S.Y. The genome sequence of the SARS-associated coronavirus. Science 2003, 300, 1399–1404.
- Wan, Y.; Shang, J.; Graham, R.; Baric, R.S.; Li, F. Receptor recognition by the novel coronavirus from Wuhan: An analysis based on decade-long structural studies of SARS coronavirus. J. Virol. 2020, 94, e00127–e00200.
- Leung, K.; Shum, M.H.; Leung, G.M.; Lam, T.T.; Wu, J.T. Early transmissibility assessment of the N501Y mutant strains of SARS-CoV-2 in the United Kingdom, October to November 2020. Eurosurveillance 2021, 26, 2002106.
- Mwenda, M.; Saasa, N.; Sinyange, N.; Busby, G.; Chipimo, P.J.; Hendry, J.; Kapona, O.; Yingst, S.; Hines, J.Z.; Minchella, P.; et al. Detection of B.1.351 SARS-CoV-2 variant strain—Zambia, December 2020. MMWR Morb. Mortal. Wkly. Rep. 2021, 70, 280–282.
- Faria, N.R.; Mellan, T.A.; Whittaker, C.; Claro, I.M.; Candido, D.d.S.; Mishra, S.; Crispim, M.A.; Sales, F.C.; Hawryluk, I.; McCrone, J.T. Genomics and epidemiology of the P.1 SARS-CoV-2 lineage in Manaus, Brazil. Science 2021, 372, 815–821.
- Mlcochova, P.; Kemp, S.A.; Dhar, M.S.; Papa, G.; Meng, B.; Ferreira, I.A.; Datir, R.; Collier, D.A.; Albecka, A.; Singh, S. SARS-CoV-2 B.1.617.2 Delta variant replication and immune evasion. Nature 2021, 599, 114–119.
- Callaway, E. Heavily mutated coronavirus variant puts scientists on alert. Nature 2021, 600, 21.
- Wang, Z.; Schmidt, F.; Weisblum, Y.; Muecksch, F.; Barnes, C.O.; Finkin, S.; Schaefer-Babajew, D.; Cipolla, M.; Gaebler, C.; Lieberman, J.A. mRNA vaccine-elicited antibodies to SARS-CoV-2 and circulating variants. Nature 2021, 592, 616–622.
- Abdullahi, I.N.; Emeribe, A.U.; Ajayi, O.A.; Oderinde, B.S.; Amadu, D.O.; Osuji, A.I. Implications of SARS-CoV-2 genetic diversity and mutations on pathogenicity of COVID-19 and biomedical interventions. J. Taibah Univ. Med. Sci. 2020, 15, 258–264.
- Nagy, Á.; Ligeti, B.; Szebeni, J.; Pongor, S.; Győrffy, B. COVIDOUTCOME-estimating COVID severity based on mutation signatures in the SARS-CoV-2 genome. Database J. Biol. Databases Curation 2021, 2021, baab020.
- Tzou, P.L.; Tao, K.; Nouhin, J.; Rhee, S.Y.; Hu, B.D.; Pai, S.; Parkin, N.; Shafer, R.W. Coronavirus antiviral research database (CoV-RDB): An online database designed to facilitate comparisons between candidate anti-coronavirus compounds. Viruses 2020, 12, 1006.
- Kurtz, S.; Phillippy, A.; Delcher, A.L.; Smoot, M.; Shumway, M.; Antonescu, C.; Salzberg, S.L. Versatile and open software for comparing large genomes. Genome Biol. 2004, 5, R12.
- Brodin, P. Immune determinants of COVID-19 disease presentation and severity. Nat. Med. 2021, 27, 28–33.
- Brodin, P. Why is COVID-19 so mild in children? Acta Paediatr. 2020, 109, 1082–1083.
- Kursa, M.B.; Rudnicki, W.R. Feature selection with the Boruta package. J. Stat. Softw. 2010, 36, 1–13.
- Ding, S.; Li, H.; Zhang, Y.H.; Zhou, X.; Feng, K.; Li, Z.; Chen, L.; Huang, T.; Cai, Y.D. Identification of pan-cancer biomarkers based on the gene expression profiles of cancer cell lines. Front. Cell Dev. Biol. 2021, 9, 781285.
- Chen, L.; Zhang, Y.H.; Wang, S.; Zhang, Y.; Huang, T.; Cai, Y.D. Prediction and analysis of essential genes using the enrichments of gene ontology and KEGG pathways. PLoS ONE 2017, 12, e0184129.
- Yu, X.; Pan, X.; Zhang, S.; Zhang, Y.H.; Chen, L.; Wan, S.; Huang, T.; Cai, Y.D. Identification of gene signatures and expression patterns during epithelial-to-mesenchymal transition from single-cell expression atlas. Front. Genet. 2020, 11, 605012.
- Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1226–1238.
- Draminski, M.; Rada-Iglesias, A.; Enroth, S.; Wadelius, C.; Koronacki, J.; Komorowski, J. Monte Carlo feature selection for supervised classification. Bioinformatics 2008, 24, 110–117.
- Li, J.; Lu, L.; Zhang, Y.H.; Xu, Y.; Liu, M.; Feng, K.; Chen, L.; Kong, X.; Huang, T.; Cai, Y.D. Identification of leukemia stem cell expression signatures through Monte Carlo feature selection strategy and support vector machine. Cancer Gene Ther. 2020, 27, 56–69.
- Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A highly efficient gradient boosting decision tree. In Advances in Neural Information Processing Systems 30 (NIPS 2017). 2017. Available online: https://proceedings.neurips.cc/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html (accessed on 11 April 2022).
- Breiman, L. Better subset regression using the nonnegative garrote. Technometrics 1995, 37, 373–384.
- Tibshirani, R.J. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 1996, 73, 273–282.
- Liu, H.; Setiono, R. Incremental feature selection. Appl. Intell. 1998, 9, 217–230.
- Chen, L.; Zeng, T.; Pan, X.; Zhang, Y.H.; Huang, T.; Cai, Y.D. Identifying methylation pattern and genes associated with breast cancer subtypes. Int. J. Mol. Sci. 2019, 20, 4269.
- Zhang, Y.H.; Guo, W.; Zeng, T.; Zhang, S.; Chen, L.; Gamarra, M.; Mansour, R.F.; Escorcia-Gutierrez, J.; Huang, T.; Cai, Y.D. Identification of microbiota biomarkers with orthologous gene annotation for type 2 diabetes. Front. Microbiol. 2021, 12, 711244.
- Kohavi, R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, Volume 2; Morgan Kaufmann Publishers Inc.: Montreal, QC, Canada, 1995; pp. 1137–1143.
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
- Pan, X.; Chen, L.; Liu, I.; Niu, Z.; Huang, T.; Cai, Y.D. Identifying protein subcellular locations with embeddings-based node2loc. IEEE/ACM Trans. Comput. Biol. Bioinform. 2022, 19, 666–675.
- Safavian, S.R.; Landgrebe, D. A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 1991, 21, 660–674.
- Gorodkin, J. Comparing two k-category assignments by a k-category correlation coefficient. Comput. Biol. Chem. 2004, 28, 367–374.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
- Chen, L.; Li, Z.; Zhang, S.; Zhang, Y.-H.; Huang, T.; Cai, Y.-D. Predicting RNA 5-methylcytosine sites by using essential sequence features and distributions. BioMed Res. Int. 2022, 2022, 4035462.
- Ding, S.; Wang, D.; Zhou, X.; Chen, L.; Feng, K.; Xu, X.; Huang, T.; Li, Z.; Cai, Y. Predicting heart cell types by using transcriptome profiles and a machine learning method. Life 2022, 12, 228.
- Zhou, X.; Ding, S.; Wang, D.; Chen, L.; Feng, K.; Huang, T.; Li, Z.; Cai, Y.-D. Identification of cell markers and their expression patterns in skin based on single-cell RNA-sequencing profiles. Life 2022, 12, 550.
- Li, X.; Lu, L.; Chen, L. Identification of protein functions in mouse with a label space partition method. Math. Biosci. Eng. 2022, 19, 3820–3842.
- Yang, Y.; Chen, L. Identification of drug–disease associations by using multiple drug and disease networks. Curr. Bioinform. 2022, 17, 48–59.
- Wu, Z.; Chen, L. Similarity-based method with multiple-feature sampling for predicting drug side effects. Comput. Math. Methods Med. 2022, 2022, 9547317.
- Chen, W.; Chen, L.; Dai, Q. iMPT-FDNPL: Identification of membrane protein types with functional domains and a natural language processing approach. Comput. Math. Methods Med. 2021, 2021, 7681497.
- Baranwal, M.; Magner, A.; Elvati, P.; Saldinger, J.; Violi, A.; Hero, A.O. A deep learning architecture for metabolic pathway prediction. Bioinformatics 2019, 36, 2547–2553.
- Casanova, R.; Saldana, S.; Chew, E.Y.; Danis, R.P.; Greven, C.M.; Ambrosius, W.T. Application of random forests methods to diabetic retinopathy classification analyses. PLoS ONE 2014, 9, e98587.
- Sang, X.; Xiao, W.; Zheng, H.; Yang, Y.; Liu, T. HMMPred: Accurate prediction of DNA-binding proteins based on HMM profiles and XGBoost feature selection. Comput. Math. Methods Med. 2020, 2020, 1384749.
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
- Farkas, C.; Mella, A.; Haigh, J.J. Large-scale population analysis of SARS-CoV-2 whole genome sequences reveals host-mediated viral evolution with emergence of mutations in the viral spike protein associated with elevated mortality rates. medRxiv 2020.
- Hahn, G.; Wu, C.M.; Lee, S.; Hecker, J.; Lutz, S.M.; Haneuse, S.; Qiao, D.; DeMeo, D.; Choudhary, M.C.; Etemad, B. Two mutations in the SARS-CoV-2 spike protein and RNA polymerase complex are associated with COVID-19 mortality risk. bioRxiv 2020.
- Ozono, S.; Zhang, Y.; Ode, H.; Sano, K.; Tan, T.S.; Imai, K.; Miyoshi, K.; Kishigami, S.; Ueno, T.; Iwatani, Y. SARS-CoV-2 D614G spike mutation increases entry efficiency with enhanced ACE2-binding affinity. Nat. Commun. 2021, 12, 848.
- Korber, B.; Fischer, W.M.; Gnanakaran, S.; Yoon, H.; Theiler, J.; Abfalterer, W.; Hengartner, N.; Giorgi, E.E.; Bhattacharya, T.; Foley, B. Tracking changes in SARS-CoV-2 spike: Evidence that D614G increases infectivity of the COVID-19 virus. Cell 2020, 182, 812–827.e819.
- Nagy, Á.; Pongor, S.; Győrffy, B. Different mutations in SARS-CoV-2 associate with severe and mild outcome. Int. J. Antimicrob. Agents 2021, 57, 106272.
- Guan, W.-J.; Ni, Z.-Y.; Hu, Y.; Liang, W.-H.; Ou, C.-Q.; He, J.-X.; Liu, L.; Shan, H.; Lei, C.-L.; Hui, D.S. Clinical characteristics of 2019 novel coronavirus infection in China. medRxiv 2020.
- Davies, N.G.; Klepac, P.; Liu, Y.; Prem, K.; Jit, M.; Eggo, R.M. Age-dependent effects in the transmission and control of COVID-19 epidemics. Nat. Med. 2020, 26, 1205–1211.
- Nguyen, T.T.; Pham, T.N.; Van, T.D.; Nguyen, T.T.; Nguyen, D.T.N.; Le, H.N.M.; Eden, J.-S.; Rockett, R.J.; Nguyen, T.T.H.; Vu, B.T.N. Genetic diversity of SARS-CoV-2 and clinical, epidemiological characteristics of COVID-19 patients in Hanoi, Vietnam. PLoS ONE 2020, 15, e0242537.
- Eaaswarkhanth, M.; Al Madhoun, A.; Al-Mulla, F. Could the D614G substitution in the SARS-CoV-2 spike (S) protein be associated with higher COVID-19 mortality? Int. J. Infect. Dis. 2020, 96, 459–460.
- Patro, L.P.P.; Sathyaseelan, C.; Uttamrao, P.P.; Rathinavelan, T. Global variation in SARS-CoV-2 proteome and its implication in pre-lockdown emergence and dissemination of 5 dominant SARS-CoV-2 clades. Infect. Genet. Evol. 2021, 93, 104973.
- Chaudhari, A.; Chaudhari, M.; Mahera, S.; Saiyed, Z.; Nathani, N.M.; Shukla, S.; Patel, D.; Patel, C.; Joshi, M.; Joshi, C.G. In-silico analysis reveals lower transcription efficiency of C241T variant of SARS-CoV-2 with host replication factors MADP1 and hnRNP-1. Inform. Med. Unlocked 2021, 25, 100670.

Term | Decision Tree | k-Nearest Neighbor | Random Forest | Support Vector Machine |
---|---|---|---|---|
Number of features | 57 | 57 | 6 | 6 |
SN | 0.788 | 0.824 | 0.821 | 0.820 |
SP | 0.832 | 0.616 | 0.734 | 0.734 |
ACC | 0.811 | 0.714 | 0.775 | 0.775 |
MCC | 0.621 | 0.447 | 0.555 | 0.554 |
Precision | 0.808 | 0.658 | 0.735 | 0.735 |
F1-measure | 0.798 | 0.732 | 0.776 | 0.775 |
G-mean | 0.809 | 0.712 | 0.776 | 0.776 |

Term | Decision Tree | k-Nearest Neighbor | Random Forest | Support Vector Machine |
---|---|---|---|---|
Number of features | 24 | 52 | 24 | 24 |
SN | 0.844 | 0.878 | 0.813 | 0.816 |
SP | 0.769 | 0.605 | 0.768 | 0.760 |
ACC | 0.804 | 0.734 | 0.789 | 0.787 |
MCC | 0.612 | 0.498 | 0.580 | 0.575 |
Precision | 0.766 | 0.666 | 0.759 | 0.754 |
F1-measure | 0.803 | 0.758 | 0.785 | 0.783 |
G-mean | 0.805 | 0.729 | 0.790 | 0.788 |

Term | Decision Tree | k-Nearest Neighbor | Random Forest | Support Vector Machine |
---|---|---|---|---|
Number of features | 43 | 55 | 10 | 10 |
SN | 0.848 | 0.832 | 0.816 | 0.811 |
SP | 0.755 | 0.637 | 0.704 | 0.704 |
ACC | 0.799 | 0.730 | 0.757 | 0.755 |
MCC | 0.603 | 0.476 | 0.521 | 0.516 |
Precision | 0.757 | 0.673 | 0.712 | 0.711 |
F1-measure | 0.800 | 0.745 | 0.760 | 0.758 |
G-mean | 0.800 | 0.728 | 0.758 | 0.756 |

Term | Decision Tree | k-Nearest Neighbor | Random Forest | Support Vector Machine |
---|---|---|---|---|
Number of features | 53 | 52 | 24 | 23 |
SN | 0.777 | 0.885 | 0.666 | 0.665 |
SP | 0.846 | 0.598 | 0.915 | 0.916 |
ACC | 0.813 | 0.734 | 0.797 | 0.797 |
MCC | 0.625 | 0.501 | 0.604 | 0.604 |
Precision | 0.819 | 0.665 | 0.875 | 0.877 |
F1-measure | 0.797 | 0.759 | 0.757 | 0.756 |
G-mean | 0.810 | 0.728 | 0.781 | 0.780 |
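
The measurements reported in the tables above can be reproduced from a binary confusion matrix. The sketch below assumes the standard two-class definitions of sensitivity (SN), specificity (SP), accuracy (ACC), Matthews correlation coefficient (MCC), precision, F1-measure and G-mean; the function name `binary_metrics` and the example counts are hypothetical and are not taken from this study.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute common binary classification metrics from confusion-matrix counts.

    Assumes each class has at least one sample and at least one positive prediction.
    """
    sn = tp / (tp + fn)                    # sensitivity (recall on positives)
    sp = tn / (tn + fp)                    # specificity (recall on negatives)
    acc = (tp + tn) / (tp + fn + tn + fp)  # overall accuracy
    precision = tp / (tp + fp)
    f1 = 2 * precision * sn / (precision + sn)
    g_mean = math.sqrt(sn * sp)            # geometric mean of SN and SP
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "SN": sn,
        "SP": sp,
        "ACC": acc,
        "MCC": mcc,
        "Precision": precision,
        "F1-measure": f1,
        "G-mean": g_mean,
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = binary_metrics(tp=80, fn=20, tn=70, fp=30)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions the G-mean is the square root of SN × SP, which offers a quick consistency check on the reported values: for the decision tree column of the first table, SN = 0.788 and SP = 0.832 give approximately 0.81, in line with the listed 0.809 up to rounding.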
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Huang, F.; Chen, L.; Guo, W.; Zhou, X.; Feng, K.; Huang, T.; Cai, Y. Identifying COVID-19 Severity-Related SARS-CoV-2 Mutation Using a Machine Learning Method. Life 2022, 12, 806. https://doi.org/10.3390/life12060806