Original Article
Sequence analysis for SNP detection and phylogenetic reconstruction of SARS-CoV-2 isolated from Nigerian COVID-19 cases

https://doi.org/10.1016/j.nmni.2022.100955
Under a Creative Commons license
open access

Abstract

Background

Coronaviruses are a group of viruses belonging to the family Coronaviridae. In December 2019, a new coronavirus disease (COVID-19) characterized by severe respiratory symptoms was discovered. The causative pathogen was a novel coronavirus of the genus Betacoronavirus, known first as 2019-nCoV and later as SARS-CoV-2. Within two months of its discovery, COVID-19 became a pandemic causing widespread morbidity and mortality.

Methodology

Whole genome sequence data of SARS-CoV-2 isolated from Nigerian COVID-19 cases were downloaded from the GISAID database. A total of 18 sequences that satisfied the quality assurance criteria (length ≥29,700 nts and proportion of unknown bases, denoted as “N”, ≤5%) were used for the study. In addition, the genome sequence of SARS-CoV-2 obtained from Nigeria's COVID-19 index case (Accession ID: EPI_ISL_413550) and the reference genome (Accession NC_045512.2) were obtained from the GISAID and GenBank databases, respectively. Multiple sequence alignment (MSA) was done in MAFFT (Version 7.471), SNP calling was implemented in DnaSP (Version 6.12.03), and the results were visualized in Jalview (Version 2.11.1.0). Phylogenetic analysis was performed with MEGA X software.
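
The quality assurance criteria above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`read_fasta`, `passes_qc`, `filter_records`) are hypothetical, and it assumes the 5% limit on unknown bases is measured relative to each sequence's own length.

```python
from typing import Iterable, Iterator, List, Tuple

MIN_LENGTH = 29_700     # minimum genome length in nucleotides (from the text)
MAX_N_FRACTION = 0.05   # maximum share of unknown "N" bases (from the text)


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def passes_qc(seq: str,
              min_length: int = MIN_LENGTH,
              max_n_fraction: float = MAX_N_FRACTION) -> bool:
    """Return True if the sequence is long enough and has few enough N bases.

    Assumption: the N fraction is computed against the sequence's own length.
    """
    seq = seq.upper()
    if len(seq) < min_length:
        return False
    return seq.count("N") / len(seq) <= max_n_fraction


def filter_records(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only the (header, sequence) records that pass both criteria."""
    return [(name, seq) for name, seq in records if passes_qc(seq)]
```

With a FASTA file of downloaded genomes, `filter_records(read_fasta(open(path)))` would return the records retained for alignment, where `path` is a placeholder for the local file.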

Results

Nigerian SARS-CoV-2 genomes had 99.9% genomic similarity, with four large conserved genomic regions. A total of 66 SNPs were identified, of which 31 were informative. Nucleotide diversity assessment gave Pi = 0.00048 and an average SNP frequency of 2.22 SNPs per 1000 nts. Non-coding genomic regions, particularly the 5′UTR and 3′UTR, had SNP densities of 3.77 and 35.4, respectively. The region with the highest SNP density was ORF10, with a frequency of 8.55 SNPs/1000 nts. This value was significantly higher (P < 0.01) than that of the spike gene, the region of greatest interest in SARS-CoV-2 genomics. The majority (72.2%) of viruses in Nigeria were of the L lineage, with a preponderance of the D614G mutation, which accounted for 11 (61.1%) of the 18 viral sequences. Nigerian SARS-CoV-2 sequences formed three major clades, namely Oyo, Ekiti and Osun, on a maximum likelihood phylogenetic tree.
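
To make the quantities reported above concrete, the following Python sketch computes variable (SNP) sites, parsimony-informative sites, nucleotide diversity (Pi) and SNP density on a toy alignment. It is a simplified illustration, not a reimplementation of DnaSP: all function names are hypothetical, ambiguous bases and gaps are simply skipped in each comparison, and Pi is divided by the full alignment length. The abstract does not state the denominators the authors used, so the toy numbers are not comparable with the reported values.

```python
from collections import Counter
from itertools import combinations
from typing import List

VALID = set("ACGT")  # only unambiguous bases are compared


def variable_sites(alignment: List[str]) -> List[int]:
    """Return 0-based column indices holding more than one unambiguous base."""
    sites = []
    for i in range(len(alignment[0])):
        bases = {seq[i] for seq in alignment if seq[i] in VALID}
        if len(bases) > 1:
            sites.append(i)
    return sites


def is_parsimony_informative(alignment: List[str], i: int) -> bool:
    """True if at least two bases each occur in at least two sequences."""
    counts = Counter(seq[i] for seq in alignment if seq[i] in VALID)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def nucleotide_diversity(alignment: List[str]) -> float:
    """Average number of pairwise differences per site (Pi)."""
    pairs = list(combinations(alignment, 2))
    total = 0
    for a, b in pairs:
        total += sum(1 for x, y in zip(a, b)
                     if x in VALID and y in VALID and x != y)
    return total / (len(pairs) * len(alignment[0]))


def snps_per_1000_nts(n_snps: int, region_length: int) -> float:
    """SNP density expressed per 1000 nucleotides."""
    return 1000 * n_snps / region_length


# Toy alignment of four equal-length sequences (illustrative data only).
toy = ["ACGTACGTAC",
       "ACGTACGTAC",
       "ACCTACGTAT",
       "ACCTACGTAC"]

snps = variable_sites(toy)
informative = [i for i in snps if is_parsimony_informative(toy, i)]
pi = nucleotide_diversity(toy)
density = snps_per_1000_nts(len(snps), len(toy[0]))
```

In the toy data, column 2 is variable and informative (two sequences carry G and two carry C), while column 9 is variable but not informative because T appears in only one sequence.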

Conclusion and Recommendation

There was a preponderance of L lineage (to include the new lineage scheme) and D614G mutants. The Nigerian SARS-CoV-2 genome revealed ORF10 as the region containing the highest SNP density as compared to the spike gene. The implication of this distribution of SNPs for the empirically lower infectivity of SARS-CoV-2 in Nigeria is discussed. This also underscores the need for more aggressive testing and treatment of COVID-19 in Nigeria. Additionally, attempts to produce testing kits for COVID-19 in Nigeria should consider the conserved regions identified in this study. Strict adherence to COVID-19 preventive measures is recommended in view of the Nigerian SARS-CoV-2 phylogenetic clustering pattern, which suggests intensive community transmission, possibly rooted in the communal culture characteristic of many ethnicities in Nigeria.

Keywords

COVID-19
Nigeria
phylogeny
SARS-CoV-2
SNPs
