Gene Reports

Volume 23, June 2021, 101020

Genome-wide in silico identification and characterization of Simple Sequence Repeats in diverse completed SARS-CoV-2 genomes

https://doi.org/10.1016/j.genrep.2021.101020

Highlights

  • The distribution patterns of mono-, di-, tri-, and hexa-nucleotide repeats in SARS-CoV-2 genomes were analyzed.

  • SARS-CoV-2 genomes isolated from 31 different countries exhibit similar SSR distribution patterns.

  • Correlations of genome size and GC content with the incidence of SSRs in SARS-CoV-2 genomes were established.

Abstract

Simple sequence repeats (SSRs), or microsatellites, are short repeat sequences that have been extensively studied in eukaryotic (plant) and prokaryotic (bacterial) organisms. By comparison, the presence and incidence of SSRs in viral genomes are less well studied. With the emergence of novel infectious viruses over the past few decades, it is imperative to study the genetic diversity of such viruses in order to predict their evolutionary and functional changes over time. Following the emergence of SARS-CoV-2, we assembled 121 complete genomes reported from 31 countries across six continents for the identification and characterization of SSRs. Using two independent SSR identification tools, we found a remarkably consistent microsatellite pattern (38–42 SSRs per genome) across the 121 analyzed SARS-CoV-2 genomes, suggesting an important role in genome stability. Among the identified motifs, trinucleotide and hexanucleotide repeats were the most abundant, followed by mono- and dinucleotide repeats. No tetra- or pentanucleotide repeats were found in the analyzed SARS-CoV-2 genomes. The microsatellites identified in SARS-CoV-2 genomes may prove useful for population genetics, evolutionary analysis, strain identification, and the study of genetic variation.
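To illustrate the kind of analysis summarized above, the following is a minimal Python sketch of SSR detection in a nucleotide sequence. It is not the pipeline used in the study: the two SSR identification tools are not named here, so a simple regular-expression scan is swapped in. The function names (`find_ssrs`, `ssr_stats`, `gc_content`) and the minimum repeat thresholds in `MIN_REPEATS` are hypothetical and chosen only for illustration. Relative abundance and relative density are computed as they are commonly defined in SSR studies (SSRs per kb and SSR bases per kb, respectively), which is assumed to match the usage of RA and RD in this article.

```python
import re

# ASSUMPTION: illustrative minimum repeat counts per motif length (1-6 nt).
# These are not the thresholds used in the study. Each value must be >= 2.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 2}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    return (motif + motif).find(motif, 1) == len(motif)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs.

    Overlaps between runs of the same motif length are suppressed;
    overlaps between different motif lengths are reported as-is.
    """
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # Lookahead lets us test every start position, so a run that begins
        # inside a rejected (non-primitive) match is not missed.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (k, n - 1))
        next_free = 0
        for m in pattern.finditer(seq):
            start, run, motif = m.start(), m.group(1), m.group(2)
            if start < next_free or not is_primitive(motif):
                continue
            hits.append((start, motif, len(run) // k))
            next_free = start + len(run)
    return sorted(hits)


def ssr_stats(seq, hits):
    """Summarize SSR incidence: count, relative abundance and relative density."""
    total_bp = sum(len(motif) * reps for _, motif, reps in hits)
    return {
        "count": len(hits),
        "relative_abundance": len(hits) * 1000 / len(seq),  # SSRs per kb
        "relative_density": total_bp * 1000 / len(seq),     # SSR bp per kb
    }


def gc_content(seq):
    """GC content of the sequence as a percentage."""
    seq = seq.upper()
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Toy sequence for demonstration only; not a SARS-CoV-2 fragment.
    toy = "CGT" + "A" * 8 + "CC" + "ATG" * 3 + "G" + "TC" * 3 + "A"
    found = find_ssrs(toy)
    print(found)
    print(ssr_stats(toy, found), gc_content(toy))
```

Applied to each genome in a collection, per-genome values of `count`, genome length, and `gc_content` could then be compared to examine the kind of correlation mentioned in the highlights. The sketch detects perfect repeats only; imperfect or compound microsatellites would require a dedicated tool.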

Abbreviations

COVID-19: coronavirus disease 2019
SARS-CoV-2: severe acute respiratory syndrome coronavirus 2
SSR: simple sequence repeat
RD: relative density
RA: relative abundance
SpliMNPV: Spodoptera littoralis multiple nucleopolyhedrovirus
HCV: hepatitis C virus

Keywords

Microsatellite
SARS-CoV-2 virus
Simple sequence repeat
Genome sequence
Comparative genomics
