Estimating the Relative Proportions of SARS-CoV-2 Strains from Wastewater Samples
17 Pages. Posted: 8 Feb 2022. Publication Status: Published
Abstract
Wastewater surveillance has become essential for monitoring the spread of SARS-CoV-2. The quantity of SARS-CoV-2 RNA measured in wastewater correlates with the COVID-19 caseload in a community. However, estimating the proportions of different SARS-CoV-2 strains has remained technically difficult. We present a method for estimating the relative proportions of SARS-CoV-2 strains from wastewater samples. The method combines three steps: an initial filter that removes unlikely strains, imputation of missing nucleotides using the global SARS-CoV-2 phylogeny, and an Expectation-Maximization (EM) algorithm that obtains maximum likelihood estimates of the proportions of different strains in a sample. Using simulations with a reference database of >3 million SARS-CoV-2 genomes, we show that the estimated proportions accurately reflect the true proportions given sufficiently high sequencing depth, and that the phylogenetic imputation is highly accurate and substantially improves the reference database.
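To make the EM step concrete, the following is a minimal sketch of a generic EM update for mixture proportions. It is not the authors' implementation. It assumes a precomputed matrix `lik`, where `lik[i][j]` is the likelihood of read `i` given candidate strain `j`; how that likelihood is computed is not specified in the abstract. The function name `em_proportions` and its parameters are hypothetical.

```python
from typing import List


def em_proportions(lik: List[List[float]], n_iter: int = 200,
                   tol: float = 1e-10) -> List[float]:
    """Estimate mixture proportions by EM (illustrative sketch).

    lik[i][j] is an ASSUMED, precomputed likelihood of read i under strain j.
    Returns one proportion per strain; the proportions sum to 1.
    """
    n_strains = len(lik[0])
    # Start from a uniform guess over the candidate strains.
    p = [1.0 / n_strains] * n_strains

    for _ in range(n_iter):
        # E-step: accumulate each strain's expected share of every read.
        counts = [0.0] * n_strains
        for row in lik:
            denom = sum(pj * lij for pj, lij in zip(p, row))
            if denom == 0.0:
                continue  # read is impossible under every strain; skip it
            for j in range(n_strains):
                counts[j] += p[j] * row[j] / denom

        total = sum(counts)
        if total == 0.0:
            raise ValueError("no read has positive likelihood under any strain")

        # M-step: new proportions are the normalized expected counts.
        new_p = [c / total for c in counts]

        # Stop once the largest change in any proportion is below tol.
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


if __name__ == "__main__":
    # Three reads match only strain 0 and one read matches only strain 1.
    print(em_proportions([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]))
```

In this toy input every read is compatible with exactly one strain, so the estimate reduces to the fraction of reads assigned to each strain. With ambiguous reads, the E-step splits each read across strains in proportion to the current estimates.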
Funding: This work used the Extreme Science and Engineering Discovery Environment (XSEDE) Bridges-2 system at the Pittsburgh Supercomputing Center through allocation BIO180028 and was supported by NIH grant 1R01GM138634-01.
Declaration of Interests: We declare that we have no known competing financial interests or personal relationships that influenced this work.
Keywords: Wastewater Surveillance, SARS-CoV-2, Expectation-Maximization, Imputation