Abstract
Eight peaks of the coronavirus disease 2019 (COVID-19) outbreak have occurred in Japan, each associated with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of concern. The National Epidemiological Surveillance of Infectious Diseases (NESID) analyzed viral genome sequences from symptomatic patients and submitted the results to GISAID. Meanwhile, commercial testing services occasionally sequence samples from asymptomatic individuals. Using chronological phylogenies and molecular transmission networks, we compared a total of 1248 SARS-CoV-2 full-genome sequences obtained from the SB Coronavirus Inspection Center Corp. (SBCVIC) during Japan’s seventh wave, which was dominated by Omicron variants, with 1764 sequences obtained in Japan from GISAID during the same period. The number of SBCVIC sequences was consistent with the number of cases reported by NESID. The SBCVIC data revealed the shift in the PANGO lineage from BA.2 to BA.5 earlier than the GISAID data did. BA.2 lineages from the SBCVIC were distributed at different locations in the transmission network dominated by GISAID entries, whereas BA.5 lineages from the SBCVIC and GISAID often formed distinct subclusters. Test-based sentinel surveillance of asymptomatic individuals may be a more manageable approach than notifiable disease surveillance; however, it may not necessarily capture all infected populations throughout Japan.
Introduction
Similar to other countries, Japan has been severely affected by the coronavirus disease 2019 (COVID-19) pandemic. As of May 10, 2023, the Japanese National Epidemiological Surveillance of Infectious Diseases (NESID) reported 33,793,429 cases of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, with a low number of related deaths at 74,654 (ref. 1). Various factors, including the progress of the vaccination program, have maintained low case and death numbers in Japan since the spring of 2023 (ref. 2). On May 8, 2023, the Japanese government eased stringent COVID-19 rules and guidelines, including recommendations for mask-wearing, avoiding crowded places, setting quarantine periods, and hospitalizing positive individuals in designated medical facilities2,3, bringing the management of COVID-19 in line with that of seasonal influenza. Under the current measures, COVID-19 surveillance has transitioned from daily notifiable disease surveillance to weekly sentinel surveillance in some hospitals.
In Japan, population-based notifiable surveillance for COVID-19 is based on confirmatory diagnosis using real-time PCR testing of symptomatic cases at local public health sites and some clinical laboratories. Next-generation sequencing (NGS) is performed on some samples with sufficient viral load. The National Institute of Infectious Diseases compiles NGS results after quality control and matches them with the individual’s background in the NESID4. These results are made publicly available by uploading them to the GISAID EpiCoV database5,6,7. Such viral genome data offer useful information for public health interventions, including monitoring emerging variants, detecting outbreaks, and describing the background and risk factors of transmission clusters8.
To apply these data to public health measures, it is necessary to ensure that they reflect the viral genome distribution and individual characteristics of the infected population of interest9. During the daily notifiable surveillance period in Japan, passive COVID-19 testing was performed on symptomatic individuals at these sites. Therefore, individuals who were asymptomatic, had mild illness, or were tested late were less likely to be reported10,11, and few cases with low viral loads could yield sequence data. This suggests that delayed diagnosis and sampling bias are concerns for the viral genome sequences deposited in GISAID from Japan during this period. These concerns might have been particularly substantial during the so-called “seventh wave” from June to October 2022, when repeated mutations of the Omicron variant caused a massive epidemic in Japan, with a prevalence approximately 2.5 times higher than in previous epidemic waves2 and a major impact on public health measures12, causing some reporting failures in the surveillance system.
Biases associated with notifiable surveillance may have serious consequences for genome-based surveillance of SARS-CoV-2. Surveillance sites tend to prioritize interesting cases, such as outbreaks and/or cluster infections, rather than sequencing all collected cases. Challenges in NGS capacity, including maintenance costs and technical difficulties, can result in regional disparities. These biases may lead to overestimation or underestimation of the representation of viral transmission lineages in phylogenetic tree analysis13,14. To address these challenges, implementing testing-based sentinel screening for symptomatic and asymptomatic individuals is a potential solution. This approach involves active testing, including genome sequencing, for all positive cases. Some countries have established testing-based sentinel surveillance programs for rapid diagnosis and assessment of viral epidemics15,16,17, including COVID-19 (refs. 18,19). Commercial COVID-19 testing is also available in Japan. One of the diagnostic companies, the SB Coronavirus Inspection Center Corp. (SBCVIC), continuously sequences viral genomes from specimens with informed consent from the individual. It provides COVID-19 testing for workplace screening upon company requests and voluntary testing as requested by local governments. Therefore, the majority of positive cases are from asymptomatic individuals20. In this study, we analyzed viral gene variants during the seventh wave, comparing data from GISAID, which were deposited through the NESID, with data from the SBCVIC to evaluate the utility of monitoring viral gene variants by combining PCR testing of asymptomatic individuals with viral genome sequencing.
Results
Genetic diversity of SARS-CoV-2 in the SBCVIC was similar to that of sequences in GISAID during the same period
According to CoVariants.org21, clade 21K in the Nextstrain22 nomenclature, corresponding to BA.1 in the PANGO lineage23, was initially prevalent in the sixth wave of COVID-19 infection in Japan in early 2022. Subsequently, 21L/BA.2 dominated in the late phase. The sixth wave concluded in June 2022, at the beginning of the study period (Fig. 1A). Starting from this period, BA.2 was gradually replaced by BA.5 (Fig. 1B). The seventh wave peaked in July–August 2022. After September, the prevalence of BA.5 decreased. Simultaneously, more evolved lineages were being detected (Fig. 1B). The genetic diversity of the full-length viral sequences from SARS-CoV-2 positive cases diagnosed at the SBCVIC remained constant throughout the study period. It resembled the diversity of sequences registered in GISAID from Japan between June and September; however, GISAID-registered sequences in October were approximately twice as diverse as those of the other monthly groups (Fig. 2). The lineages contributing to this genetic diversity initially included BA.2.3, BA.2.24, and other BA.2s in June. However, they were gradually replaced by BA.5.2, BF.5, and other BA.5s starting from July (Fig. 3A). This composition was similar between the SBCVIC and GISAID. However, in October, 46% of the viral sequences collected from GISAID belonged to BA.2.3.20 and BA.2.75, which were absent in the SBCVIC data.
The waves of COVID-19 and the dynamic evolution of SARS-CoV-2 variants in Japan. (A) Trends in the number of COVID-19 cases reported in Japan from May 2020 to April 2023 based on the NESID reports. (B) Trends in the frequency of VOC and PANGO lineages in Japan from May 2020 to April 2023 based on CoVariants.org (https://covariants.org) data. The prevalent PANGO lineage for Omicron is shown, with each month of the study period represented by blue brackets between panels A and B. VOC, variants of concern; NESID, The National Epidemiological Surveillance of Infectious Diseases.
Diverse genetic profiles of SARS-CoV-2 genome sequences collected using various routes across different months. The line graph depicts the average number of base substitutions per site, calculated from averaging over all sequence pairs within each sampling route per month group. Bars show the standard error estimates, which were estimated using the bootstrap method with 500 replications. Open and closed circles indicate diversity within the SB Coronavirus Inspection Center Corp. and GISAID samples from each month, respectively.
Viral lineage composition at the SBCVIC and GISAID entries during the seventh wave, and the concurrent COVID-19 prevalence in Japan. (A) A stacked bar graph illustrates the frequency distribution of viral lineages at the SBCVIC (left) and GISAID entries (right), whereas a line graph represents the number of cases reported in NESID during the corresponding period. (B) The number of cases for each viral lineage at the SBCVIC (left) and GISAID entries (right) are shown as stacked bar graphs. Concurrently, a line graph indicates the number of cases reported in NESID during the same period. NESID, The National Epidemiological Surveillance of Infectious Diseases.
Number of Omicron variants diagnosed in the SBCVIC differed from that of GISAID and was consistent with the number of reported cases in Japan
The demographic information for all positive cases for which viral genome sequencing was attempted in the SBCVIC during the study period, June to October 2022, and which were recruited for our analysis, is shown in Supplementary Table S1. Although the number of viral full-genome sequences collected in Japan and registered in GISAID was as high as 887 in June, the number of sequences from the SBCVIC was much lower. The number of sequences from the SBCVIC increased after July, peaked at 762 in August, and subsequently decreased to 69 in October (Fig. 3B). The change in the number of sequences at the SBCVIC was consistent with the number of COVID-19 cases reported in Japan during the same period (Fig. 3B). Conversely, the number of GISAID entries from Japan decreased after June, with only 98 entries in August, the peak of the seventh wave. BA.5 strains began to be observed in early July; however, the growth in the cumulative number of GISAID cases slowed thereafter. The dates on which the SBCVIC captured > 50% and > 90% of the total number of cases in November were 12 and 12 days earlier than in GISAID for BA.5.2, 17 and 10 days earlier for BA.5.2.1, and 15 and 36 days earlier for the other BA.5 lineages, respectively (Supplementary Fig. S1). Bayesian skyline plot analysis of the sequence data revealed that the relative population size trends inferred from SBCVIC-derived sequences were consistent with COVID-19 prevalence since the outbreak of Omicron variants in January. In contrast, the inference using GISAID-derived sequences showed only a single wave, corresponding to the sixth wave, and no seventh wave (Supplementary Fig. S2).
Genetic differences between sequences from the SBCVIC and GISAID were greater across collection months than across collection routes
The gross genetic distance between groups of viral sequences collected by the SBCVIC and GISAID in their respective months was at most 0.096% (substitutions per site) (Supplementary Table S2). The maximum net genetic distance between the groups was 0.041%, observed between SB-Sep and GISAID-Jun, and its ratio to the corresponding gross value (0.086%) was 47.5%, indicating that up to about 48% of the divergence was due to differences in months and collection routes. The net genetic distances between collection routes in the same month were often smaller than the differences between sampling months along the same route (Supplementary Table S2). A dendrogram illustrating the similarity between the groups showed that the groups were divided into two clusters (Fig. 4). One cluster consisted of populations primarily infected with BA.2s and included the June SBCVIC, June GISAID, and July GISAID groups. The second cluster was primarily composed of populations infected with BA.5s, including the majority of the seventh wave. The October GISAID group was at the top of this cluster, reflecting the high rate of BA.2.75 registrations during this period.
Cluster analysis of mean genetic distance between groups based on collection route and month. The relationship between viral genome differences among the collection routes in each month is shown in Supplementary Table S2 and illustrated in a dendrogram using the Ward method.
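The relationship between gross and net between-group distances can be illustrated with a minimal Python sketch. It assumes the usual definition of the net distance, namely the gross between-group mean minus the average of the two within-group means, and it substitutes a simple p-distance for the Tamura-Nei model used in the study. The toy sequences and function names are hypothetical and serve only to show the arithmetic.

```python
from itertools import combinations
from statistics import mean

VALID = set("ACGT")

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites; gaps and ambiguous bases are skipped."""
    pairs = [(x, y) for x, y in zip(a, b) if x in VALID and y in VALID]
    return sum(x != y for x, y in pairs) / len(pairs)

def within_mean(group):
    """Mean distance over all sequence pairs inside one group."""
    return mean(p_distance(a, b) for a, b in combinations(group, 2))

def gross_between(g1, g2):
    """Mean distance over all pairs with one sequence from each group."""
    return mean(p_distance(a, b) for a in g1 for b in g2)

def net_between(g1, g2):
    """Gross distance minus the average within-group diversity."""
    return gross_between(g1, g2) - (within_mean(g1) + within_mean(g2)) / 2

# Hypothetical toy groups (not study data).
g1 = ["AAAAAAAAAA", "AAAAAAAAAC"]
g2 = ["AAAAAAAAGG", "AAAAAAATGG"]

gross = gross_between(g1, g2)
net = net_between(g1, g2)
print(f"gross={gross:.3f} net={net:.3f} net/gross={net / gross:.2f}")

# The same ratio computed from the rounded values reported in the text.
reported_ratio = 0.041 / 0.086
print(f"reported net/gross ratio from rounded values: {reported_ratio:.3f}")
```

The ratio obtained from the rounded percentages differs slightly from the reported 47.5%, which is expected if the published ratio was computed from unrounded distances.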
Viral variants from GISAID and the SBCVIC were mixed in the transmission network for BA.2, but formed separate subclusters for BA.5
Transmission cluster and network analyses identified at least 60 components and 157 singleton cases of BA.2- (33/64 components/singletons) and BA.5-related (27/93) lineages detected during the study period in Japan. Among these components, 12 pairs were cases with BA.2 lineages, and 14 pairs were cases with BA.5 lineages (Supplementary Fig. S3A). The BA.2-related components were primarily composed of GISAID entries (Supplementary Fig. S3B), and the SBCVIC cases were located at different locations in both the phylogenetic clusters (Supplementary Fig. S4) and the components (Fig. 5A). The network graphs of BA.2 exhibited lower densities (median = 0.202) than those of BA.5 (0.603); however, the difference was not significant by the Mann–Whitney U test (p = 0.157). Among the 12 BA.2 components containing > 5 cases, the largest one with BA.2.3–2.3.20 consisted of several smaller clusters and two main clusters. One of these clusters included GISAID-derived BA.2.3 cases from August onwards, BA.2.3.20 cases in October, and earlier cases from the SBCVIC. In another large component, BA.2.3.13, two clusters of GISAID entries from different months were observed, with individual divergence of SBCVIC-derived cases. The BA.5-related components of the seventh wave featured numerous cases from the SBCVIC (Supplementary Fig. S3B), with GISAID entries often forming a subcluster separate from the SBCVIC in certain network components (Fig. 5B). Of the 11 BA.5 components containing > 5 cases, most appeared to form separate clusters of SB- and GISAID-derived cases, except for BA.5.2.1. BA.5.2 primarily consisted of three clusters of SB-derived cases, divided by central GISAID entries from August to September. BA.5.1 had more GISAID entries than the other components, and the GISAID cluster was divided into two areas: one collected in July and the other in September. BF.5 distinctly separated SBCVIC cases from GISAID entries without exhibiting clear temporal characteristics.
Transmission network of viruses collected during the seventh wave in Japan. (A) Estimated transmission links for cases belonging to BA.2 with genetic distances < 0.015% are shown. Each node in the network graph indicates a case infected with the BA.2 virus. Node colors in the graph denote collection routes and months. Blue and red indicate cases derived from the SBCVIC and GISAID, respectively. The darkness of the color reflects the recency of the collection month. (B) Estimated transmission links for cases belonging to BA.5 with genetic distances < 0.015% are shown. Each node in the network graph represents a case infected with the BA.5 virus. The color of the node is the same as described in (A). d indicates the graph density of each network.
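The way components, singletons, and graph density follow from a distance threshold can be sketched as below. This is a minimal standard-library Python illustration, not the R workflow used in the study; the case names and pairwise distances are hypothetical, and only the 0.015% threshold is taken from the text.

```python
from collections import defaultdict

THRESHOLD = 0.00015  # 0.015% substitutions per site, as stated in the text

def build_components(names, dist, threshold=THRESHOLD):
    """Link two cases when their distance is below the threshold.

    dist maps (name_a, name_b) tuples to pairwise distances; pairs that are
    not listed are treated as unlinked. Returns (components, edges).
    """
    parent = {n: n for n in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []
    for (a, b), d in dist.items():
        if d < threshold:
            edges.append((a, b))
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for n in names:
        groups[find(n)].append(n)
    return list(groups.values()), edges

def density(nodes, edges):
    """Observed links divided by possible links in an undirected graph."""
    n = len(nodes)
    if n < 2:
        return 0.0
    node_set = set(nodes)
    m = sum(1 for a, b in edges if a in node_set and b in node_set)
    return 2 * m / (n * (n - 1))

# Hypothetical cases and distances (not study data).
names = ["SB1", "SB2", "G1", "G2", "G3"]
dist = {
    ("SB1", "SB2"): 0.00010,
    ("SB2", "G1"): 0.00010,
    ("SB1", "G1"): 0.00030,
    ("G2", "G3"): 0.00050,
}

components, edges = build_components(names, dist)
largest = max(components, key=len)
singletons = [c for c in components if len(c) == 1]
print(sorted(len(c) for c in components), round(density(largest, edges), 3))
```

In this toy example, three cases form one component with two of three possible links, and the two remaining cases are singletons because their only listed distance exceeds the threshold.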
Discussion
A considerable proportion of asymptomatic infections10,11,24 and presymptomatic transmissible periods24,25 hinders early detection of SARS-CoV-2 using traditional syndromic-based surveillance, which is widely implemented in several countries. Although this type of surveillance has the advantage of providing generalized data on disease incidence and mortality26, reporting all symptomatic cases to a surveillance agency is a burden on health workers and facilities26. The requirement for viral genomic information has notably increased this burden. Relying on symptomatic cases is essentially ineffective in detecting early COVID-19 outbreaks, which are suggested to have a cryptic transmission phase preceding infection waves27,28. An alternative approach to detecting cryptically transmissible viruses is testing-based sentinel surveillance with genome sequencing, including asymptomatic individuals. This approach could monitor epidemic indices efficiently and be more cost-effective than other approaches26. Although the screening test for SARS-CoV-2 conducted by the SBCVIC was not designed to monitor the incidence of COVID-19, it provides information for active sentinel viral genome surveillance in Japan.
To implement this approach, it is necessary to ensure that viruses in the sample population reflect the genetic diversity of the original population. Sentinel surveillance of SARS-CoV-2 genomic data has provided more epidemiologically representative data18, earlier reports of variant shifts29 and identification of novel imported variants30 than those of conventional systems. The available data from the SBCVIC were more representative of the seventh COVID-19 epidemic wave in Japan than the official surveillance during the same period. Notably, the SBCVIC managed to collect cases reflecting the trend in the number of positive individuals reported to NESID, and the viral sequences collected for four months until September 2022 exhibited nucleotide diversity within the tested population comparable to that of GISAID entries. These results indicated the adequacy of the data for monitoring SARS-CoV-2 genetic variation. Despite small sample sizes and regional coverage limitations in Japan, the data analyzed at the SBCVIC may still be appropriate for monitoring viral lineages.
Genetic differences among monthly collection groups in each collection route showed that SBCVIC-based surveillance could capture lineage shifts earlier than GISAID sequences, especially during the phase of the wave when the major variants transitioned from BA.2 to BA.5. The fact that the SBCVIC captured 50% and 90% of BA.5 variants on average 15 and 19 days earlier than GISAID, respectively, indicates that the shift could be observed about half a month earlier in the SBCVIC data. The viral variant shift was also observed in previous studies involving on-site genome sequencing in healthcare workers29,31. This supports our findings, suggesting the benefit of combining testing-based screening and genomic analysis for SARS-CoV-2 variant surveillance.
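The averages quoted above can be checked against the per-lineage lead times reported in the Results. The short Python sketch below simply averages those values; the dictionary layout and variable names are illustrative choices.

```python
from statistics import mean

# Lead of the SBCVIC over GISAID in days, as (50% captured, 90% captured),
# taken from the per-lineage values reported in the Results.
lead_days = {
    "BA.5.2": (12, 12),
    "BA.5.2.1": (17, 10),
    "other BA.5": (15, 36),
}

mean_50 = mean(days[0] for days in lead_days.values())
mean_90 = mean(days[1] for days in lead_days.values())

print(f"mean lead at 50%: {mean_50:.1f} days")
print(f"mean lead at 90%: {mean_90:.1f} days")
```

Rounded to whole days, the two means agree with the 15 and 19 days stated in the text.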
In our analysis, the viral population of the seventh wave in Japan consisted of at least 60 components in transmission networks, further subdivided into subpopulations. SARS-CoV-2 spread globally following a social network with a scale-free property32,33. Several components were singletons or pairs, and few had many cases, aligning with most viral epidemics34,35,36,37. As GISAID entries of BA.2 reflected various infections in Japan and the BA.2 cases from the SBCVIC were scattered in large BA.2 network components dominated by GISAID entries, the SBCVIC was capable of capturing a substantial range of BA.2 lineage diversity. In contrast, in BA.5 lineages, GISAID entries often formed subclusters separated from SBCVIC cases. The BA.5 epidemic gained momentum from August onwards. The declining number of BA.5 entries in GISAID from Japan since August could be attributed to the surveillance burden on official testing laboratories at local public health sections. Given this substantial burden, GISAID entries would be expected to cluster if sequencing was performed only on specific infected populations of interest. This suggests that the authorities may have failed to grasp the status of the epidemic at that time. The distribution of GISAID entries throughout the network of SBCVIC cases in BA.5.2.1, and the observation of a small component consisting only of GISAID entries, highlight the need for nationwide implementation of viral genomic surveillance.
Genome-testing-based sentinel surveillance, targeting corporate employees without relying on symptoms as an indicator, can detect newly introduced variants earlier than retrospective GISAID analysis. GISAID entries result from symptom-reporting surveillance, and our study suggests that genome testing in specific corporate groups is a more proactive approach to identifying new variants.
Limited by data availability, our comparison focuses on GISAID genomic information as of April 2023, omitting some national surveillance data. A potential drop in GISAID registrations after August 2022 could be attributed to fewer Japanese registrations rather than a decline in sequence analyses, affecting the representativeness of our results compared to national government data. In addition, the SBCVIC linked positive individuals to a health care provider for a confirmatory diagnosis and subsequent treatment but did not follow up on their disease course; therefore, some may have been presymptomatic cases. Although the asymptomatic rate is expected to be higher than in NESID, where all cases are symptomatic, another limitation of this study is that the exact difference in asymptomatic rates is not known. Another limitation is that the SBCVIC did not necessarily diagnose and collect data from all geographically distributed infected populations in Japan, and we could only analyze samples from 24 of the 47 prefectures. Implementing a nationwide surveillance system could have facilitated the early detection of the BA.5 outbreak in the seventh wave and the subsequent epidemic. A well-designed genome-based sentinel surveillance is more effective at capturing genomic data than a comprehensive reporting system18. In such a context, the viral genome information based on the SBCVIC could reveal the variant shift earlier, although it includes presymptomatic individuals. Sentinel screening tests in asymptomatic populations are effective in preventing infection19,20,38. Their implementation, especially in the era of highly cryptic infections like Omicron-related variants, should be considered, owing to their low facility burden.
Conclusions
- Viral genome surveillance using the SBCVIC testing system effectively observed genetic diversity in SARS-CoV-2 variants during the seventh wave in Japan.
- The number of sequences analyzed by the SBCVIC was consistent with the reported number of positive individuals in the national surveillance system.
- The SBCVIC cases detected the lineage shift from BA.2 to BA.5 earlier than GISAID entries in the retrospective analysis.
- In the BA.2 network, GISAID entries dominated, whereas SBCVIC sequences were scattered across different locations in the network.
- The BA.5 network components, primarily from the SBCVIC, displayed high density, whereas GISAID entries often formed a distinct subcluster separate from SBCVIC cases.
- Testing-based sentinel surveillance of COVID-19 in asymptomatic individuals may be more efficient and less burdensome than notifiable surveillance using sequence analysis.
- Further studies should aim to develop rapid and efficient genome-based surveillance systems for the next emerging infectious diseases.
Methods
Design of commercially based COVID-19 testing and screening
Since July 27, 2020, the SBCVIC has conducted workplace screening and local government-based voluntary COVID-19 testing using RT-PCR for > 2.8 million individuals without self-reported symptoms, following a protocol outlined in a previous report20. Briefly, self-collected and mailed saliva samples were tested using the SARS-CoV-2 Direct Detection RT-qPCR Kit (RC30JW; Takara Bio, Shiga, Japan) on the day of arrival at the SBCVIC. Samples with cycle threshold (Ct) values ≤ 40 were considered positive for COVID-19 infection, and almost all of them underwent viral genome sequencing using NGS with the ARTIC primer set and Illumina COVIDseq Test (Illumina Inc., CA, USA). NGS results were analyzed using the Illumina DRAGEN COVIDseq Test (RUO), and a consensus sequence was used for subsequent analyses.
Ethics approval
All participants provided written informed consent, allowing for the collection of saliva samples, viral sequencing, and demographic information for the study. Participants had the option to decline participation in the study via the website. This study was approved by the Institutional Review Board of the National Center for Global Health and Medicine (NCGM-G-003678-00). The study was conducted in accordance with the Declaration of Helsinki (revised in 2013).
Data collection
We studied Omicron-infected cases in Japan during the seventh wave of the COVID-19 epidemic from June to October 2022, including a month before and after the epidemic (Fig. 1A). First, we identified whole-genome sequences of Omicron variants of concern derived from samples with Ct ≤ 30 among 7,004 samples in the SBCVIC database, resulting in the selection of 2,518 sequences. Additionally, we retrieved 4,958 complete sequences from GISAID that were collected in Japan from January to October 2022 and registered as of April 11, 2023, via gisaid.org/EPI_SET_230609fz. We also retrieved Wuhan/Hu-1/2019, the first reported SARS-CoV-2 complete sequence, from the National Center for Biotechnology Information GenBank (accession number: MN908947). From this dataset, we extracted sequences with a read length ≥ 29,000 bases of complete nucleotides that were collected since June: 1,248 from the SBCVIC and 1,764 from GISAID entries. These sequences, along with Wuhan/Hu-1, were aligned using MAFFT version 7.5 (ref. 39) to create the seventh wave dataset, which was used to estimate nucleotide diversity and genetic distance. Meanwhile, 300 sequences were randomly extracted for each sampling route from both the comprehensive alignment data containing all cases since January (Omicron wave dataset) and the seventh wave dataset. Bayesian Markov Chain Monte Carlo analysis was performed for each set. The accession numbers of the sequences from the SBCVIC are BS009006-BS010295.
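The length filter and random subsampling described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' pipeline: "complete nucleotides" is interpreted here as unambiguous A, C, G, or T calls, the random seed is arbitrary, and the function and variable names are hypothetical. The demonstration uses tiny thresholds so that it runs on toy sequences.

```python
import random

MIN_LENGTH = 29_000  # minimum number of called bases, from the text
SAMPLE_SIZE = 300    # sequences drawn per sampling route, from the text

def count_called_bases(seq: str) -> int:
    """Count unambiguous bases (assumed meaning of 'complete nucleotides')."""
    return sum(base in "ACGT" for base in seq.upper())

def filter_and_subsample(records, min_length=MIN_LENGTH, k=SAMPLE_SIZE, seed=0):
    """Keep sequences that pass the length filter, then draw up to k names.

    records maps sequence names to sequences. Returns (kept, chosen_names).
    """
    kept = {name: seq for name, seq in records.items()
            if count_called_bases(seq) >= min_length}
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    names = sorted(kept)
    chosen = rng.sample(names, min(k, len(names)))
    return kept, chosen

# Toy demonstration with small thresholds (not study data).
records = {"a": "ACGT" * 3, "b": "ACNN" * 3, "c": "ACGT" * 2}
kept, chosen = filter_and_subsample(records, min_length=8, k=1)
print(sorted(kept), chosen)
```

Sampling is done without replacement, and sorting the names before the draw makes the result independent of dictionary ordering.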
Molecular evolutionary analyses
Prior to the analyses, we masked seven sites of concern for artificial mutations40 by replacing the nucleotides at the corresponding Wuhan/Hu-1 positions with ‘N’. Additionally, we excluded the first 100 bases and the last 200 bases of the alignment, where sequencing accuracy is poor. First, we identified the PANGO lineage of the infected virus using Pangolin version 4.2 (ref. 23). Thereafter, we estimated the number of base substitutions per site among all pairs of sequences, both gross and net averages between the monthly or collection route groups of sequences, and averages within the groups in the seventh wave alignment. The Tamura-Nei 93 model with gamma-distributed sites was used for these calculations in MEGA11 (ref. 41). Hierarchical clustering was computed using a distance matrix of the net average substitutions and plotted as a Ward method dendrogram using R 4.0.2 (ref. 42). To infer the transmission network of each lineage prevalent in Japan during the study period, we analyzed the pairwise substitution matrix according to previous studies on HIV-1 (refs. 43,44). We linked two cases (nodes) in the network when the distance between their viral sequences in the substitution matrix was < 0.015%, equivalent to approximately 30% of the mean diversity in the entire population in the seventh wave dataset (d = 0.00056). The resulting adjacency matrices were transformed into network graphs and their components using the sna package in R. Bayesian MCMC clade credibility phylogenies for the Omicron and seventh wave datasets were inferred using BEAST 1.9 (ref. 45) with GTR + G + I substitution and Bayesian skyline plot population models. The relative population dynamics of the Omicron epidemics in Japan from January 2022 were estimated using the Coalescent Bayesian Skyline Plot46.
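The masking and trimming step at the start of this subsection can be sketched as follows. This is a minimal Python illustration, assuming each sequence is already aligned to Wuhan/Hu-1 coordinates. The seven masked sites are not listed in the text, so the position used below is a placeholder, and the function name is hypothetical.

```python
def mask_and_trim(aligned_seq: str, mask_sites, head: int = 100, tail: int = 200) -> str:
    """Mask selected alignment columns with 'N', then drop both ends.

    mask_sites holds 1-based column positions in reference coordinates.
    head and tail are the numbers of columns removed from the 5' and 3' ends.
    """
    bases = list(aligned_seq)
    for pos in mask_sites:
        if 1 <= pos <= len(bases):
            bases[pos - 1] = "N"
    return "".join(bases[head:len(bases) - tail])

# Toy 400-column sequence; position 150 is a placeholder, not one of the
# seven sites masked in the study.
toy = "A" * 400
trimmed = mask_and_trim(toy, mask_sites=[150])
print(len(trimmed), trimmed.count("N"))
```

Masking before trimming keeps the site positions in reference coordinates, so the list of sites does not need to be shifted by the number of removed leading columns.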
Statistical analysis
Data regarding the month of collection, collection route, and PANGO lineage were compiled using Microsoft Excel. All epidemiological data analyses, including the Mann–Whitney U test and standard error of means, were performed using R version 4.0.2.
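For readers unfamiliar with the Mann–Whitney U test applied to the component densities, the statistic can be computed as in the Python sketch below. It is not the R code used in the study: the p-value uses a normal approximation with continuity correction and no tie correction, and the density values are hypothetical.

```python
import math

def mann_whitney_u(x, y):
    """Return (U_x, U_y) using midranks for tied values."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    n1, n2 = len(x), len(y)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    return u_x, n1 * n2 - u_x

def normal_approx_p(u, n1, n2):
    """Two-sided p-value; normal approximation without tie correction."""
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = max(abs(u - mu) - 0.5, 0.0) / sigma
    return math.erfc(z / math.sqrt(2))

# Hypothetical component densities (not study data).
dens_ba2 = [0.1, 0.2, 0.3]
dens_ba5 = [0.4, 0.5, 0.6, 0.7]

u_ba2, u_ba5 = mann_whitney_u(dens_ba2, dens_ba5)
p_value = normal_approx_p(u_ba2, len(dens_ba2), len(dens_ba5))
print(u_ba2, u_ba5, round(p_value, 3))
```

For samples as small as those in this toy example, an exact test is generally preferable to the normal approximation.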
Data availability
Both the next-generation sequencing reads and the assembled sequences can be downloaded from databases on DDBJ, EMBL-EBI and NCBI, which participate in INSDC (https://www.insdc.org/). The accession numbers and dates of collection of the viral sequences from the SBCVIC are listed in Supplementary Table S3.
References
WHO. Japan: WHO Coronavirus Disease (COVID-19) Dashboard with Vaccination Data. https://covid19.who.int/region/wpro/country/jp (2023).
Ministry of Health, Labour and Welfare, Japan. Novel Coronavirus (COVID-19). https://www.mhlw.go.jp/stf/seisakunitsuite/bunya/0000164708_00079.html.
Kitahara, K., Nishikawa, Y., Yokoyama, H., Kikuchi, Y. & Sakoi, M. An overview of the reclassification of COVID-19 of the infectious diseases control law in Japan. Glob. Health Med. 5, 70–74 (2023).
National Institute of Infectious Diseases, Japan. Summary on SARS-CoV-2 variants of concern for increased infectivity/transmissibility and antigenic changes (No. 26). https://www.niid.go.jp/niid/en/2019-ncov-e/2551-cepr/11909-summary-on-sars-cov-2-variants-of-concern-for-increased-infectivity-transmissibility-and-antigenic-changes-no-26-en-2.html (2023).
Khare, S. et al. GISAID’s role in pandemic response. China CDC Wkly. 3, 1049–1051 (2021).
Elbe, S. & Buckland-Merrett, G. Data, disease and diplomacy: GISAID’s innovative contribution to global health. Glob. Chall. 1, 33–46 (2017).
Shu, Y. & McCauley, J. GISAID: Global initiative on sharing all influenza data – from vision to reality. Eurosurveillance. 22, 30494 (2017).
Ferdinand, A. S. et al. An implementation science approach to evaluating pathogen whole genome sequencing in public health. Genome Med. 13, 121 (2021).
German, R. R. et al. Updated guidelines for evaluating public health surveillance systems: Recommendations from the Guidelines Working Group. MMWR Recomm. Rep. 50, 1–35 (2001).
Buitrago-Garcia, D. et al. Occurrence and transmission potential of asymptomatic and presymptomatic SARS-CoV-2 infections: A living systematic review and meta-analysis. PLoS Med. 17, e1003346 (2020).
Casey-Bryars, M. et al. Presymptomatic transmission of SARS-CoV-2 infection: A secondary analysis using published data. BMJ Open. 11, e041240 (2021).
Furuse, Y. Properties of the omicron variant of SARS-CoV-2 affect public health measure effectiveness in the COVID-19 epidemic. Int. J. Environ. Res. Public Health. 19, 4930 (2022).
Lemey, P. et al. Unifying viral genetics and human transportation data to predict the global transmission dynamics of human influenza H3N2. PLoS Pathog. 10, e1003932 (2014).
Magee, D. & Scotch, M. The effects of random taxa sampling schemes in Bayesian virus phylogeography. Infect. Genet. Evol. 64, 225–230 (2018).
Edmond, M., Wong, C. & Chuang, S. K. Evaluation of sentinel surveillance system for monitoring hand, foot and mouth disease in Hong Kong. Public Health. 125, 777–783 (2011).
Nuvey, F. S. et al. Evaluation of the sentinel surveillance system for influenza-like illnesses in the Greater Accra region, Ghana, 2018. PLoS One. 14, e0213627 (2019).
Babakazo, P. et al. Evaluation of the influenza sentinel surveillance system in the Democratic Republic of Congo, 2012–2015. BMC Public Health. 19, 1652 (2019).
Oltean, H. N. et al. Sentinel Surveillance system implementation and evaluation for SARS-CoV-2 genomic data, Washington, USA, 2020–2021. Emerg. Infect. Dis. 29, 242–251 (2023).
Ginige, S. et al. Protocol for a winter sentinel surveillance program of notifiable respiratory viruses in Queensland. PLoS One. 17, e0277895 (2022).
Terada-Hirashima, J. et al. Investigation of the use of PCR testing prior to ship boarding to prevent the spread of SARS-CoV-2 from urban areas to less populated remote islands. Glob. Health Med. 4, 174–179 (2022).
Hodcroft, E.B. CoVariants: SARS-CoV-2 mutations and variants of interest. https://covariants.org/ (2021).
Hadfield, J. et al. NextStrain: Real-time tracking of pathogen evolution. Bioinformatics 34, 4121–4123 (2018).
O’Toole, Á. et al. Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool. Virus Evol. 7, veab064 (2021).
Furukawa, N. W., Brooks, J. T. & Sobel, J. Evidence supporting transmission of severe acute respiratory syndrome coronavirus 2 while presymptomatic or asymptomatic. Emerg. Infect. Dis. 26, e201595 (2020).
Bae, S., Lim, J. S., Kim, J. Y., Jung, J. & Kim, S. H. Transmission characteristics of SARS-CoV-2 that hinder effective control. Immune Netw. 21, e9 (2021).
Murray, J. & Cohen, A. L. Infectious Disease Surveillance. In International Encyclopedia of Public Health, pp. 222–229 (Elsevier Inc., 2016).
Davis, J. T. et al. Cryptic transmission of SARS-CoV-2 and the first COVID-19 wave. Nature. 600, 127–132 (2021).
Nabeshima, T. et al. COVID-19 cryptic transmission and genetic information blackouts: Need for effective surveillance policy to better understand disease burden. Lancet Reg. Health West. Pac. 7, 100104 (2021).
Padilha, D. A. et al. Genomic surveillance of SARS-CoV-2 in healthcare workers: A critical sentinel group for monitoring the SARS-CoV-2 variant shift. Viruses. 15, 984 (2023).
Sekizuka, T. et al. COVID-19 genome surveillance at international airport quarantine stations in Japan. J. Travel Med. 28, taaa217 (2021).
Borges, V. et al. Nosocomial outbreak of SARS-CoV-2 in a “non-COVID-19” hospital ward: Virus genome sequencing as a key tool to understand cryptic transmission. Viruses. 13, 604 (2021).
Barabási, A. L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
Pastor-Satorras, R. & Vespignani, A. Epidemic dynamics and endemic states in complex networks. Phys. Rev. E. 63, 066117 (2001).
Shiino, T. Phylodynamic analysis of a viral infection network. Front. Microbiol. 3, 278 (2012).
Lewis, F., Hughes, G. J., Rambaut, A., Pozniak, A. & Leigh Brown, A. J. Episodic sexual transmission of HIV revealed by molecular phylodynamics. PLoS Med. 5, e50 (2008).
Hughes, G. J., Fearnhill, E., Dunn, D., Lycett, S. J., Rambaut, A. & Leigh Brown, A. J. Molecular phylodynamics of the heterosexual HIV epidemic in the United Kingdom. PLoS Pathog. 5, e1000590 (2009).
Romano, C. M. et al. Social networks shape the transmission dynamics of hepatitis C virus. PLoS One. 5, e11170 (2010).
Hoch, M. et al. Weekly SARS-CoV-2 sentinel surveillance in primary schools, kindergartens, and nurseries, Germany, June–November 2020. Emerg. Infect. Dis. 27, 2192–2196 (2021).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Sanderson, T. & Barrett, J. C. Variation at Spike position 142 in SARS-CoV-2 Delta genomes is a technical artifact caused by dropout of a sequencing amplicon. Wellcome Open Res. 6, 305 (2021).
Tamura, K., Stecher, G. & Kumar, S. MEGA11: Molecular evolutionary genetics analysis version 11. Mol. Biol. Evol. 38, 3022–3027 (2021).
Ihaka, R. & Gentleman, R. R: A language for data analysis and graphics. J. Comput. Graph. Stat. 5, 299–314 (1996).
Smith, D. M. et al. A public health model for the molecular surveillance of HIV transmission in San Diego, California. AIDS. 23, 225–232 (2009).
Little, S. J. et al. Using HIV networks to inform real time prevention interventions. PLoS One. 9, e98443 (2014).
Drummond, A. J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007).
Drummond, A. J., Rambaut, A., Shapiro, B. & Pybus, O. G. Bayesian coalescent inference of past population dynamics from molecular sequences. Mol. Biol. Evol. 22, 1185–1192 (2005).
Acknowledgements
We thank all data contributors, i.e., the authors and their originating laboratories responsible for obtaining the specimens, and their submitting laboratories for generating the genetic sequences and metadata. Their collaboration through the GISAID Initiative forms the basis of this research. We also thank Yuichi Kawanishi for submitting the viral gene sequences analyzed by SBCVIC to the International Nucleotide Sequence Database Collaboration (INSDC) through the DNA Data Bank of Japan. Finally, we thank the staff of the SB Coronavirus Inspection Center Corp. for their valuable assistance in conducting the testing and managing DNA sequences.
Funding
This research was supported by the SB Coronavirus Inspection Center Corp.
Author information
Contributions
T.S. designed and performed the analyses and wrote the manuscript. J.S.T. and H.O. managed the sequence data from SBCVIC. M.K. directed the research contract with the SBCVIC and organized the intellectual property. Y.K. was responsible for PCR testing and viral gene sequencing at SBCVIC. W.S. and M.I. conceived the inspection system at SBCVIC and acquired research funding. All authors critically reviewed and approved the final version of the manuscript.
Ethics declarations
Competing interests
Moto Kimura and Wataru Sugiura have received research grants from SB Coronavirus Inspection Center Corp. Masato Ikeda and Yukumasa Kazuyama are employees of the SB Coronavirus Inspection Center Corp.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Shiino, T., Takeuchi, J.S., Ohyanagi, H. et al. Molecular epidemiology of SARS-CoV-2 genome sentinel surveillance in commercial COVID-19 testing sites targeting asymptomatic individuals during Japan’s seventh epidemic wave. Sci Rep 14, 20950 (2024). https://doi.org/10.1038/s41598-024-71953-8