Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Apr 20, 2020
Date Accepted: Jul 19, 2020
Date Submitted to PubMed: Jul 21, 2020
Regional Infoveillance of COVID-19 Case Rates: Analysis of Search-Engine Query Patterns
ABSTRACT
Background:
Timely allocation of medical resources for COVID-19 requires early detection of regional outbreaks. Internet browsing data, such as search activity levels, may help estimate cases in a local population before they are confirmed.
Objective:
The objective of our study was to determine whether search-engine query patterns can forecast COVID-19 case rates at the state and local levels in the United States.
Methods:
We used regional confirmed case data from the New York Times and Google Trends results from 50 states and 203 county-based designated market areas (DMAs). We identified search terms whose activity precedes and correlates with confirmed case rates at the national level, and used univariate regression to construct a composite explanatory variable from the top-scoring search queries, each offset by its temporal lag. We then measured the correlation of this explanatory variable with out-of-sample case rate data at the state and DMA levels.
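The lag-and-combine step described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes daily series of equal length, ranks terms by Pearson correlation as a stand-in for the univariate regression score, and averages z-scored, lag-shifted series to form the composite. The function names (`pearson`, `best_lag`, `composite`) and the parameters `top_k` and `max_lag` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def best_lag(search, cases, max_lag=10):
    """Return (lag, r) where search[t - lag] correlates best with cases[t]."""
    n = len(cases)
    results = [
        (lag, pearson(search[: n - lag], cases[lag:]))
        for lag in range(max_lag + 1)
    ]
    return max(results, key=lambda pair: pair[1])


def zscore(x):
    """Standardize a sequence to mean 0 and unit variance (all zeros if constant)."""
    n = len(x)
    m = sum(x) / n
    sd = sqrt(sum((a - m) ** 2 for a in x) / n)
    return [0.0] * n if sd == 0 else [(a - m) / sd for a in x]


def composite(terms, cases, top_k=2, max_lag=10):
    """Build a composite predictor from the top_k best-correlated search terms.

    Assumption: the composite is a plain average of z-scored series, each
    shifted by its own best lag. Returns (top, scored, start, comp), where
    comp[i] lines up with cases[start + i].
    """
    scored = {name: best_lag(s, cases, max_lag) for name, s in terms.items()}
    top = sorted(scored, key=lambda name: scored[name][1], reverse=True)[:top_k]
    n = len(cases)
    # First day on which every selected, lag-shifted series has a value.
    start = max(scored[name][0] for name in top)
    rows = []
    for name in top:
        lag = scored[name][0]
        rows.append(zscore(terms[name][start - lag : n - lag]))
    comp = [sum(col) / len(col) for col in zip(*rows)]
    return top, scored, start, comp
```

In this sketch the lags and term ranking would be fitted on one series (for example, national case rates), and the resulting composite would then be correlated with held-out regional case rates, for instance with `pearson(comp, regional_cases[start:])`, where `regional_cases` is a hypothetical out-of-sample series.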
Results:
Forecasts were highly correlated with confirmed case rates at the state and local levels, using search data available up to 10 days in advance of confirmed case rates. They predicted case activity in 49 of 50 states and in 128 of 203 DMAs at a significance level of .05 and were robust to differences in regional location, population, and date of outbreak.
Conclusions:
Identifiable patterns in search query activity may be used to forecast emerging regional outbreaks of COVID-19.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.