Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Jun 16, 2021
Date Accepted: May 17, 2022
Date Submitted to PubMed: May 23, 2022

The final, peer-reviewed published version of this preprint can be found here:

Causal Modeling to Mitigate Selection Bias and Unmeasured Confounding in Internet-Based Epidemiology of COVID-19: Model Development and Validation

Stockham N, Washington P, Chrisman B, Paskov K, Jung JY, Wall DP

Causal Modeling to Mitigate Selection Bias and Unmeasured Confounding in Internet-Based Epidemiology of COVID-19: Model Development and Validation

JMIR Public Health Surveill 2022;8(7):e31306

DOI: 10.2196/31306

PMID: 35605128

PMCID: 9307267

Causal Modeling to Mitigate Selection Bias and Unmeasured Confounding in Internet-Based Epidemiology of COVID-19: Model Development and Validation

  • Nathaniel Stockham; 
  • Peter Washington; 
  • Brianna Chrisman; 
  • Kelley Paskov; 
  • Jae-Yoon Jung; 
  • Dennis P. Wall

ABSTRACT

Background:

Selection bias and unmeasured confounding are fundamental problems in epidemiology that threaten a study's internal and external validity. These phenomena are particularly dangerous in internet-based public health surveillance, where traditional mitigation and adjustment methods are inapplicable, unavailable, or out of date. Recent theoretical advances in causal modeling can mitigate these threats, but these innovations have not been widely deployed in the epidemiological community.

Objective:

The purpose of our paper is to demonstrate the practical utility of causal modeling for detecting unmeasured confounding and selection bias through nonparametric graphical and statistical criteria. These criteria can also be used to falsify proposed models, which aids epidemiological model selection with the aim of minimizing estimate bias. We implement this approach in a real-world epidemiological study of the COVID-19 cumulative infection rate during the spring 2020 epidemic in New York City.

Methods:

We collected primary data from Qualtrics surveys of Amazon Mechanical Turk crowd workers residing in New Jersey and New York State across two sampling periods: April 11-14 and May 8-11, 2020. The surveys queried the subjects on household health status and demographic characteristics.

Results:

We collected 527 and 513 responses in the first and second survey periods, respectively. Response demographics were highly skewed toward younger age in both survey periods. Despite the extremely strong relationship between age and COVID-19 symptoms, we recovered minimally biased estimates of the cumulative infection rate using only primary data, with relative biases of +3.8% and -0.7% against the officially reported rate for the first and second survey periods, respectively.
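
As a reading aid, the signed relative bias quoted above can be sketched as follows. This assumes the conventional definition, (estimate - reference) / reference, which the abstract does not spell out; the function name and the example rates are hypothetical and are not study data.

```python
def relative_bias(estimate: float, reference: float) -> float:
    """Signed relative bias of an estimate against a reference value.

    Assumed definition: (estimate - reference) / reference, returned as a
    fraction (multiply by 100 for a percentage).
    """
    if reference == 0:
        raise ValueError("reference must be nonzero")
    return (estimate - reference) / reference


# Hypothetical rates chosen only to illustrate the arithmetic; not study data.
survey_estimate = 0.21   # estimated cumulative infection rate
official_rate = 0.20     # officially reported cumulative infection rate

print(f"{relative_bias(survey_estimate, official_rate):+.1%}")
```

A positive value means the survey-based estimate exceeds the reference rate, and a negative value means it falls below it.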

Conclusions:

We successfully recovered accurate estimates of a key epidemiological parameter from an internet-based crowdsourced sample despite considerable selection bias and unmeasured confounding in the primary data. This real-world implementation demonstrates how simple applications of structural causal modeling can be used to determine falsifiable model conditions, detect selection bias and confounding factors, and minimize estimate bias through model selection in a novel epidemiological context.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.