Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Jun 20, 2021
Date Accepted: Sep 15, 2021
Date Submitted to PubMed: Sep 28, 2021
Predicting COVID-19 related healthcare resource utilization across a statewide patient population
ABSTRACT
Background:
The COVID-19 pandemic has highlighted the inability of health systems to leverage existing system infrastructure to rapidly develop and apply broad analytical tools that could inform state- and national-level policymaking as well as patient care delivery in hospital settings. COVID-19 has also highlighted systemic disparities in health outcomes and access to care based on race/ethnicity, gender, income level, and the urban-rural divide. While the US seems to be recovering from the COVID-19 pandemic due to widespread vaccination efforts and increased public awareness, there is an urgent need to address the aforementioned challenges.
Objective:
To assess the feasibility of leveraging broad, statewide datasets for population health-driven decision making by developing robust analytical models that predict COVID-19 related healthcare resource utilization across patients served by Indiana’s statewide Health Information Exchange (HIE).
Methods:
We leveraged comprehensive datasets obtained from the Indiana Network for Patient Care (INPC) to train decision forest-based models that predicted patient-level need for healthcare resource utilization. To assess models for potential biases, we tested model performance against sub-populations stratified by age, race/ethnicity, gender, and residence (urban vs. rural).
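The stratified bias check described above can be illustrated with a minimal sketch. It assumes each patient record already carries a binary outcome label and a binary model prediction. The field names (`label`, `prediction`, `residence`), the helper functions, and the toy records are hypothetical and are not taken from the study's data or code.

```python
from collections import defaultdict


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def stratified_metrics(records, group_key):
    """Compute precision, sensitivity, and accuracy per sub-population.

    `records` is a list of dicts with hypothetical keys "label" and
    "prediction" plus the stratification attribute named by `group_key`.
    """
    groups = defaultdict(lambda: ([], []))
    for record in records:
        y_true, y_pred = groups[record[group_key]]
        y_true.append(record["label"])
        y_pred.append(record["prediction"])

    results = {}
    for name, (y_true, y_pred) in groups.items():
        tp, fp, tn, fn = confusion_counts(y_true, y_pred)
        results[name] = {
            "n": len(y_true),
            "precision": safe_ratio(tp, tp + fp),
            "sensitivity": safe_ratio(tp, tp + fn),
            "accuracy": safe_ratio(tp + tn, len(y_true)),
        }
    return results


# Toy, made-up records used only to exercise the functions.
records = [
    {"residence": "urban", "label": 1, "prediction": 1},
    {"residence": "urban", "label": 0, "prediction": 1},
    {"residence": "urban", "label": 0, "prediction": 0},
    {"residence": "urban", "label": 1, "prediction": 1},
    {"residence": "rural", "label": 1, "prediction": 0},
    {"residence": "rural", "label": 1, "prediction": 1},
    {"residence": "rural", "label": 0, "prediction": 0},
    {"residence": "rural", "label": 0, "prediction": 0},
]

by_residence = stratified_metrics(records, "residence")
for group, metrics in by_residence.items():
    print(group, metrics)
```

The same function could be called with another stratification attribute, such as an age band or gender field, to compare per-group metrics side by side. Whether the differences between groups are statistically significant would require a separate test, which this sketch does not attempt.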
Results:
We identified a cohort of 96,190 patients from 957 zip codes spread across the state of Indiana. We trained decision models that predicted healthcare resource utilization using the most impactful features (~100) out of a total of 1172 features created. Each model and stratified sub-population under test reported precision scores > 70%, accuracy and AUC ROC scores > 80%, and sensitivity scores ~>90%. We noted statistically significant variations in model performance across stratified sub-populations identified by age, race/ethnicity, gender, and residence (urban vs. rural).
Conclusions:
This study presents the possibility of developing decision models capable of predicting patient-level healthcare resource utilization across a broad statewide region with considerable predictive performance. However, our models present statistically significant variations in performance across stratified sub-populations of interest. Further efforts are necessary to identify root causes of these biases and to rectify them.
Clinical Trial: NA
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.