Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Aug 30, 2020
Date Accepted: Mar 11, 2021
Date Submitted to PubMed: Apr 9, 2021

The final, peer-reviewed published version of this preprint can be found here:

Chen Y, Ouyang L, Bao FS, Li Q, Han L, Zhang H, Zhu B, Ge Y, Robinson P, Xu M, Liu J, Chen S

A Multimodality Machine Learning Approach to Differentiate Severe and Nonsevere COVID-19: Model Development and Validation

J Med Internet Res 2021;23(4):e23948

DOI: 10.2196/23948

PMID: 33714935

PMCID: 8030658

Accurate Severe vs Non-severe COVID-19 Clinical Type Classification: a Multimodality Machine Learning Study

  • Yuanfang Chen
  • Liu Ouyang
  • Forrest S. Bao
  • Qian Li
  • Lei Han
  • Hengdong Zhang
  • Baoli Zhu
  • Yaorong Ge
  • Patrick Robinson
  • Ming Xu
  • Jie Liu
  • Shi Chen

ABSTRACT

Background:

Effectively and efficiently diagnosing COVID-19 patients with the correct clinical type is essential to achieving optimal outcomes for patients and to reducing the risk of overloading the health care system. Currently, severe and non-severe COVID-19 types are differentiated by only a few features, which do not comprehensively characterize the complicated pathological, physiological, and immunological responses to SARS-CoV-2 invasion across clinical types. In addition, these type-defining features may not be readily testable at the time of diagnosis.

Objective:

This study aimed to accurately differentiate between severe and non-severe COVID-19 clinical types based on multiple medical features and to provide reliable predictions for clinical decision support.

Methods:

In this study, we recruited 214 patients with confirmed non-severe COVID-19 and 148 patients with confirmed severe COVID-19. Each patient's clinical information (26 features) and laboratory test results (26 features) upon admission were acquired as two input modalities. Exploratory analyses demonstrated that these features differed substantially between the two clinical types. Machine learning random forest (RF) models were developed and validated to differentiate COVID-19 clinical types, using either all features within each modality or the top 5 features from each modality combined, as illustrated in the sketch below.
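To make the modeling setup concrete, the following Python sketch mirrors the per-modality random forest structure described above on placeholder data. The synthetic feature matrices, estimator settings, and cross-validation scheme are illustrative assumptions, not the authors' exact pipeline.

```python
# A minimal sketch of the per-modality random forest setup, assuming synthetic
# stand-in data for the 362 patients (214 non-severe, 148 severe) with 26
# clinical and 26 laboratory features each.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

n_nonsevere, n_severe = 214, 148
y = np.concatenate([np.zeros(n_nonsevere), np.ones(n_severe)])  # 0 = non-severe, 1 = severe
X_clinical = rng.normal(size=(n_nonsevere + n_severe, 26))      # placeholder clinical features
X_laboratory = rng.normal(size=(n_nonsevere + n_severe, 26))    # placeholder laboratory features

# One RF model per modality, evaluated with 5-fold cross-validation.
for name, X in [("clinical", X_clinical), ("laboratory", X_laboratory)]:
    rf = RandomForestClassifier(n_estimators=500, random_state=0)
    acc = cross_val_score(rf, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name} modality RF accuracy: {acc:.2f}")
```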

Results:

Using clinical and laboratory results independently as input, the RF models achieved 90% and 95% predictive accuracy, respectively. Feature importance scores were then evaluated, and the top 5 features from each modality were identified (age, hypertension, cardiovascular disease, gender, and diabetes for the clinical modality; D-dimer, high-sensitivity troponin I (hsTnI), absolute neutrophil count, interleukin 6 (IL-6), and lactate dehydrogenase (LDH) for the laboratory modality, in descending order of importance). Using these top 10 multimodal features as the only input instead of all 52 features combined, the RF model achieved 99% predictive accuracy.
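The feature-ranking step can be sketched as follows: fit an RF per modality, rank features by impurity-based importance, then retrain on the top 5 features from each modality combined. The data, column handling, and hyperparameters here are illustrative assumptions rather than the published analysis.

```python
# A minimal sketch of importance-based feature selection and the combined
# top-10 multimodal RF model; placeholder data and settings are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def top_k_indices(X, y, k=5, seed=0):
    """Return indices of the k most important features under a fitted RF."""
    rf = RandomForestClassifier(n_estimators=500, random_state=seed).fit(X, y)
    return np.argsort(rf.feature_importances_)[::-1][:k]

# X_clinical, X_laboratory, and y as in the previous sketch.
rng = np.random.default_rng(0)
y = np.concatenate([np.zeros(214), np.ones(148)])
X_clinical = rng.normal(size=(362, 26))
X_laboratory = rng.normal(size=(362, 26))

# Combine the top 5 features from each modality into a 10-feature input.
X_top10 = np.hstack([
    X_clinical[:, top_k_indices(X_clinical, y)],
    X_laboratory[:, top_k_indices(X_laboratory, y)],
])

rf_combined = RandomForestClassifier(n_estimators=500, random_state=0)
acc = cross_val_score(rf_combined, X_top10, y, cv=5, scoring="accuracy").mean()
print(f"Top-10 multimodal RF accuracy: {acc:.2f}")
```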

Conclusions:

These findings shed light on how the human body reacts to SARS-CoV-2 invasion as a whole and provide insights into effectively evaluating the severity of patients with COVID-19 based on more commonly available medical features when gold-standard features are not available. We suggest that clinical information be used as an initial screening tool for self-evaluation and triage, while laboratory test results should be applied when accuracy is the priority.


Citation

Please cite as:

Chen Y, Ouyang L, Bao FS, Li Q, Han L, Zhang H, Zhu B, Ge Y, Robinson P, Xu M, Liu J, Chen S

A Multimodality Machine Learning Approach to Differentiate Severe and Nonsevere COVID-19: Model Development and Validation

J Med Internet Res 2021;23(4):e23948

DOI: 10.2196/23948

PMID: 33714935

PMCID: 8030658

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license upon publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.
