Eleven routine clinical features predict COVID-19 severity uncovered by machine learning of longitudinal measurements

https://doi.org/10.1016/j.csbj.2021.06.022
Under a Creative Commons license
open access

Abstract

Severity prediction of COVID-19 remains one of the major clinical challenges of the ongoing pandemic. Here, we recruited a cohort of 144 COVID-19 patients, resulting in a data matrix containing 3,065 readings for 124 types of measurements over 52 days. A machine learning model was established to predict disease progression based on this cohort, which was divided into training, validation, and internal test sets. A classifier built from a panel of eleven routine clinical factors predicted COVID-19 severity with an accuracy of over 98% in the discovery set. Validation of the model in an independent cohort of 25 patients achieved an accuracy of 80%. The overall sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were 0.70, 0.99, 0.93, and 0.93, respectively. Our model captured predictive dynamics of lactate dehydrogenase (LDH) and creatine kinase (CK) while their levels were in the normal range. This model is accessible at https://www.guomics.com/covidAI/ for research purposes.
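As a reminder of how the four reported quantities relate to one another, the following minimal sketch computes accuracy, sensitivity, specificity, PPV, and NPV from the confusion-matrix counts of a binary severe versus non-severe classifier. The `ConfusionCounts` container, the `diagnostic_metrics` function, and the example counts are hypothetical and chosen only for illustration; they are not the study's data, and this is not the authors' code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts for a binary classifier (hypothetical container)."""
    tp: int  # severe patients predicted as severe
    fp: int  # non-severe patients predicted as severe
    tn: int  # non-severe patients predicted as non-severe
    fn: int  # severe patients predicted as non-severe


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the standard diagnostic metrics from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "ppv": _ratio(c.tp, c.tp + c.fp),          # positive predictive value
        "npv": _ratio(c.tn, c.tn + c.fn),          # negative predictive value
    }


# Made-up counts for illustration only; NOT taken from the study.
example = ConfusionCounts(tp=8, fp=2, tn=18, fn=2)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of severe cases in the cohort, so values measured in one cohort need not carry over to a population with a different severity rate.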

Abbreviations

PPV: positive predictive value
NPV: negative predictive value
LDH: lactate dehydrogenase
CK: creatine kinase
CT: computed tomography
CRP: C-reactive protein
RT-PCR: reverse transcriptase-polymerase chain reaction
HIS: hospital information system
LOS: length of stay
ESR: erythrocyte sedimentation rate
PCT: procalcitonin
CFDA: China Food and Drug Administration
LOESS: locally estimated scatterplot smoothing
GA: genetic algorithm
SVM: support vector machine
SHAP: SHapley Additive exPlanations
BASO#: basophil counts
AST: aspartate aminotransferase
Mg: magnesium
GGT: gamma glutamyl transpeptidase
APTT: activated partial thromboplastin time
SaO2: oxygen saturation
ROC: receiver operating characteristics
AUC: area under the curve
NETs: neutrophil extracellular traps
TT: thrombin time
LAC: lactate
ABG: arterial blood gas
eGFR: estimated glomerular filtration rate

Keywords

COVID-19
SARS-CoV-2
Severity prediction
Machine learning
Routine clinical test
Longitudinal dynamics

1 Co-first authors.