Predictive modelling and identification of critical variables of mortality risk in COVID-19 patients

Daramola, Olawande; Kavu, Tatenda Duncan; Kotze, Maritha J.; Marnewick, Jeanine L.; Sarumi, Oluwafemi A.; Kabaso, Boniface; Moser, Thomas; Stroetmann, Karl; Fwemba, Isaac; Daramola, Fisayo; Nyirenda, Martha; van Rensburg, Susan J.; Nyasulu, Peter S.

doi:10.1038/s41598-023-46712-w

Download PDF

Article
Open access
Published: 16 January 2025

Predictive modelling and identification of critical variables of mortality risk in COVID-19 patients

Olawande Daramola¹,
Tatenda Duncan Kavu¹,
Maritha J. Kotze^3,4,
Jeanine L. Marnewick⁵,
Oluwafemi A. Sarumi⁸,
Boniface Kabaso¹,
Thomas Moser⁶,
Karl Stroetmann⁷,
Isaac Fwemba²,
Fisayo Daramola²,
Martha Nyirenda²,
Susan J. van Rensburg³ &
…
Peter S. Nyasulu²

Scientific Reports volume 15, Article number: 2184 (2025) Cite this article

1918 Accesses
Metrics details

Subjects

Abstract

South Africa was the most affected country in Africa by the coronavirus disease 2019 (COVID-19) pandemic, where over 4 million confirmed cases of COVID-19 and over 102,000 deaths have been recorded since 2019. Aside from clinical methods, artificial intelligence (AI)-based solutions such as machine learning (ML) models have been employed in treating COVID-19 cases. However, limited application of AI for COVID-19 in Africa has been reported in the literature. This study aimed to investigate the performance and interpretability of several ML algorithms, including deep multilayer perceptron (Deep MLP), support vector machine (SVM) and Extreme gradient boosting trees (XGBoost) for predicting COVID-19 mortality risk with an emphasis on the effect of cross-validation (CV) and principal component analysis (PCA) on the results. For this purpose, a dataset with 154 features from 490 COVID-19 patients admitted into the intensive care unit (ICU) of Tygerberg Hospital in Cape Town, South Africa, during the first wave of COVID-19 in 2020 was retrospectively analysed. Our results show that Deep MLP had the best overall performance (F1 = 0.92; area under the curve (AUC) = 0.94) when CV and the synthetic minority oversampling technique (SMOTE) were applied without PCA. By using the Shapley Additive exPlanations (SHAP) model to interpret the mortality risk predictions, we identified the Length of stay (LOS) in the hospital, LOS in the ICU, Time to ICU from admission, days discharged alive or death, D-dimer (blood clotting factor), and blood pH as the six most critical variables for mortality risk prediction. Also, Age at admission, Pf ratio (PaO2/FiO2 ratio), troponin T (TropT), ferritin, ventilation, C-reactive protein (CRP), and symptoms of acute respiratory distress syndrome (ARDS) were associated with the severity and fatality of COVID-19 cases. The study reveals how ML could assist medical practitioners in making informed decisions on handling critically ill COVID-19 patients with comorbidities. It also offers insight into the combined effect of CV, PCA, and SMOTE on the performance of ML models for COVID-19 mortality risk prediction, which has been little explored.

Machine learning based prediction models for the prognosis of COVID-19 patients with DKA

Article Open access 21 January 2025

Multivariable mortality risk prediction using machine learning for COVID-19 patients at admission (AICOVID)

Article Open access 17 June 2021

Machine learning-based prediction of in-ICU mortality in pneumonia patients

Article Open access 17 July 2023

Introduction

The effect of the coronavirus disease 2019 (COVID-19) pandemic on healthcare systems all over the world has been devastating¹. As a result, various clinical intervention methods have been employed in the detection, diagnosis, and prognosis of COVID-19 cases, including clinical/laboratory methods and medical image-based diagnosis^2,3. Lately, the use of artificial intelligence (AI) methods for tackling the challenges of COVID-19 has received significant attention. Some of the efforts so far reported include the application of machine learning (ML) as a subbranch of AI to detect COVID-19 infections^2,4,5, the prognosis of COVID-19 progression^6,7,8,9,10, and the determination of COVID-19 treatment outcomes^11,12,13,14. However, limited instances of applying AI to the prognosis of COVID-19 cases in Africa have been reported in the literature.

South Africa recorded the highest number of COVID-19 cases in Africa (http://www.worldometers.info/coronavirus). A large percentage of the population also has pre-existing comorbidities such as human immunodeficiency virus (HIV), tuberculosis (TB), and diabetes. comorbidities in patients with COVID-19 complicate prevention strategies and are more difficult to manage

¹⁵. AI-enabled decision-making for treating COVID-19 can enhance healthcare quality, particularly in settings with a paucity of medical expertise¹⁶. However, this is not common in many African countries. Healthcare practitioners mostly rely on observations from clinical examinations of patients and their own experiences when making clinical decisions.

Recently, many instances of prediction and prognosis of COVID-19 using ML have been reported in the literature. ML algorithms have been shown to possess the ability to predict COVID-19 outcomes accurately. Specific areas of application of ML include the prediction of future situation of COVID-19 regarding emergence of new cases (infection), deaths within a country or across regions^4,5,13,17,18; detection of COVID-19 infection, including prediction of COVID-19 outcome (death/survival) of critically ill patients^{12,19,20,21,22}; the use of X-ray images^2,23,24,25; and prediction of mortality risk and determination of the prognostic value of specific clinical variables^{6,7,8,9,10,11,14}. Accurate COVID-19 mortality prediction is essential because it will enable health management systems to allocate adequate and appropriate healthcare resources to the most critical COVID-19 cases. However, most of the previous studies on COVID-19 mortality prediction adopted an approach that involves feature selection, model training, and comparison of the performance of different ML algorithms. Studies that investigate the effect of the selective application of treatments, such as cross-validation (CV), principal component analysis (PCA) based dimensionality reduction, and the synthetic minority oversampling technique (SMOTE), or their combinations, on the performance of ML models regarding COVID-19 mortality prediction are rare. Thus, an understanding of the effect of these treatments on the performance of ML models during predictive modelling of COVID-19 mortality is still lacking. This knowledge gap requires to investigate the effect of PCA, CV and SMOTE on the performance of ML models when they are used for COVID-19 mortality prediction.

CV assesses how much a trained ML model can generalise on an independent dataset²⁶. It is a method designed to improve a predictive model's generalizability. PCA is a dimensionality reduction technique that enables large datasets to be transformed into one with reduced dimensions (fewer variables) that is easier to explore and can be processed more efficiently and faster by ML algorithms²⁶. Generally, applying PCA can lead to accuracy trade-offs, but it could still be worthwhile in many cases. Hence, it is an option for ML researchers when dealing with large datasets. Also, we went further in this study to interpret the mortality risk predictions using the shapley additive exPlanations (SHAP) model to determine the most critical variables with prognostic value regarding COVID-19 mortality. This is to provide a basis for improved decision-making by medical experts.

Summarily, this study has two objectives. The first is to perform predictive modelling of the mortality risk of COVID-19 patients, which could guide decision-making on their treatment by experimenting with some specific treatments (CV, PCA, and SMOTE). The second is identifying the most critical variables determining mortality risk in critically ill COVID-19 patients.

The rest of this paper is organised as follows. Section "Methodology" presents the methodology of the study. Section "Results" provides the experimental results obtained from the three selected ML models when different options of CV and PCA were applied. Section "Discussion" discusses the results, while the paper concludes in Section "Conclusions" with a summary and plan for future work.

Methodology

Study design and setting

The dataset used in this study pertains to a cohort of COVID-19 patients admitted into the ICU of Tygerberg Hospital in Cape Town, South Africa, during the first wave of COVID-19 from March to November 2020²⁷. Figure 1 shows the process workflow of the experimentation that we did. Firstly, data was collected from the hospital. The dataset was then split into training and testing sets. The training set was used to train the Deep MLP, XGBoost and a SVM. Later, we evaluated the performance of the ML models on the test set using standard classification metrics. We benchmarked the performance of the models using the F1-score, accuracy, recall, precision and AUC score. The Classification Report method of Scikit Learn framework was used to generate the F1-score, accuracy, recall, and precision of the ML models. The report also contains the Macro Average (the average score across all classes for precision, recall, and F1) and the Weighted Average (the mean value of a metric per class (e.g., F1, recall, precision) while considering the support of each class).

After this, we applied the SHAP model, a unified framework for interpreting predictions by ML models to investigate the model interpretability²⁸. SHAP is able to determine the level of the global importance of each feature to the prediction generated by a ML model. SHAP is a mathematical method based on game theory that can be used to explain any machine learning model's predictions by calculating each feature's contribution to the prediction²⁸. We then compared the important features extracted through SHAP for each ML model to derive our conclusion.

Predictor and outcome variables

The dataset consists of several aspects, such as:

Socio-demographic and lifestyle characteristics (e.g. age, gender, ethnicity, residential area, occupation, workplace, smoking status, alcohol use, drug use etc.).
Exposure history and symptoms (previous contact with infected persons or places; symptom types, severity score, etc.). This may not be available for critically ill patients.
Comorbidities (such as Hypertension, chronic obstructive pulmonary disease (COPD), diabetes, heart diseases, TB, HIV, as well as the discharged alive/death outcome)
Laboratory data such as full blood count (FBC), Ferritin, Procalcitonin (PCT), MicroRNAs, N-terminal pro–B-type natriuretic peptide (NT-proBNP) and inflammatory cytokine)
Medication data (e.g., Steroids, antibiotics, antiviral, antifungal, medication for glycaemic control, antihypertensive etc.)
Clinical monitoring – status (i.e., response to treatment) of the patient at the time of clinical examination, mechanical ventilation (e.g. Pulmonary Oxygen, Carbon Dioxide (CO2), Potassium, Calcium etc.)

The dataset consists of 154 features and 490 rows (records). There is one target variable–Discharged alive or Death. A snapshot of the dataset is presented in Tables 1 and 2. The dataset was imbalanced in terms of the target outcome (death or discharge) in the ratio of 60:35.

Table 1 Numeric/quantitative variables in the dataset.

Full size table

Table 2 Categorical/qualitative variables in the dataset.

Full size table

Data pre-processing

The data pre-processing techniques were applied to clean up the data and transform the data to numerical form (the numeric data points were scaled [0–1] by using the min–max normalisation function, while categorical variables were encoded using one-hot encoding). The dataset contained some missing values; therefore, the multivariate imputation technique was applied to fix missing values. Features (variables) that had 30% or more missing values and those considered redundant due to duplication were removed. A total of 55 variables were removed from the dataset that was used for analytics. These consist of 44 variables with more than 30% missing values and 11 variables regarded as redundant. Thus, after pre-processing, we were left with 99 variables and 490 rows used for our experimentation. The numeric variables in the dataset are shown in Table 1, while the categorical variables are presented in Table 2.

Principal component analysis (PCA)

PCA is a statistical process that converts the observations of correlated variables into a set of linearly uncorrelated variables with the help of orthogonal transformation, leading to a smaller version of the dataset with fewer variables considered significant(dimensionality reduction). These newly transformed (significant) features are called the Principal Components. Although reducing the number of variables in a dataset could impact the accuracy of results, the trade-off in easier visualisation of results and less computational complexity makes the training of machine learning models faster²⁹. Using PCA is particularly useful when dealing with a dataset with high dimensionality where the number of columns is more than the number of rows of data. The process of PCA involves:

I.
Ensuring that the ranges of continuous initial variables are standardised
II.
Identifying correlations by computing the covariance matrix
III.
Identify the principal components by determining the eigenvectors and eigenvalues of the covariance matrix
IV.
Creating a feature vector to decide which principal components to keep, and
V.
Recasting the data along the principal components’ axes.

Cross-validation

Cross-validation enables the training of ML models by using different subsets of the training dataset and evaluating them using a unique portion of that training dataset per time to enhance the generalisation of a trained ML model. Some of the more known techniques of cross-validation include³⁰:

I.
Leave One Out Cross-Validation (LOOCV) This entails leaving out one data point from the dataset while the rest is used to train the ML model for each learning set. The process is repeated for each data point so that k samples will have k training sets and k test sets. The process can help to minimise bias since all data points are used.
II.
K-fold cross-validation This entails dividing the training set into k-folds (k equal groups), where each group is called a fold. At any given time, k-1 folds are used to train the model while the reserved fold is used as the test set. The process is repeated over k iterations and ensures that a unique fold is used as the testing set for variant compositions of k-1 folds that are used for training. For example, assume that k = 10 on the first iteration, fold-2 to fold-10 are used for training, while fold-1 is used as the test set. On the second iteration, fold − 1 + fold − 3 + fold − 4 + ‚Ä¶ fold-10 is used for training, while fold-2 is used as the test set. The same procedure of k-1 folds for training and a unique (reserved) fold for testing is repeated for all k (10) iterations.
III.
Stratified K-Fold Cross-validation This method is a variation of the k-fold cross-validation designed for cases of imbalanced datasets. It involves selecting the value k and splitting the dataset into k-folds while ensuring that each fold contains approximately the same proportional representation of target classes as the complete data set. Then, selecting k-1 folds as the training set and the remaining fold as the test set;, and training the model with a particular k-1 folds training set per time. Thereafter, validating the model with the unique test set, and continuing the process for k iterations. In the end, obtaining the final average score of the model's performance.

Models training

This section presents theoretical perspectives on the three MLP Deep MLP, XGBoost, and SVM used in this study.

To overcome above-mentioned challenges, three ML algorithms, including deep multilayer perceptron (Deep MLP), extreme gradient boosting trees (XGBoost), and support vector machines (SVM) were selected for our predictive analytics. Although no particular ML algorithm works best for every problem or dataset, the selected methods rank among the best classification algorithms when dealing with tabular datasets based on evidence from the literature and ML competitions like Kaggle^31,32. For example, the multiple hidden layers in Deep ML allow it to model highly non-linear relationships in data. This is crucial for tasks where the relationship between inputs and outputs is complex and not easily captured by linear models^32,33. SVM has good generalisation performance and error bounds in real-life application problems, while XGBoost supports k-fold CV during model training, helping to assess the model's performance and reduce the risk of overfitting³². The selected algorithms more detailed as follow:

XGBoost

Boosting is an ensemble technique that combines a set of weak learners and aggregates their predictions to deliver improved prediction accuracy. Generally, gradient boosting represents an ensemble of machine learning models composed of decision trees to solve classification and prediction tasks. The Extreme Gradient Boosted Trees (XGBoost) algorithm works by growing a set of classification and regression trees (CART) sequentially to ensure the misclassification rate is reduced during subsequent iterations^32,33. Compared to general gradient boosting, XGBoost can prevent model over-fitting by incorporating two other features apart from the regularised loss function. Firstly, it uses the learning rate hyper-parameter to scale down the weights of each new tree to ensure that a single tree does not dominate leading to the final score. Also, it uses column sampling by using only column-wise data from the training data set to build each tree. Combining these three attributes makes XGBoost outperform traditional gradient boosting methods.

Deep MLP

The MLP is a model of an ANN. ANN emulates the human biological neural system to solve computational problems using adaptive learning. The nodes of an ANN called artificial neurons correspond to the human neuron. The MLP can be used to perform functional approximation to solve classification and regression problems.

Different variants of the backpropagation algorithm can be used to train the MLP network by exposing it to samples of input–output pairs. The difference between the network's output and the target output is fed back into the network to update the weights that connect the hidden-output layers and the input-hidden layers. The knowledge of the network is stored in the set of weights (w) that connect the neurons^34,35. The computation at each neuron is the linear combination of all inputs and weights that feed into it. A Deep MLP is a feedforward ANN with multiple (more than one) hidden layers fully connected in a dense architecture. Various activation functions such as rectifier (ReLU), sigmoid, and hyperbolic activation functions can be used in the Deep MLP.

SVM

The SVM is a supervised learning model that can perform classification and regression tasks and is capable of handling both linear and non-linear problems. It establishes a margin to determine data points that fall into specific classes. A typical SVM operation entails finding the best hyperplane (separating line) that separates the classes in a dataset into distinct classes. The hyperplanes represent decision boundaries that determine the class to which specific data points are classified. The support vectors are the data points from different classes closest to a hyperplane. The distance between the support vectors and the hyperplane is called the margin. The SVM algorithm aims to find the optimal hyperplane (decision boundary) with the maximum (longest) margin to enable a maximum number of points to be correctly classified in the training set. According to³⁴, the SVM is a function that correctly classifies two classes with a maximum margin. SVM works with two kernel parameters: the box constraint (c) and the curvature weight (Gamma). The c is a regularisation term set before training the model and used to control the penalty for misclassification. It is the trade-off between having a smooth decision boundary and being able to classify all data points in the training set correctly. The Gamma is a hyperparameter that determines the level of influence of a single training point.

Hyperparameters for training the ML models

The three ML models–DMLP, XGBoost, and SVM- were trained using optimal parameters generated by the grid search function in Scikit Learn. The optimal parameters that were automatically selected for training the models are shown in Table 3.

Table 3 Hyperparameters for training the ML models.

Full size table

At first, we considered four options in our experimentation with the three ML models (Deep MLP, SVM, and XGBoost):

i.
PCA + No cross-validation
ii.
PCA + cross-validation
iii.
No-PCA + No cross-validation
iv.
No-PCA + cross-validation

After determining the two options that produced the best performance, we then applied the synthetic minority oversampling technique (SMOTE) to these two options to investigate the model performance for a balanced dataset. Thus, for the three ML models, we considered the options:

xxii.
No-PCA + No cross-validation + SMOTE
xxiii.
No-PCA + cross-validation + SMOTE

Metrics for evaluation of the machine learning models

After training machine learning models, it is always essential to measure the performance of the trained model on the test data set. In this study, we used standard classification metrics such as precision, recall, F1-score, and the Area Under the Receiver Operating curve (AUC) score for each ML model selected for the study. These metrics are explained as follows:

i.
Precision This is the measure of positive predictions made that are truly positive. It is the ratio of the number of true positives (TP) by the sum of true positives (TP) and false positives (FP). $Precision\left(P\right)=\frac{TP}{TP}+FP$
ii.
Recall This measures how well the ML model can predict the true positive instances. It is the total number of true positives (TP) divided by the sum of TP and false negatives (FN). It is a measure of how well the most important occurrence is predicted. $Recall\left(R\right)=\frac{TP}{TP}+FN$
iii.
Accuracy the percentage of prediction that is correct. It is measured by dividing the number of correct predictions by the total number of predictions. It is not a good measure to assess a ML model when the dataset is imbalanced⁴.
$$Accuracy=Noofcorrectlyclassified\frac{samples}{total}numberofsamples$$
iv.
F1-score The harmonic mean of precision and recall gives a more balanced description of model performance. It is a value between 0 and 1. The F1 score is a suitable metric for assessing model performance for an imbalanced dataset. $F-Measure=2\frac{PR}{P}+R$. According to³⁶, FI score is interpreted as follows: F1 > 0.9 (very good); 0.8–0.9 (good); 0.5–0.8 (okay); < 0.5 (not good).
v.
AUC score The Area Under the Curve (AUC) measures how well a classifier can distinguish between classes and is used as a summary of the receiver operating curve (ROC). The AUC score is rated as follows: excellent (0.9–1), good (0.8–0.9), fair (0.7–0.8), poor (0.6–0.7), and failed (0.5–0.6).

Ethics approval and consent to participate

All methods used in this study followed relevant guidelines and regulations. Ethical approval and waiver of consent were obtained from the Health Research Ethics Committee of Stellenbosch University (Approval number: N20/04/002_COVID-19). The National Health Laboratory Service (NHLS) of South Africa also approved the use of their data for the study (Reference: SRSR3982441). The data analytics experimentation was approved by the Research Ethics Committee of the Faculty of Informatics and Design of the Cape Peninsula University of Technology (30/Daramola/2021).

Results

The results obtained from evaluating the ML models when the six options were implemented are presented next.

PCA without cross-validation

Table 4 shows the performance of the three models measured using precision, recall, F1-score, and accuracy when trained with data obtained after applying PCA on the original data without k-fold cross-validation. Figure 2 shows the AUC scores of the three models when trained under the same condition.

Table 4 Performance metrics when PCA was applied to training data without cross-validation.

Full size table

PCA with cross-validation

Table 5 shows the performance of the three ML models after PCA plus k-fold cross-validation was applied to the training data, while Fig. 3 shows the AUC scores.

Table 5 Performance metrics when both PCA and cross-validation were applied to training data.

Full size table

Non-PCA without cross-validation

Table 6 shows the performance of the three models when trained with the original data without using PCA or k-fold cross-validation, while Fig. 4 shows the AUC scores of the three models for the same scenario.

Table 6 Performance metrics when none of PCA and cross-validation was applied for training.

Full size table

Non-PCA with cross-validation

Table 7 shows the performance of the three models measured using precision, recall, F1-score, and accuracy when trained without applying PCA, but k-fold cross-validation was used. Figure 5 also shows the corresponding AUC scores of the three models.

Table 7 Performance metrics when ML models were trained without PCA but with cross-validation.

Full size table

Effect of applying the synthetic minority oversampling technique (SMOTE)

The results obtained from our experiments on the initial four options showed that the ML models generally performed better when PCA was not applied. Since the dataset was imbalanced in terms of the target value in the ratio of 60:35, we applied SMOTE to increase instances of the minority class. This resulted in a training dataset of 488 records with an even distribution (50:50) of the target variable. We then compared instances of combining cross-validation plus SMOTE without PCA and when cross-validation and SMOTE without PCA were used. Tables 8 and 9 show the results obtained in these instances, while the corresponding AUC scores are shown in Figs. 6 and 7.

Table 8 Performance metrics when SMOTE and cross-validation without PCA were applied.

Full size table

Table 9 Performance metrics when SMOTE, without cross-validation and PCA, were applied.

Full size table

The summary overview of the different experiments is shown in Table 10.

Table 10 Summary of Results after experiments.

Full size table

The results in Table 10 show that in terms of F1 score and AUC score, the Deep MLP had a generally good performance when cross-validation (CV) without PCA was applied but had the best overall performance when CV and SMOTE were applied without PCA (F1 = 0.92; AUC = 0.94). The SVM (F1 = 0.83; AUC = 0.82) had the second-best performance when CV and SMOTE were applied without PCA. Thus, we found that, generally, the performance of both SVM and Deep MLP can be enhanced through CV, and they also perform very well without PCA. The XGBoost model (F1 = 081; AUC = 0.79) had the best performance of the three models when we used the raw data without applying any of CV, PCA or SMOTE. XGBoost also performed well with SMOTE, but it is not affected by CV and performed worse in all cases when PCA was applied.

Analysis of feature importance generated by ML models based on SHAP values

Based on the result of the experiments, we extracted instances where each of the three ML models had their best performance and examined the features that had the most influence on their prediction (feature importance) based on SHAP values. These are shown in Figs. 8, 9 and 10.

A comparison of the best-performing instances of the trained models is highlighted in Table 11, while the frequency of occurrence of specific features across the three ML models is shown in Table 12.

Table 11 Feature importance associated with predictions of ML models.

Full size table

Table 12 Frequency of occurrence of important features by ML Models.

Full size table

From the results in Table 12, the six most critical variables for predicting the mortality or survival of COVID-19 patients were: Length of stay in the hospital, Duration in ICU, Time to ICU from Admission, Days discharged alive or death, D-dimer, and Blood pH. In addition, we also found that other features such as Age at admission, PaO2/FiO2 ratio (Pf_ratio), TropT, Ferritin, ventilation, C-reactive protein (CRP), and Symptom of Acute respiratory distress syndrome (ARDS) (Symp_Acute_Res) are also critical determinants of outcome in terms of survival or death of a patient.

Discussion

We will discuss the results of our experiments from two perspectives that align with the goals of our study.

Lessons from applying PCA and cross-validation

We noticed that applying PCA in combination with cross-validation has not been commonly used for predicting the mortality risk of COVID-19. Most previous approaches have focused on applying feature selection techniques, such as the Pearson correlation³⁷, Chi-square³⁸, and recursive feature elimination²², to identify the relevant predictors to use for prediction without considering applying PCA. Table 13 shows an overview of recent papers on mortality prediction of COVID-19. Our observation is that although our accuracy, F1-score, and AUC results compare favourably with those of previous studies, the instances where the effect of application of PCA + cross-validation + SMOTE are not yet prevalent. However, the results of our research and those of others, as shown in Table 13, emphasise the potency of machine learning as a viable tool to aid decision-making on handling of COVID-19 cases and other diseases whenever a quality dataset is available.

Table 13 Overview of previous studies on mortality risk prediction using ML Models.

Full size table

Based on the results of our study, we learnt the following about the three selected ML models:

i.
The performance of deep MLP and SVM

Cross-validation can enhance the model performance of Deep MLP and SVM if the proper training parameters are selected. The Deep MLP had a generally good performance in terms of F1-score and AUC when cross-validation was applied (F1 = 0.92 (very good); AUC = 0.94 (excellent)). This observation was consistent irrespective of whether SMOTE was applied or not (see Table 10). The SVM also showed the same pattern because it had a good F1-score (0.83) and a good AUC score (0.82) with or without SMOTE when cross-validation was applied. These observations align with the postulations in the literature that model training can be generally enhanced through cross-validation to prevent overfitting and improve model generalisation on unseen data^26,30. Thus, cross-validation is recommended when using Deep MLP and SVM.

ii.
The effect of PCA on the performance of ML models

We found a reduction in model performance when PCA was applied to the three ML models (Deep MLP, SVM, XGBoost). However, the Deep MLP performed well when PCA plus cross-validation was applied. Considering that we did not have a large data set, the drop in model performance when PCA was applied compared to when it was not was quite significant (see Table 9). This observation suggests that the choice of whether to apply PCA on a tabular dataset should be weighed carefully depending on the choice of ML model for predictive analytics. Based on our observation, PCA plus cross-validation could be a good choice when using deep learning models. Our findings in this study also support this notion. However, for the other ML models, there may be better results in accuracy. Our perspective on this, which is reinforced by the result of our experimentation, aligns with an established school of thought that the application of PCA can lead to loss of accuracy but has benefits in terms of easier model visualisation and faster training, which could be a desirable trade-off for model accuracy in certain types of applications^26,30. For the medical domain, where model prediction accuracy is paramount, applying PCA on a tabular medical dataset may not be the first option. Instead, feature selection based on correlations between the independent variable and the target variable to work with a reduced number of variables, particularly for large data sets, is a better option to achieve better model accuracy. In all cases, the right choice that suits a specific dataset in terms of either applying PCA, feature selection through correlation, or using the whole feature set should be investigated through experimentation.

iii.
Performance of XGBoost

XGBoost is one of the most powerful ML algorithms, particularly when dealing with tabular (non-image) data. It is acclaimed for being successful in several Kaggle competitions. Based on our experimentation, we found that XGBoost (F1 = 0.81 (good); AUC = 0.79 (fair)) performed best when used with the raw dataset without any PCA, cross-validation or SMOTE when compared with the other two ML models (Deep MLP, SVM). This observation aligns with the acknowledged characteristics of XGBoost as profiled in the literature^38,39. According to³⁹, XGBoost has inbuilt regularisation, which avoids data overfitting problems. It also has an internal function for cross-validation and is well-equipped to deal with missing values. These features make XGBoost capable of dealing with raw data without needing external treatments that other ML algorithms require. Our experiment shows that an attempt to apply cross-validation externally to XGBoost leads to lower performance. Also, because it is sparse-aware, SMOTE yielded no significant performance improvement, while PCA caused a reduced performance.

Lessons from COVID-19 mortality prediction

From our study, we found that the six most critical variables for the prediction of mortality or survival of the COVID-19 patients admitted into the ICU (see Table 12) are (i) length of stay in the hospital; (ii) the Length of time spent in the ICU (Duration in ICU); (iii) Time to ICU from Admission; (iv) the number of days spent in the hospital before discharge or death; (v) the blood clotting factor (D-dimer), and (vi) the blood pH of the patient.

The first four variables collectively indicate the severity of the patient's illness level at the time of admission and its effect on the outcome. In comparison, the last two variables indicate the comorbidity condition of a patient. D-dimer, which is a measure of the blood clotting factor of a patient, could be indicative of the existence or absence of a comorbidity. Studies have shown that D-dimer has prognostic value in detecting severity and fatality associated with COVID-19 cases^40,41,42,43. According to⁴⁰, D-dimer can predict severe and fatal cases of COVID-19 with moderate accuracy. In⁴⁴, it was reported that elevated D-dimers (> 2590 ng mL⁻¹) and the absence of anticoagulant therapy could predict pulmonary embolism (PE) in hospitalised COVID-19 patients with severe infections.

Similarly⁴⁵, established a relationship between blood pH and fatal outcomes in critically ill COVID-19 patients at an intensive care unit. This also confirms the prognostic value of blood pH for COVID-19, which our ML models also detected. In addition⁴⁵, found that Blood pH value, mean arterial pressure, base excess, troponin, and procalcitonin were highly significant prognostic factors of in-hospital mortality.

Apart from D-dimers and Blood pH, other factors include age at admission, PaO2/FiO2 ratio (Pf_Ratio), TropT, Ferritin, ventilation, CRP, and Symptom of Acute respiratory distress syndrome (ARDS) also have a strong association with the severity and fatality of COVID-19 cases. This observation is supported by studies that have shown that the most frequently reported biological anomalies in COVID-19 patients include elevations of inflammatory markers such as C-reactive protein, D-dimers, Ferritin and interleukin-6^45,46,47. In addition, these findings also significantly complement the results of three studies that were previously conducted on the dataset of the same cohort in South Africa, as reported in^48,49,50. In⁴⁸, the authors found that raised neutrophil count and neutrophil/lymphocyte ratio (NLR) were associated with a worse outcome. At the same time, age, female sex, and D-dimer levels were independent risk factors associated with mortality risk. The results of⁴⁹ found seventeen categorical variables were associated with mortality among patients admitted to the ICU. These are: age, intubation, HIV positive status, arterial blood gas results, lactate, PF ratio, urea, neutrophil count, C-reactive protein (CRP), Procalcitonin (PCT), D-Dimer, NT-proBNP, troponin T, HbA1c, magnesium, aspartate transaminase (AST), and alkaline phosphatase (ALP). Lastly, in⁵⁰, the result showed that age (above 48 years), requiring intubation, HIV status, procalcitonin (PCT), Troponin T, Aspartate Aminotransferase (AST), and a low pH on admission were significant predictors of mortality.

Incidences of abnormal indicators of D-dimer, Blood pH, C-reactive protein (CRP), Ferritin, and TropT are usually associated with patients with comorbidities such as Diabetes, obesity, coronary artery disease, hypertension, atrial fibrillation, and chronic heart failure (CHF), cancer, previous angina, peripheral arterial disease, previous myocardial infarction (MI), chronic kidney disease (CKD), cerebrovascular disease, and chronic obstructive pulmonary disease (COPD), HIV and many more^51,52. Thus, our findings have identified some prognostic factors that strongly relate to comorbidities and the prediction of severity and outcomes of critically ill COVID-19 patients.

Study limitations

This study used a dataset of critically ill patients under intensive care in a public South African hospital. A limitation of this study was the limited size of the dataset. It was occasioned by the fact that South Africa had the highest number of intensive care admissions during the first wave of COVID-19 and, to a lesser extent, during the second wave, when vaccines were not available, and there was generally limited knowledge of COVID-19 preventive care strategies. After the first and second waves of COVID-19, ICU admissions in most hospitals dwindled nationwide (South Africa), which resulted in reduced access to data on intensive care patients. For this reason, the findings of this study have limited generalizability in that the performance of these ML models may vary when applied to different datasets or healthcare settings. Still, we consider our study’s findings profound and valuable for scholarship because they align with the findings from several clinical studies on the prognosis of COVID-19 mortality^40,41,42,43. Also, very few case studies pertaining to COVID-19 patients under intensive care from the African context have been reported in the literature.

Conclusions

In this study, we investigated mortality risk prediction using a dataset of COVID-19 patients admitted into a South African hospital's ICU during the first wave of COVID-19. We found that the six most critical variables associated with predicting the severity of COVID-19 in the patients were Length of stay in the hospital, Duration in ICU, Time to ICU from Admission, Days discharged alive or dead, D-dimer, and Blood pH. In addition, we also found that other features such as Age at admission, PaO2/FiO2 (the ratio of arterial oxygen partial pressure (PaO2 in mmHg) to fractional inspired oxygen (FiO2)), TropT, Ferritin, ventilation, C-reactive protein (CRP), and Symptom of Acute respiratory distress syndrome (ARDS). To do this, we compared the performance of three machine learning models, namely Extreme Gradient Boosting Trees (XGBoost), Support Vector Machines (SVM) and Deep Multi-Layer Perceptron (Deep MLP) when specific treatments: PCA, cross-validation and SMOTE were applied during the model training.

This study makes a theoretical contribution by providing insight into the effect of the combination of CV, PCA, and SMOTE on the performance and interpretability of MLP, XGBoost, and SVM, which has not been previously explored to the best of our knowledge. Also, the study demonstrates how predictive modelling can be applied to identify variables with prognostic value for predicting the severity and outcome of critically ill COVID-19 patients from a study setting in South Africa, which is not prevalent in the literature.

This study has practical implications. Firstly, it reveals how machine learning could be used to identify the critical variables with prognostic value for predicting the severity and mortality of critically ill COVID-19 patients with comorbidities. It can aid healthcare service delivery in Africa, where medical decision-support mechanisms that can lead to increased productivity and efficiency of medical practice are not common. Secondly, the study offers an intellectual guide to help machine learning practitioners make informed decisions during popular ML models like Deep MLP, SVM, and XGBoost. It provides practical insights on how cross-validation, PCA, and SMOTE could be applied during model training and the likely expected outcomes.

Since AI relies on complex mathematics and statistics, it is essential to ensure that the model produces accurate and clinically relevant results. Decision-tracing practices should be developed based on the computations behind the model for user implementation. In future work, we will look into the creation of explicit inference rules and the development of a decision support tool (web app) that can guide medical practitioners when dealing with critical cases of COVID-19. Also, we will explore the adaptation of such an approach to other infectious diseases.

Data availability

The code implementation can be shared on request. Peter Nyasulu (nyasulu@sun.ac.za) can be contacted for the data.

References

Zoabi, Y., Deri-Rozov, S. & Shomron, N. Machine learning-based prediction of COVID-19 diagnosis based on symptoms. NPJ Digit. Med. 4(1), 3 (2021).
Article PubMed PubMed Central MATH Google Scholar
Alazab, M., Awajan, A., Mesleh, A., Abraham, A., Jatana, V., & Alhyari, S. 2020. COVID-19 prediction and detection using deep learning. http://www.softcomputing.net/ijcisim_1.pdf. Accessed 13 Jan 2023.
Allwood, B. W. et al. Predicting COVID-19 outcomes from clinical and laboratory parameters in an intensive care facility during the second wave of the pandemic in South Africa. IJID Reg. 3, 242–247 (2022).
Article PubMed PubMed Central MATH Google Scholar
Muhammad, L. J. et al. Supervised machine learning models for prediction of COVID-19 infection using epidemiology dataset. SN Comput. Sci. 2(1), 11 (2021).
Article CAS PubMed MATH Google Scholar
Alakus, T. B. & Turkoglu, I. Comparison of deep learning approaches to predict COVID-19 infection. Chaos Solitons Fractals 140(110120), 110120 (2020).
Article MathSciNet PubMed PubMed Central MATH Google Scholar
Chowdhury, M. E. H. et al. An early warning tool for predicting mortality risk of COVID-19 patients using machine learning. Cognit. Comput. https://doi.org/10.1007/s12559-020-09812-7 (2021).
Article PubMed PubMed Central MATH Google Scholar
Domínguez-Olmedo, J. L., Gragera-Martínez, Á., Mata, J. & Pachón Álvarez, V. Machine learning applied to clinical laboratory data in Spain for COVID-19 outcome prediction: Model development and validation. J. Med. Internet Res. 23(4), e26211 (2021).
Article PubMed PubMed Central Google Scholar
Aljameel, S. S., Khan, I. U., Aslam, N., Aljabri, M. & Alsulmi, E. S. Machine learning-based model to predict the disease severity and outcome in COVID-19 patients. Sci. Program. 2021, 1–10 (2021).
Google Scholar
Sun, C., Hong, S., Song, M., Li, H. & Wang, Z. Predicting COVID-19 disease progression and patient outcomes based on temporal deep learning. BMC Med. Inform. Decis. Mak. 21(1), 45 (2021).
Article PubMed PubMed Central MATH Google Scholar
An, C. et al. Machine learning prediction for mortality of patients diagnosed with COVID-19: A nationwide Korean cohort study. Sci. Rep. 10(1), 18716 (2020).
Article ADS CAS PubMed PubMed Central MATH Google Scholar
Pourhomayoun, M. & Shakibi, M. Predicting mortality risk in patients with COVID-19 using machine learning to help medical decision-making. Smart Health 20(100178), 100178 (2021).
Article PubMed MATH Google Scholar
Subudhi, S. et al. Comparing machine learning algorithms for predicting ICU admission and mortality in COVID-19. NPJ Digit. Med. 4(1), 87 (2021).
Article PubMed PubMed Central MATH Google Scholar
Shahid, F., Zameer, A. & Muneeb, M. Predictions for COVID-19 with deep learning models of LSTM, GRU and Bi-LSTM. Chaos Solitons Fractals 140(110212), 110212 (2020).
Article MathSciNet PubMed PubMed Central Google Scholar
Assaf, D. et al. Utilization of machine-learning models to accurately predict the risk for critical COVID-19. Intern. Emerg. Med. 15(8), 1435–1443 (2020).
Article PubMed PubMed Central MATH Google Scholar
Davies, M. -A. HIV and risk of COVID-19 death: A population cohort study from the Western Cape Province, South Africa. medRxiv (2020).
Daramola, O. et al. Towards AI-enabled multimodal diagnostics and management of COVID-19 and comorbidities in resource-limited settings. Informatics (MDPI) 8(4), 63 (2021).
Article Google Scholar
Rustam, F. et al. COVID-19 future forecasting using supervised machine learning models. IEEE Access 8, 101489–101499 (2020).
Article MATH Google Scholar
Sarumi, O. A., Aouedi, O. & Muhammad, L. J. Potential of Deep Learning Algorithms in Mitigating the Spread of COVID-19. In Understanding COVID-19: The Role of Computational Intelligence Vol. 963 (eds Nayak, J. et al.) 225–244 (Springer International Publishing, Cham, 2022).
Chapter Google Scholar
Nemati, M., Ansary, J. & Nemati, N. Machine-learning approaches in COVID-19 survival analysis and discharge-time likelihood prediction using clinical data. Patterns 1(5), 100074 (2020).
Article CAS PubMed PubMed Central MATH Google Scholar
Das, A. K., Mishra, S. & Saraswathy Gopalan, S. Predicting CoVID-19 community mortality risk using machine learning and development of an online prognostic tool. PeerJ 8, e10083 (2020).
Article PubMed PubMed Central MATH Google Scholar
Zakariaee, S. S., Naderi, N., Ebrahimi, M. & Kazemi-Arpanahi, H. Comparing machine learning algorithms to predict COVID-19 mortality using a dataset including chest computed tomography severity score data. Sci. Rep. 13(1), 11343 (2023).
Article ADS CAS PubMed PubMed Central Google Scholar
Khadem, H., Nemat, H., Eissa, M. R., Elliott, J. & Benaissa, M. COVID-19 mortality risk assessments for individuals with and without diabetes mellitus: Machine learning models integrated with interpretation framework. Comput. Biol. Med. 144, 105361 (2022).
Article CAS PubMed PubMed Central Google Scholar
Mahanty, C., Kumar, R., Asteris, P. G. & Gandomi, A. H. COVID-19 patient detection based on fusion of transfer learning and fuzzy ensemble models using CXR images. Appl. Sci. 11(23), 11423 (2021).
Article CAS MATH Google Scholar
Mahanty, C., Kumar, R. & Patro, S. G. K. Internet of medical things-based COVID-19 detection in CT images fused with fuzzy ensemble and transfer learning models. N. Gener. Comput. 40(4), 1125–1141 (2022).
Article MATH Google Scholar
Abd Elaziz, M. et al. Boosting COVID-19 image classification using MobileNetV3 and aquila optimizer algorithm. Entropy 23(11), 1383 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Brownlee, J. Machine Learning Mastery with Python: Understand Your Data, Create Accurate Models, and Work Projects End-to-End (Machine Learning Mastery, 2016).
MATH Google Scholar
Allwood, B. W. et al. Clinical evolution, management and outcomes of patients with COVID-19 admitted at Tygerberg Hospital, Cape Town, South Africa: A research protocol. BMJ Open 10(8), e039455 (2020).
Article PubMed PubMed Central MATH Google Scholar
Lundberg, S. & Lee, S. -I. A unified approach to interpreting model predictions. arXiv [cs.AI] (2017).
Douglass, M. J. J. Book review: Hands-on machine learning with scikit-learn, keras, and tensorflow, 2nd edition by Aurélien géron. Phys. Eng. Sci. Med. 43(3), 1135–1136 (2020).
Lyashenko, V. & Jha, A. Cross-Validation in Machine Learning: How to Do It Right (Neptune Blog, 2021).
MATH Google Scholar
Olson, R. S., Cava, W. L., Mustahsan, Z., Varik, A. & Moore, J. H. Data-Driven Advice for Applying Machine Learning to Bioinformatics Problems. in Biocomputing 2018 (Kohala Coast, 2018).
Chen, T. & Guestrin, C. XGBoost, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco California USA, (2016).
Luckner, M., Topolski, B. & Mazurek, M. Application of XGBoost Algorithm in Fingerprinting Localisation Task. In Computer Information Systems and Industrial Management Vol. 10244 (eds Saeed, K. et al.) 661–671 (Springer, 2017).
Chapter MATH Google Scholar
Awad, M. & Khanna, R. 2015. Efficient learning machines: Theories, concepts, and applications for engineers and system designers. Available: https://library.oapen.org/bitstream/handle/20.500.12657/28170/1001824.pdf?sequen. Accessed 13 Jan 2023.
Daramola, O., Ajala, I. & Akinyemi, I. An experimental comparison of three machine learning techniques for web cost estimation. Int. J. Softw. Eng. Appl. 10(2), 191–206 (2016).
MATH Google Scholar
Allwright, S. What is a good F1 score and how do I interpret it? (2022).
Moulaei, K., Shanbehzadeh, M., Mohammadi-Taghiabad, Z. & Kazemi-Arpanahi, H. Comparing machine learning algorithms for predicting COVID-19 mortality. BMC Med. Inform. Decis. Mak. 22(1), 1–12 (2022).
Article Google Scholar
Reinstein, I. 2017. Xgboost, a top machine learning method on kaggle, explained. Disponível em: https://www.kdnuggets.com/2017/10/xgboost-top-machine-learningmethod-kaggle-explained.html. Accessed 01 Aug 2018
Dhaliwal, S., Nahid, A.-A. & Abbas, R. Effective intrusion detection system using XGBoost. Information 9(7), 149 (2018).
Article Google Scholar
Zhan, H. et al. Diagnostic value of D-dimer in COVID-19: A meta-analysis and meta-regression. Clin. Appl. Thromb. Hemost. 27, 10760296211010976 (2021).
Article CAS PubMed PubMed Central MATH Google Scholar
Zhao, R. et al. Associations of D-dimer on admission and clinical features of COVID-19 patients: A systematic review, meta-analysis, and meta-regression. Front. Immunol. 12, 691249 (2021).
Article CAS PubMed PubMed Central Google Scholar
Varikasuvu, S. R. et al. D-dimer, disease severity, and deaths (3D-study) in patients with COVID-19: A systematic review and meta-analysis of 100 studies. Sci. Rep. 11(1), 21888 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Mouhat, B. et al. Elevated D-dimers and lack of anticoagulation predict PE in severe COVID-19 patients. Eur. Respir. J. 56(4), 2001811 (2020).
Article CAS PubMed PubMed Central Google Scholar
Sartini, S. et al. Role of SatO2, PaO2/FiO2 ratio and PaO2 to predict adverse outcome in COVID-19: A retrospective, cohort study. Int. J. Environ. Res. Public Health 18(21), 11534 (2021).
Article CAS PubMed PubMed Central MATH Google Scholar
Kieninger, M. et al. Lower blood pH as a strong prognostic factor for fatal outcomes in critically ill COVID-19 patients at an intensive care unit: A multivariable analysis. PLoS One 16(9), e0258018 (2021).
Article CAS PubMed PubMed Central MATH Google Scholar
Wang, D. et al. Clinical characteristics of 138 hospitalized patients with 2019 novel Coronavirus-infected pneumonia in Wuhan, China. JAMA 323(11), 1061–1069 (2020).
Article CAS PubMed PubMed Central MATH Google Scholar
Zhou, F. et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: A retrospective cohort study. Lancet 395(10229), 1054–1062 (2020).
Article CAS PubMed PubMed Central MATH Google Scholar
Chapanduka, Z.C., Abdullah, I., Allwood, B., Koegelenberg, C.F., Irusen, E., Lalla, U., Zemlin, A.E., Masha, T.E., Erasmus, R.T., Jalavu, T.P. & Ngah, V.D. Haematological predictors of poor outcome among COVID-19 patients admitted to an intensive care unit of a tertiary hospital in South Africa. PloS one 17(11), p.e0275832. (2022).
Nyasulu, P.S. et al. Clinical characteristics associated with mortality of COVID-19 patients admitted to an intensive care unit of a tertiary hospital in South Africa. PloS one 17(12), p.e0279565. (2022)
Chimbunde, E., Sigwadhi, L. N., Tamuzi, J. L., Okango, E. L., Daramola, O., Ngah, V. D., & Nyasulu, P. S. Machine learning algorithms for predicting determinants of COVID-19 mortality in South Africa. Frontiers in Artificial Intelligence 6, 1171256. (2023).
Sundaram, V. et al. Impact of comorbidities on peak troponin levels and mortality in acute myocardial infarction. Heart 106(9), 677–685 (2020).
Article CAS PubMed MATH Google Scholar
Brojakowska, A. et al. Comorbidities, sequelae, blood biomarkers and their associated clinical outcomes in the Mount Sinai Health System COVID-19 patients. PLoS One 16(7), e0253660 (2021).
Article CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

The authors acknowledge the support of the management authorities at Cape Peninsula University of Technology (CPUT), and Stellenbosch University (SU) for creating an environment that enabled the execution of this study.

Funding

The research reported in this article was supported by South African Medical Research Council with funds received from National Treasury (under the project titled “Data Analytics for Clinical Guidance on Immune Response to COVID19 Vaccination for Patients with Comorbidities and Prior Infection”).

Author information

Authors and Affiliations

Department of Information Technology, Cape Peninsula University of Technology, Cape Town, South Africa
Olawande Daramola, Tatenda Duncan Kavu & Boniface Kabaso
Division of Epidemiology and Biostatistics, Faculty of Medicine, and Health Sciences, Stellenbosch University, Cape Town, South Africa
Isaac Fwemba, Fisayo Daramola, Martha Nyirenda & Peter S. Nyasulu
Division of Chemical Pathology, Department of Pathology, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
Maritha J. Kotze & Susan J. van Rensburg
Division of Chemical Pathology, Department of Pathology, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
Maritha J. Kotze
Applied Microbial and Health Biotechnology Institute, Cape Peninsula University of Technology, Cape Town, South Africa
Jeanine L. Marnewick
St Pölten University of Applied Sciences, St Pölten, Austria
Thomas Moser
School of Health Information Science, University of Victoria, Victoria, BC, Canada
Karl Stroetmann
Department of Mathematics and Computer Science, Philipps University of Marburg, Hans-Meerwein Str. 6 D-35032, Marburg, Germany
Oluwafemi A. Sarumi

Authors

Olawande Daramola
View author publications
You can also search for this author inPubMed Google Scholar
Tatenda Duncan Kavu
View author publications
You can also search for this author inPubMed Google Scholar
Maritha J. Kotze
View author publications
You can also search for this author inPubMed Google Scholar
Jeanine L. Marnewick
View author publications
You can also search for this author inPubMed Google Scholar
Oluwafemi A. Sarumi
View author publications
You can also search for this author inPubMed Google Scholar
Boniface Kabaso
View author publications
You can also search for this author inPubMed Google Scholar
Thomas Moser
View author publications
You can also search for this author inPubMed Google Scholar
Karl Stroetmann
View author publications
You can also search for this author inPubMed Google Scholar
Isaac Fwemba
View author publications
You can also search for this author inPubMed Google Scholar
Fisayo Daramola
View author publications
You can also search for this author inPubMed Google Scholar
Martha Nyirenda
View author publications
You can also search for this author inPubMed Google Scholar
Susan J. van Rensburg
View author publications
You can also search for this author inPubMed Google Scholar
Peter S. Nyasulu
View author publications
You can also search for this author inPubMed Google Scholar

Contributions

O.D., M.J.K and P.N., conceptualized the idea. O.D. was responsible for the research design. O.D, T.K. performed the code implementation, and experiments, and wrote the first draft of the paper. P.N., F.D., M.N collected the data. O.S drafted the literature review. All authors (O.D., T. K., M.J.K., J.V.M., O.S., B.K., T.M., K.S., I.F., F. D, M.N., S.J.R.,P.N.) provided a critical review of the manuscript and approved the final draft for publication. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Olawande Daramola.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Daramola, O., Kavu, T.D., Kotze, M.J. et al. Predictive modelling and identification of critical variables of mortality risk in COVID-19 patients. Sci Rep 15, 2184 (2025). https://doi.org/10.1038/s41598-023-46712-w

Download citation

Received: 18 February 2023
Accepted: 03 November 2023
Published: 16 January 2025
DOI: https://doi.org/10.1038/s41598-023-46712-w

Subjects

Abstract

Similar content being viewed by others

Machine learning based prediction models for the prognosis of COVID-19 patients with DKA

Multivariable mortality risk prediction using machine learning for COVID-19 patients at admission (AICOVID)

Machine learning-based prediction of in-ICU mortality in pneumonia patients

Introduction

Methodology

Study design and setting

Predictor and outcome variables

Data pre-processing

Principal component analysis (PCA)

Cross-validation

Models training

XGBoost

Deep MLP

SVM

Hyperparameters for training the ML models

Metrics for evaluation of the machine learning models

Ethics approval and consent to participate

Results

PCA without cross-validation

PCA with cross-validation

Non-PCA without cross-validation

Non-PCA with cross-validation

Effect of applying the synthetic minority oversampling technique (SMOTE)

Analysis of feature importance generated by ML models based on SHAP values

Discussion

Lessons from applying PCA and cross-validation

Lessons from COVID-19 mortality prediction

Study limitations

Conclusions

Data availability

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's note

Rights and permissions

About this article

Cite this article

Share this article

Search

Quick links