Article

A Machine Learning Approach to Predict the Rehabilitation Outcome in Convalescent COVID-19 Patients

1
Istituti Clinici Scientifici Maugeri IRCCS, Bioengineering Unit of Telese Terme Institute, 82037 Telese Terme, Italy
2
Department of Information Technology and Electrical Engineering, University of Naples “Federico II”, 80125 Naples, Italy
3
Istituti Clinici Scientifici Maugeri IRCCS, Cardiac Rehabilitation Unit of Telese Terme Institute, 82037 Telese Terme, Italy
4
Istituti Clinici Scientifici Maugeri IRCCS, Pulmonary Rehabilitation Unit of Telese Terme Institute, 82037 Telese Terme, Italy
*
Author to whom correspondence should be addressed.
The two Authors equally contributed to the manuscript and have been listed in alphabetic order.
The two Authors share co-seniorship and have been listed in alphabetic order.
J. Pers. Med. 2022, 12(3), 328; https://doi.org/10.3390/jpm12030328
Submission received: 3 February 2022 / Revised: 18 February 2022 / Accepted: 18 February 2022 / Published: 22 February 2022
(This article belongs to the Special Issue Personalized Medicine for Covid-19 Patients-Clinical Considerations)

Abstract

Background: After the acute disease, convalescent coronavirus disease 2019 (COVID-19) patients may experience several persistent manifestations that require multidisciplinary pulmonary rehabilitation (PR). By using a machine learning (ML) approach, we aimed to evaluate the clinical characteristics predicting the effectiveness of PR, expressed by an improved performance at the 6-min walking test (6MWT). Methods: Convalescent COVID-19 patients referred to a Pulmonary Rehabilitation Unit were consecutively screened. The 6MWT performance was partitioned into three classes, corresponding to different degrees of improvement (low, medium, and high) following PR. Multiclass supervised classification was performed with random forest (RF), adaptive boosting (ADA-B), and gradient boosting (GB) as tree-based algorithms and k-nearest neighbors (KNN) as an instance-based algorithm. Results: To train and validate our model, we included 189 convalescent COVID-19 patients (74.1% males, mean age 59.7 years). RF obtained the best results in terms of accuracy (83.7%), sensitivity (84.0%), and area under the ROC curve (94.5%), while ADA-B reached the highest specificity (92.7%). Conclusions: Our model performs well in predicting the rehabilitation outcome in convalescent COVID-19 patients.

1. Introduction

The coronavirus disease 2019 (COVID-19) is a syndrome with a number of clinical manifestations, ranging from mild symptoms to severe complications necessitating intensive care unit (ICU) admission [1]. After the acute disease, convalescent COVID-19 patients may experience several persistent symptoms, such as fatigue and muscular weakness [2], with a residual pulmonary impairment potentially lasting for months after a negative swab test [3]. Overall, given the high proportion of patients with such persistent manifestations, the new paradigm of a “post-acute COVID-19 syndrome” has been introduced [3]. Thus, the need for an early and multidisciplinary rehabilitation has been proposed [4,5,6,7]. Unfortunately, the effectiveness of this approach in the post-acute care setting is still to be determined, given the absence of a general consensus on rehabilitation programs and the lack of adequate prediction tools [8].
Among the functional outcome measures of pulmonary rehabilitation (PR), the 6-min walking test (6MWT) is widely accepted as an accurate and cost-effective method [9]. 6MWT is commonly used to measure physical activity and exercise capacity, correlating with both peak oxygen consumption and handgrip strength [10].
In recent years, the machine learning (ML) approach has been increasingly used, allowing researchers to implement algorithms that analyze datasets in order to predict the onset of a disease. Moreover, ML algorithms have been successfully used to predict rehabilitation outcomes in neurology [11], orthopaedics [12], and cardiology [13]. Recently, ML has also been used to find hidden patterns among patients affected by COVID-19, employing features extracted from X-ray and computed tomography (CT) images with good results [14].
While ML has been extensively employed as a means of triaging COVID-19 patients during the acute phase [15], no studies have used this approach to classify the factors influencing the rehabilitative outcome in the post-acute care setting.
Using the clinical characteristics of convalescent COVID-19 patients hospitalized for PR, the aim of our study was to develop a model predicting the effectiveness of multidisciplinary rehabilitation in terms of improved performance at the 6MWT.

2. Materials and Methods

2.1. Study Population

Convalescent COVID-19 patients referred to the Pulmonary Rehabilitation Unit of Istituti Clinici Scientifici Maugeri Spa SB, IRCCS of Telese Terme, Benevento, Italy, were consecutively evaluated for enrolment. Inclusion criteria were recent severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, as confirmed by reverse transcription polymerase chain reaction (RT-PCR); severe-to-critical COVID-19 according to World Health Organization (WHO) criteria; negative nasopharyngeal swab for SARS-CoV-2 in the past 2 months; and indication for in-hospital PR due to persistent clinical manifestations of COVID-19 after the acute phase. Exclusion criteria were age < 18 years, inability to understand the informed consent, or poor compliance with the study procedures in the investigator’s opinion. Patients with missing data for the outcome of interest were excluded from the study.
Whenever appropriate and applicable, this study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines [16]. The protocol was approved by the Institutional Review Board of “Istituto Nazionale Tumori, Fondazione Pascale”, Naples, Italy, with reference number ICS 11/20, and all patients provided written informed consent to use their de-identified data.

2.2. Data Collection and Analysis

After informed consent signature, the main demographic and clinical characteristics were collected in all included patients. All study procedures were performed at baseline and after conclusion of the PR program.
A blood gas analyzer (ABL 825® FLEX BGA, Radiometer Medical Aps, Copenhagen, Denmark) was used to measure arterial oxygen (PaO2) and carbon dioxide tension (PaCO2). Spirometry parameters, lung volumes, and diffusion capacity for carbon monoxide (DLCO) were measured by using automated equipment (Vmax® Encore, Vyasis Healthcare, Milan, Italy) according to American Thoracic Society/European Respiratory Society (ATS/ERS) guidelines [17,18]. Forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC), and DLCO were expressed both in liters (L) and percent of predicted values (FEV1%, FVC%, and DLCO%, respectively).
The Barthel score and the COPD Assessment Test (CAT) were calculated to determine the level of functioning and to monitor improvements in activities of daily living over time [19,20]. The 6MWT was also performed in accordance with the ATS/ERS guidelines [21]. The 6-min walking distance (6MWD) was reported in meters.
Since the 6MWD parameter is a good outcome measure in rehabilitation [22], a normalization was performed in order to obtain a class column to conduct ML analysis. For this purpose, all 6MWD values before and after rehabilitation were normalized in percentage depending on the theoretical maximum for each patient, as determined according to the ATS guidelines for the 6MWT [14] and considering age, sex, and body mass index (BMI), as follows:
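\[ 6\text{MWD}_{\text{Maximum\%\_before}} = \frac{6\text{MWD}_{\text{before}}}{6\text{MWD}_{\text{Maximum}}} \times 100 \]
\[ 6\text{MWD}_{\text{Maximum\%\_after}} = \frac{6\text{MWD}_{\text{after}}}{6\text{MWD}_{\text{Maximum}}} \times 100 \]
\[ \Delta 6\text{MWD} = 6\text{MWD}_{\text{Maximum\%\_after}} - 6\text{MWD}_{\text{Maximum\%\_before}} \]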
Consequently, Δ6MWD has been partitioned into three classes corresponding to different degrees of improvement following rehabilitation:
  • Class 0: low improvement, between 0 and 20%;
  • Class 1: medium improvement, between 20 and 40%;
  • Class 2: high improvement, over 40%.
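The normalization and class assignment can be illustrated with a minimal Python sketch. It assumes that Δ6MWD is the difference between the post- and pre-rehabilitation percentages of the theoretical maximum, that a value of exactly 20% falls in Class 1 and exactly 40% stays in Class 1, and that negative changes are grouped with Class 0; none of these boundary conventions is stated in the text. The function names and the 400 m theoretical maximum are hypothetical.

```python
def percent_of_maximum(distance_m, max_distance_m):
    """Express a 6MWD value as a percentage of the patient's theoretical maximum."""
    if max_distance_m <= 0:
        raise ValueError("theoretical maximum must be positive")
    return distance_m / max_distance_m * 100.0


def improvement_class(before_m, after_m, max_distance_m):
    """Return (delta, label), where delta is the change in percent of maximum.

    Boundary handling is an assumption: delta < 20 -> Class 0,
    20 <= delta <= 40 -> Class 1, delta > 40 -> Class 2.
    """
    delta = (percent_of_maximum(after_m, max_distance_m)
             - percent_of_maximum(before_m, max_distance_m))
    if delta < 20.0:
        label = 0
    elif delta <= 40.0:
        label = 1
    else:
        label = 2
    return delta, label


# Hypothetical patients sharing a theoretical maximum of 400 m.
for before, after in [(200, 250), (150, 300), (100, 300)]:
    delta, label = improvement_class(before, after, 400)
    print(f"{before} m -> {after} m: delta = {delta:.1f}%, class {label}")
```

Normalizing by the individual theoretical maximum means that the same absolute gain in meters can fall into different classes for patients of different age, sex, and BMI.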
The IBM SPSS Statistics V. 27.0 system (Chicago, IL, USA) was used to compare demographic and clinical features of patients before and after rehabilitation through a univariate statistical analysis. The normality of the data distribution was assessed with the Kolmogorov–Smirnov test. A t-test for paired samples was then performed for normally distributed data; otherwise, the Wilcoxon signed-rank test for paired samples was performed.

2.3. Pulmonary Rehabilitation Program

All enrolled patients underwent a 5-week PR program with daily sessions (6 sessions/week). Thus, a total of 30 sessions were planned, according to the official ATS/ERS guidelines [23]. PR consisted of physical exercise training, dietary counselling, and psychosocial counselling. Physical exercise training was the key point of the program, consisting of exercises to strengthen groups of muscles, treadmill walking, and stationary cycling. Lower- and upper-limb strengthening exercises were performed by using body and fixed weights at a load that could be supported for 8 to 10 repetitions before muscle exhaustion. Loads were increased when patients were able to complete 3 sets of 8–10 repetitions in two consecutive training sessions. Arm ergometry was planned for 10 min/session at an intensity of 3 or 4 on the Rating of Perceived Exertion (RPE) 0 to 10 scale [24]. Treadmill walking duration was 15 min at PR initiation and was progressed to 30 min during the first 2 weeks, reaching an RPE score of 3 to 4. The intensity of lower-limb cycling was set to elicit a dyspnea or perceived exertion score of 3 to 4 on the modified 0-to-10 category-ratio scale [24,25]. Patients also underwent flexibility and stretching exercises. A physiotherapist monitored and supervised participation.

2.4. Machine Learning Workflow

ML algorithms were implemented through the KNIME Analytics Platform (v. 4.2.1), which has already been used successfully in other biomedical studies [11,26,27]. In this study, multiclass supervised classification was performed with tree-based and instance-based algorithms.
Overall, 189 instances were recorded, and a set of 30 features was chosen for modelling. Among them, 19 were continuous attributes that did not require discretization, since they represented numerical clinical variables, and 11 were nominal attributes that were transformed into binary variables. As a preprocessing step, missing values were replaced with the rounded mean in numerical features and with the most frequent value in categorical features. Then the Synthetic Minority Oversampling Technique (SMOTE) was applied in order to reduce data imbalance between classes: this technique oversamples minority classes by introducing synthetic samples along the lines that join k-nearest neighbors of the same class [28]. In any case, the synthetic samples amounted to less than 50% of the entire dataset. The holdout method (70% training and 30% test) was used to train and validate random forest (RF), adaptive boosting (ADA-B), and gradient boosting (GB) as tree-based algorithms and k-nearest neighbors (KNN) as an instance-based algorithm.
Tree-based learning algorithms employ the decision-tree classifier, sorting instances down the tree from the root to a leaf node. These algorithms are well suited to discrete-valued target problems. Decision-tree learning methods usually perform well when instances are represented by attribute–value pairs, and they are robust to missing values and errors in the training data. For these reasons, tree-based algorithms are generally well suited to medical and clinical problems where a classification is required. RF is an ensemble learning algorithm based on the bagging technique, which combines the predictions of several trained models and returns the most voted result as output. Another key concept of RF is randomization: each tree is trained on a different random set of data and subset of features [29]. ADA-B and GB are two iterative ensemble learning algorithms that use the boosting principle. The former starts from a set of weak learners, usually decision stumps, and at each cycle it assigns different weights to incorrectly classified instances, building a strong learner [30]. The latter optimizes the weak learners' results according to the gradient descent criterion: each single model is trained by minimizing the cost function, which, in this case, is the mean square error.
In contrast, instance-based algorithms use the instances themselves to perform classification tasks, assuming that similar instances have similar classifications, and thus consider the most similar neighbors in terms of variables and attributes. Therefore, this method can be employed to explore a medical issue and to evaluate whether different phenotypes can be identified on the basis of their characteristics. KNN is an instance-based model and one of the simplest classification algorithms: it identifies similarity between samples by measuring their distances and then defines groups of k similar samples. In this study, a distance-weighted KNN was employed, so the contribution of each neighbor was weighted according to its distance to the query point, with closer neighbors weighted more heavily.
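The distance-weighted voting used by KNN can be sketched in a few lines of Python. This is an illustration only: the study used the KNIME implementation, and the inverse-distance weighting, the Euclidean metric, the function name `weighted_knn_predict`, and the toy data are all assumptions made for the example.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbors."""
    # Pair every training point's Euclidean distance to the query with its label.
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = defaultdict(float)
    for distance, label in neighbors[:k]:
        if distance == 0.0:
            return label  # an identical training instance decides the class
        votes[label] += 1.0 / distance  # closer neighbors weigh more
    return max(votes, key=votes.get)


# Toy data: one Class 0 point close to the query, two Class 2 points farther away.
X = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
y = [0, 2, 2]
print(weighted_knn_predict(X, y, (0.5, 0.5), k=3))
```

In the toy data an unweighted majority vote with k = 3 would pick Class 2, whereas the weighted vote favors the single, much closer Class 0 neighbor.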
Feature importance was computed with RF to identify the most relevant features in classification through Information Gain (IG). IG is an entropy-based feature evaluation method, which considers how much information a feature can provide and how much this feature can be used in the classification process in order to measure its importance. In RF, feature importance is estimated by looking at how much prediction error increases when data for a certain variable are permuted while the others are left unchanged [31]. Then the IG of all the features was normalized and transformed into percentage in order to express and compare the contribution of each feature to the prediction.
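As a rough illustration of the entropy-based part of this description, the following Python sketch computes the information gain of a single threshold split on one feature and rescales a set of scores to percentages. It does not reproduce the RF procedure used in the study (which, as noted above, also relies on permuting variables); the function names, the threshold split, and the toy values are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting a numeric feature at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    # Weighted entropy of the two child nodes (empty children contribute nothing).
    children = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - children


def normalize_to_percent(scores):
    """Rescale non-negative importance scores so that they sum to 100."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return {name: 100.0 * score / total for name, score in scores.items()}


labels = [0, 0, 1, 1]
print(information_gain([1, 2, 3, 4], labels, 2.5))  # split separates the classes
print(information_gain([1, 3, 2, 4], labels, 2.5))  # split mixes the classes
```

A feature whose split separates the classes perfectly recovers the full entropy of the labels, while a split that leaves both children as mixed as the parent yields no gain.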
Finally, the algorithms’ performances were evaluated through the following metrics, based on true negatives (TN), true positives (TP), total sample (TOT), false negatives (FN), and false positives (FP):
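\[ \text{Accuracy} = \frac{\text{TN} + \text{TP}}{\text{TOT}} \]
\[ \text{Sensitivity} = \frac{\text{TP}}{\text{TP} + \text{FN}} \]
\[ \text{Specificity} = \frac{\text{TN}}{\text{TN} + \text{FP}} \]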
AUROC = area under the receiver operating characteristics (ROC) curve, which plots sensitivity against 1 − specificity.
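For a three-class problem, these definitions require choosing a positive class. The Python sketch below uses a one-vs-rest reading, in which one class is treated as positive and the other two as negative; this is an assumption, since the text does not state how the per-class values were aggregated. The confusion matrix is invented for illustration and is not the one reported in Figure 2.

```python
def one_vs_rest_counts(cm, positive):
    """Return (TP, FN, FP, TN) for one class.

    cm[i][j] is the number of instances of actual class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[positive][positive]
    fn = sum(cm[positive]) - tp                  # actual positives that were missed
    fp = sum(row[positive] for row in cm) - tp   # negatives predicted as positive
    tn = total - tp - fn - fp
    return tp, fn, fp, tn


def class_metrics(cm, positive):
    """Accuracy, sensitivity, and specificity for the chosen positive class."""
    tp, fn, fp, tn = one_vs_rest_counts(cm, positive)
    return {
        "accuracy": (tn + tp) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical 3x3 confusion matrix (rows: actual class, columns: predicted class).
toy_cm = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
for cls in range(3):
    print(cls, class_metrics(toy_cm, cls))
```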

3. Results

Among 197 patients screened for eligibility, three (1.5%) were ineligible because of protocol adherence issues. Of the 194 eligible patients, two (1.0%) dropped out before completion of the project requirements, and three (1.5%) refused to sign the informed consent.
Therefore, the study population consisted of 189 convalescent COVID-19 patients (74.1% males, mean age 59.7 years). In Table 1, the baseline demographic and clinical characteristics pertaining to the acute phase of COVID-19 are reported.
As shown in Table 2, convalescent COVID-19 patients showed a significant improvement in the main pulmonary function parameters and exercise capacity after PR.
In detail, as compared to baseline, a significant increase in PaO2 was documented (p < 0.001). Moreover, an improvement in most spirometry parameters was reported at the end of the PR program, with FEV1% changing from 76.66% predicted ± 19.78 to 84.51% predicted ± 17.69 (p < 0.001) and FVC% from 74.34% predicted ± 19.82 to 81.73% predicted ± 16.77 (p < 0.001). Similarly, DLCO% and total lung capacity (TLC) significantly increased after PR, from 55.02% predicted ± 19.40 to 61.13% predicted ± 20.98 (p < 0.001), and from 4.58 L ± 1.35 to 5.82 L (p = 0.017), respectively. A significant and consistent improvement in exercise capacity was also documented at the end of the PR program, with 6MWD changing from 156.41 m ± 123.83 to 304.32 m ± 135.67 (p < 0.001). Finally, self-assessment measures of health status impairment (CAT) and functional limitation (Barthel score) also significantly improved after PR (p-value always < 0.05).

Machine Learning Analysis

The three classes of normalized Δ6MWD corresponding to different degrees of improvement after PR were the following:
  • Class 0, low improvement: 64 patients;
  • Class 1, medium improvement: 95 patients;
  • Class 2, high improvement: 30 patients.
The classes were oversampled through SMOTE, for a total sample size of 285 instances.
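The reported total is consistent with each minority class being oversampled up to the size of the largest class (3 × 95 = 285), although the text does not state this explicitly. Under that assumption, the Python sketch below works out how many synthetic samples each class would need and shows the interpolation step that SMOTE uses to create one synthetic point. The neighbor search is omitted, and the function names and feature values are hypothetical.

```python
import random


def synthetic_counts(class_sizes):
    """Synthetic samples needed per class to match the largest class (assumed target)."""
    target = max(class_sizes.values())
    return {label: target - size for label, size in class_sizes.items()}


def smote_point(sample, neighbor, rng):
    """Create one synthetic point on the segment joining a sample and a same-class neighbor."""
    gap = rng.random()  # position along the segment, drawn from [0, 1)
    return [s + gap * (n - s) for s, n in zip(sample, neighbor)]


sizes = {0: 64, 1: 95, 2: 30}          # class sizes reported in the text
needed = synthetic_counts(sizes)
total = sum(sizes.values()) + sum(needed.values())
share = sum(needed.values()) / total   # fraction of synthetic samples in the final set
print(needed, total, round(share, 3))

rng = random.Random(0)
# Two hypothetical Class 2 patients described by (age, FEV1%) pairs.
print(smote_point([60.0, 70.0], [64.0, 78.0], rng))
```

Under this reading, 96 synthetic samples would be generated, about one third of the final dataset, which agrees with the statement in the Methods that synthetic samples were less than 50% of the data.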
The feature set, composed of the variables reported in Table 1 and Table 2, was passed as input to the ML algorithms. The evaluation metrics for each algorithm are summarized in Table 3.
RF obtained the best results in terms of accuracy (83.7%), sensitivity (84.0%), and AUROC (94.5%), while ADA-B reached the highest specificity (92.7%). Figure 1 shows the ROC curve of the RF algorithm, with Class 0 being the positive class value.
Figure 2 shows the RF confusion matrix, which compares the predicted values and the actual values for each class, with the corresponding accuracies. Due to the well-balanced dataset, a high number of instances were correctly classified, as reported in the highlighted cells.
The 10 most important baseline features with the corresponding relative importance rankings are reported in Table 4.
Table 5 shows the distribution of these features among the three classes of improvement, with 6MWD, FEV1, FVC, FVC%, and PaO2 being significantly different between the three study groups (p always < 0.05).

4. Discussion

Based on an ML approach, the results of this study show the importance of some clinical and functional parameters in predicting the rehabilitation outcome in convalescent COVID-19 patients, expressed by an improved performance at 6MWT. Moreover, in line with previous evidence [32], our findings confirm the potential usefulness of multidisciplinary PR for COVID-19 patients in the post-acute care setting.
In this study, clinical and functional features of post-COVID-19 patients were explored through a univariate analysis and then employed as input for several ML algorithms in order to predict the percentage of improvement after rehabilitation. Our statistical analysis showed significant differences for the majority of functional and spirometry parameters before and after rehabilitation. The outcome measure identified for the ML analysis was the 6MWD, for which a normalization was performed. In detail, all the values were normalized depending on the theoretical maximum for each patient. Therefore, the evaluation was conducted on the basis of individual parameters and health status. After defining three ranges of improvement following rehabilitation, SMOTE was used to balance classes without altering the clinical significance of the dataset, and then ML algorithms were implemented.
Previous researchers applied a similar ML approach to COVID-19 patients with the aim of predicting mortality and stratifying risks correlated to comorbidities. For example, Gao et al. presented a prediction model trained and validated in over 2000 participants to stratify patients by mortality risk, using their clinical data on admission and obtaining an AUROC of 96.2% [33]. On the other hand, Hajifathalian et al. developed a prediction model to assess short-term mortality risk among hospitalized COVID-19 patients, based on patient age, hypoxia severity, mean arterial pressure, and presence of kidney dysfunction. This model exhibited a similar performance in both internal (AUROC: 86.0%) and external validation (AUROC: 86.0%) [34]. Several other studies used ML to predict mortality [35] but also to evaluate the necessity of oxygen supplementation [36], to monitor pandemic-related psychopathology [37], to identify vaccine-related adverse events from Twitter data [38], and even to diagnose COVID-19 from cough audio signals [39]. However, to the best of our knowledge, no previous study used ML to predict the rehabilitation outcome in the post-acute phase of COVID-19.
Therefore, our study was the first specifically focusing on rehabilitation. The ML analysis was aimed at implementing algorithms that are able to predict clinical and functional improvements, achieving good results in terms of accuracy (83.7%) and AUROC (94.5%). A relevant result of our model was the importance of certain spirometry and functional parameters as leading features in predicting the rehabilitative outcome in post-COVID-19 patients, expressed by an improved performance at the 6MWT. Beyond the baseline 6MWD, it is interesting to highlight the relevance of a number of pulmonary function parameters, including FEV1, FVC, DLCO, and TLC, potentially indicating restriction [40,41]. In detail, the strong interrelationship between FVC and lung volumes is one of the elements allowing us to clarify the nature of the lung damage, thus confirming the restrictive nature of the residual pulmonary involvement in our study population. Therefore, the fact that FVC and TLC are among the parameters that contribute the most to the prediction of the rehabilitation outcome suggests that, in line with previous evidence [32,42], there is a strong influence of the residual restrictive pattern on the disabling manifestations and the possibility of recovery after the acute phase. Accordingly, the role of DLCO as a main discriminating feature for the rehabilitation outcome emerges from our model, also suggesting the importance of interstitial or pulmonary vascular abnormalities in predicting the response to rehabilitation. Moreover, while the key prognostic role of some demographic variables is well established in COVID-19 [43], our prediction model further confirms the importance of age. Another noteworthy point is the presence of the CAT score as one of the features with greater relative importance in our model, as it suggests the key role of the self-assessment of health status improvement in predicting the rehabilitation outcome.
Of interest, although a significant difference between the three classes of improvement was observed for some, but not all, features, our findings indicate greater room for improvement among patients with greater functional limitations at baseline. This is partially in contrast with previous evidence on chronic obstructive pulmonary disease (COPD), showing that patients with COPD may benefit from PR regardless of disease severity [44]. In another study on 80 patients, in stark contrast to our results, 6MWD was directly correlated with the baseline FEV1, PaO2, and 6MWD [45]. Accordingly, Berry et al. previously documented an average increase in 6MWD after PR of 61.2, 72.7, and 34.2 m in mild, moderate, and severe COPD, respectively [46]. In our study on COVID-19 survivors, the fact that patients in Class 2 presented a significantly lower functional status at baseline, as expressed by lower 6MWD and pulmonary parameters, indicates a negative association of most features chosen for modeling with the rehabilitation outcome. This apparently contrasting result may depend on the different nature of the disease, obstructive for COPD and mainly restrictive for COVID-19, on the different etiology, and, most importantly, on the different disease duration and course. While COPD is a chronic progressive disease, the current literature data suggest that most COVID-19 survivors may substantially improve their functional status [32,43], particularly following a rehabilitation program, with most patients showing no computed tomography abnormalities 1 year after the acute phase [47], although this can often require a long time. This possibility of a consistent functional improvement may at least in part explain the lower baseline functional status among patients with a higher degree of 6MWD improvement after PR.
Some potential limitations of our study should be addressed. Patients included in our protocol were all local residents from the Campania Region in Italy. This could reduce the predictive value of our model, which would therefore need to be validated on other populations/ethnic groups. Moreover, the relatively low number of participants in our study suggests the need for further prospective studies on a larger sample in order to clarify which features chosen for modeling may have a positive or negative association with the rehabilitation outcome. Finally, since COVID-19 is proven to be a systemic disease that may also harm the cardiovascular and neurologic systems, more clinical data on the cardiovascular, neurological, and musculoskeletal systems would help the assessment.

5. Conclusions

In addition to suggesting and confirming the favorable effect of rehabilitation on a range of functional parameters after the acute phase of COVID-19, our results support the importance of some clinical and demographic variables in predicting the rehabilitation outcome. Our model, despite needing further validation in larger external populations, could effectively assist clinicians in defining more personalized rehabilitation programs.

Author Contributions

Conceptualization, S.A. and P.A.; methodology, S.A., C.R., M.C. and G.d.; formal analysis, C.R., M.A. and M.M. (Marco Mosella); investigation, P.A., M.A., M.M. (Marco Mosella) and M.M. (Mauro Maniscalco); resources, C.R., M.A. and M.M. (Marco Mosella); data curation, P.A. and M.M. (Mauro Maniscalco); writing—original draft preparation, S.A., P.A., C.R., M.C., G.d. and M.M. (Mauro Maniscalco); writing—review and editing, all authors; supervision, G.d. and M.M. (Mauro Maniscalco).; project administration, C.R., M.C., G.d. and M.M. (Mauro Maniscalco); S.A. and P.A. are co-first authors; G.d. and M.M. (Mauro Maniscalco) share co-seniorship. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the “Ricerca Corrente” funding scheme of the Ministry of Health, Italy.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board of “Istituto Nazionale Tumori, Fondazione Pascale”, Naples, Italy, with reference number ICS 11/20.

Informed Consent Statement

All patients provided written informed consent to use their de-identified data.

Data Availability Statement

The data are available upon request to the corresponding author.

Acknowledgments

We thank Anna Ciullo for technical support. Anna Ciullo provided written permission to be included in the acknowledgments section of this manuscript. Sarah Adamo wishes to thank the Gruppo per l’Armonizzazione delle Reti della Ricerca (GARR) for her research grant.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Huang, C.; Wang, Y.; Li, X.; Ren, L.; Zhao, J.; Hu, Y.; Zhang, L.; Fan, G.; Xu, J.; Gu, X.; et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020, 395, 497–506.
  2. Kamal, M.; Abo Omirah, M.; Hussein, A.; Saeed, H. Assessment and characterisation of post-COVID-19 manifestations. Int. J. Clin. Pract. 2021, 75, e13746.
  3. Amdal, C.D.; Pe, M.; Falk, R.S.; Piccinin, C.; Bottomley, A.; Arraras, J.I.; Darlington, A.S.; Hofso, K.; Holzner, B.; Jorgensen, N.M.H.; et al. Health-related quality of life issues, including symptoms, in patients with active COVID-19 or post COVID-19; a systematic literature review. Qual. Life Res. 2021, 30, 3367–3381.
  4. Ambrosino, P.; Fuschillo, S.; Papa, A.; Di Minno, M.N.D.; Maniscalco, M. Exergaming as a Supportive Tool for Home-Based Rehabilitation in the COVID-19 Pandemic Era. Games Health J. 2020, 9, 311–313.
  5. Gloeckl, R.; Leitl, D.; Jarosch, I.; Schneeberger, T.; Nell, C.; Stenzel, N.; Vogelmeier, C.F.; Kenn, K.; Koczulla, A.R. Benefits of pulmonary rehabilitation in COVID-19: A prospective observational cohort study. ERJ Open Res. 2021, 7, 00108.
  6. Buckley, B.; Harrison, S.L.; Fazio-Eynullayeva, E.; Underhill, P.; Jones, I.D.; Williams, N.; Lip, G. Exercise rehabilitation associates with lower mortality and hospitalisation in cardiovascular disease patients with COVID-19. Eur. J. Prev. Cardiol. 2021, 29, e32–e34.
  7. Spruit, M.A.; Holland, A.E.; Singh, S.J.; Tonia, T.; Wilson, K.C.; Troosters, T. COVID-19: Interim Guidance on Rehabilitation in the Hospital and Post-Hospital Phase from a European Respiratory Society and American Thoracic Society-coordinated International Task Force. Eur. Respir. J. 2020, 56, 2002197.
  8. Demeco, A.; Marotta, N.; Barletta, M.; Pino, I.; Marinaro, C.; Petraroli, A.; Moggio, L.; Ammendolia, A. Rehabilitation of patients post-COVID-19 infection: A literature review. J. Int. Med. Res. 2020, 48, 300060520948382.
  9. Solway, S.; Brooks, D.; Lacasse, Y.; Thomas, S. A qualitative systematic overview of the measurement properties of functional walk tests used in the cardiorespiratory domain. Chest 2001, 119, 256–270.
  10. Zhang, Q.; Lu, H.; Pan, S.; Lin, Y.; Zhou, K.; Wang, L. 6MWT Performance and its Correlations with VO(2) and Handgrip Strength in Home-Dwelling Mid-Aged and Older Chinese. Int. J. Environ. Res. Public Health 2017, 14, 473.
  11. Scrutinio, D.; Ricciardi, C.; Donisi, L.; Losavio, E.; Battista, P.; Guida, P.; Cesarelli, M.; Pagano, G.; D’Addio, G. Machine learning to predict mortality after rehabilitation among patients with severe stroke. Sci. Rep. 2020, 10, 20127.
  12. Fontana, M.A.; Lyman, S.; Sarker, G.K.; Padgett, D.E.; MacLean, C.H. Can Machine Learning Algorithms Predict Which Patients Will Achieve Minimally Clinically Important Differences From Total Joint Arthroplasty? Clin. Orthop. Relat. Res. 2019, 477, 1267–1279.
  13. Inan, O.T.; Baran Pouyan, M.; Javaid, A.Q.; Dowling, S.; Etemadi, M.; Dorier, A.; Heller, J.A.; Bicen, A.O.; Roy, S.; De Marco, T.; et al. Novel Wearable Seismocardiography and Machine Learning Algorithms Can Assess Clinical Status of Heart Failure Patients. Circ. Heart Fail. 2018, 11, e004313.
  14. Kassani, S.H.; Kassani, P.H.; Wesolowski, M.J.; Schneider, K.A.; Deters, R. Automatic Detection of Coronavirus Disease (COVID-19) in X-ray and CT Images: A Machine Learning Based Approach. Biocybern. Biomed. Eng. 2021, 41, 867–879.
  15. Nguyen, S.; Chan, R.; Cadena, J.; Soper, B.; Kiszka, P.; Womack, L.; Work, M.; Duggan, J.M.; Haller, S.T.; Hanrahan, J.A.; et al. Budget constrained machine learning for early prediction of adverse outcomes for COVID-19 patients. Sci. Rep. 2021, 11, 19543.
  16. von Elm, E.; Altman, D.G.; Egger, M.; Pocock, S.J.; Gotzsche, P.C.; Vandenbroucke, J.P.; STROBE Initiative. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: Guidelines for reporting observational studies. Ann. Intern. Med. 2007, 147, 573–577.
  17. Laszlo, G. Standardisation of lung function testing: Helpful guidance from the ATS/ERS Task Force. Thorax 2006, 61, 744–746.
  18. Macintyre, N.; Crapo, R.O.; Viegi, G.; Johnson, D.C.; van der Grinten, C.P.; Brusasco, V.; Burgos, F.; Casaburi, R.; Coates, A.; Enright, P.; et al. Standardisation of the single-breath determination of carbon monoxide uptake in the lung. Eur. Respir. J. 2005, 26, 720–735.
  19. Collin, C.; Wade, D.T.; Davies, S.; Horne, V. The Barthel ADL Index: A reliability study. Int. Disabil. Stud. 1988, 10, 61–63.
  20. Karloh, M.; Fleig Mayer, A.; Maurici, R.; Pizzichini, M.M.M.; Jones, P.W.; Pizzichini, E. The COPD Assessment Test: What Do We Know So Far?: A Systematic Review and Meta-Analysis About Clinical Outcomes Prediction and Classification of Patients Into GOLD Stages. Chest 2016, 149, 413–425.
  21. Holland, A.E.; Spruit, M.A.; Troosters, T.; Puhan, M.A.; Pepin, V.; Saey, D.; McCormack, M.C.; Carlin, B.W.; Sciurba, F.C.; Pitta, F.; et al. An official European Respiratory Society/American Thoracic Society technical standard: Field walking tests in chronic respiratory disease. Eur. Respir. J. 2014, 44, 1428–1446.
  22. ATS Committee on Proficiency Standards for Clinical Pulmonary Function Laboratories. ATS statement: Guidelines for the six-minute walk test. Am. J. Respir. Crit. Care Med. 2002, 166, 111–117.
  23. Rochester, C.L.; Vogiatzis, I.; Holland, A.E.; Lareau, S.C.; Marciniuk, D.D.; Puhan, M.A.; Spruit, M.A.; Masefield, S.; Casaburi, R.; Clini, E.M.; et al. An Official American Thoracic Society/European Respiratory Society Policy Statement: Enhancing Implementation, Use, and Delivery of Pulmonary Rehabilitation. Am. J. Respir. Crit. Care Med. 2015, 192, 1373–1386.
  24. Borg, G.A. Psychophysical bases of perceived exertion. Med. Sci. Sports Exerc. 1982, 14, 377–381.
  25. Zainuldin, R.; Mackey, M.G.; Alison, J.A. Prescribing Cycle Exercise Intensity Using Moderate Symptom Levels in Chronic Obstructive Pulmonary Disease. J. Cardiopulm. Rehabil. Prev. 2016, 36, 195–202. [Google Scholar] [CrossRef] [PubMed]
  26. Ricciardi, C.; Valente, A.S.; Edmund, K.; Cantoni, V.; Green, R.; Fiorillo, A.; Picone, I.; Santini, S.; Cesarelli, M. Linear discriminant analysis and principal component analysis to predict coronary artery disease. Health Inform. J. 2020, 26, 2181–2192. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  27. Stanzione, A.; Ricciardi, C.; Cuocolo, R.; Romeo, V.; Petrone, J.; Sarnataro, M.; Mainenti, P.P.; Improta, G.; De Rosa, F.; Insabato, L.; et al. MRI Radiomics for the Prediction of Fuhrman Grade in Clear Cell Renal Cell Carcinoma: A Machine Learning Exploratory Study. J. Digit. Imaging 2020, 33, 879–887. [Google Scholar] [CrossRef]
  28. Nakamura, M.; Kajiwara, Y.; Otsuka, A.; Kimura, H. LVQ-SMOTE-Learning Vector Quantization based Synthetic Minority Over-sampling Technique for biomedical data. BioData Min. 2013, 6, 16. [Google Scholar] [CrossRef] [Green Version]
  29. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
  30. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139. [Google Scholar] [CrossRef] [Green Version]
  31. Lei, S. A Feature Selection Method Based on Information Gain and Genetic Algorithm. In Proceedings of the International Conference on Computer Science and Electronics Engineering, Hangzhou, China, 23–25 March 2012; pp. 355–358. [Google Scholar]
  32. Ambrosino, P.; Molino, A.; Calcaterra, I.; Formisano, R.; Stufano, S.; Spedicato, G.A.; Motta, A.; Papa, A.; Di Minno, M.N.D.; Maniscalco, M. Clinical Assessment of Endothelial Function in Convalescent COVID-19 Patients Undergoing Multidisciplinary Pulmonary Rehabilitation. Biomedicines 2021, 9, 614. [Google Scholar] [CrossRef] [PubMed]
  33. Gao, Y.; Cai, G.Y.; Fang, W.; Li, H.Y.; Wang, S.Y.; Chen, L.; Yu, Y.; Liu, D.; Xu, S.; Cui, P.F.; et al. Machine learning based early warning system enables accurate mortality risk prediction for COVID-19. Nat. Commun. 2020, 11, 5033. [Google Scholar] [CrossRef] [PubMed]
  34. Hajifathalian, K.; Sharaiha, R.Z.; Kumar, S.; Krisko, T.; Skaf, D.; Ang, B.; Redd, W.D.; Zhou, J.C.; Hathorn, K.E.; McCarty, T.R.; et al. Development and external validation of a prediction risk model for short-term mortality among hospitalized U.S. COVID-19 patients: A proposal for the COVID-AID risk tool. PLoS ONE 2020, 15, e0239536. [Google Scholar] [CrossRef] [PubMed]
  35. Jamshidi, E.; Asgary, A.; Tavakoli, N.; Zali, A.; Setareh, S.; Esmaily, H.; Jamaldini, S.H.; Daaee, A.; Babajani, A.; Sendani Kashi, M.A.; et al. Using Machine Learning to Predict Mortality for COVID-19 Patients on Day 0 in the ICU. Front Digit Health 2022, 3, 681608. [Google Scholar] [CrossRef]
  36. Saadatmand, S.; Salimifard, K.; Mohammadi, R.; Marzban, M.; Naghibzadeh-Tahami, A. Predicting the necessity of oxygen therapy in the early stage of COVID-19 using machine learning. Med. Biol. Eng. Comput. 2022, 1–12. [Google Scholar] [CrossRef]
  37. Enevoldsen, K.C.; Danielsen, A.A.; Rohde, C.; Jefsen, O.H.; Nielbo, K.L.; Østergaard, S.D. Monitoring of COVID-19 Pandemic-related Psychopathology using Machine Learning. Acta Neuropsychiatr. 2022, 1–14. [Google Scholar] [CrossRef]
  38. Lian, A.T.; Du, J.; Tang, L. Using a Machine Learning Approach to Monitor COVID-19 Vaccine Adverse Events (VAE) from Twitter Data. Vaccines (Basel) 2022, 10, 103. [Google Scholar] [CrossRef]
  39. Hemdan, E.E.; El-Shafai, W.; Sayed, A. CR19: A framework for preliminary detection of COVID-19 in cough audio signals using machine learning algorithms for automated medical diagnosis applications. J. Ambient. Intell. Humaniz. Comput. 2022, 1–13. [Google Scholar] [CrossRef]
  40. Ruppel, G.L. What is the clinical value of lung volumes? Respir. Care 2012, 57, 26–35. [Google Scholar] [CrossRef] [Green Version]
  41. Owens, M.W.; Kinasewitz, G.T.; Anderson, W.M. Clinical significance of an isolated reduction in residual volume. Am. Rev. Respir. Dis. 1987, 136, 1377–1380. [Google Scholar] [CrossRef]
  42. Ambrosino, P.; Calcaterra, I.; Molino, A.; Moretta, P.; Lupoli, R.; Spedicato, G.A.; Papa, A.; Motta, A.; Maniscalco, M.; Di Minno, M.N.D. Persistent Endothelial Dysfunction in Post-Acute COVID-19 Syndrome: A Case-Control Study. Biomedicines 2021, 9, 957. [Google Scholar] [CrossRef] [PubMed]
  43. Ambrosino, P.; Papa, A.; Maniscalco, M.; Di Minno, M.N.D. COVID-19 and functional disability: Current insights and rehabilitation strategies. Postgrad Med. J. 2021, 97, 469–470. [Google Scholar] [CrossRef] [PubMed]
  44. Takigawa, N.; Tada, A.; Soda, R.; Takahashi, S.; Kawata, N.; Shibayama, T.; Matsumoto, H.; Hamada, N.; Hirano, A.; Kimura, G.; et al. Comprehensive pulmonary rehabilitation according to severity of COPD. Respir. Med. 2007, 101, 326–332. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  45. Shehata, S.M.R.M.; Al Gabry, M.M.; Nafae, R.M. Outcome of pulmonary rehabilitation in patients with stable chronic obstructive pulmonary disease at Chest Department, Zagazig University Hospitals (2014–2016). Egypt. J. Bronchol. 2018, 12, 279–287. [Google Scholar] [CrossRef]
  46. Berry, M.J.; Rejeski, W.J.; Adair, N.E.; Zaccaro, D. Exercise rehabilitation and chronic obstructive pulmonary disease stage. Am. J. Respir. Crit. Care Med. 1999, 160, 1248–1253. [Google Scholar] [CrossRef] [PubMed]
  47. Pan, F.; Yang, L.; Liang, B.; Ye, T.; Li, L.; Li, L.; Liu, D.; Wang, J.; Hesketh, R.L.; Zheng, C. Chest CT Patterns from Diagnosis to 1 Year of Follow-up in COVID-19. Radiology 2021, 211199. [Google Scholar] [CrossRef]
Figure 1. Receiver operating characteristic (ROC) curve of the RF algorithm (blue line). The black line corresponds to an area under the ROC curve of 0.5, the threshold above which the model is considered better than random guessing.
Figure 2. Confusion matrix of random forest (RF) algorithm.
Table 1. Baseline demographic and clinical characteristics of post-acute COVID-19 patients.
| Characteristic | Value |
|---|---|
| Patients, N | 189 |
| Age, years | 59.7 ± 10.4 |
| Female, N | 49 |
| Smokers, N | 14 |
| BMI, kg/m² | 29.1 ± 6.1 |
| Hospitalization length, days | 17.6 ± 15.2 |
| Days from a negative swab | 22.6 ± 17.8 |
| High flow oxygen, N | 42 |
| Mechanical ventilation, N | 47 |
| Hypertension, N | 86 |
| Hypercholesterolemia, N | 18 |
| Hypertriglyceridemia, N | 12 |
| Diabetes, N | 32 |
| Heart failure, N | 18 |
| Atrial fibrillation, N | 5 |
| History of stroke/TIA, N | 4 |
BMI, body mass index; TIA, transient ischemic attack.
Table 2. Main clinical features and pulmonary function tests before and after pulmonary rehabilitation (PR) in 189 post-acute COVID-19 patients.
| Parameter | Before PR | After PR | p-Value |
|---|---|---|---|
| PaO2, mmHg | 73.48 ± 14.98 | 80.91 ± 14.20 | <0.001 |
| PaCO2, mmHg | 36.18 ± 5.37 | 36.94 ± 3.64 | 0.002 |
| pH | 7.45 ± 0.05 | 7.43 ± 0.04 | <0.001 |
| FEV1, L | 2.34 ± 0.76 | 2.65 ± 0.75 | <0.001 |
| FEV1%, % predicted | 76.66 ± 19.78 | 84.51 ± 17.69 | <0.001 |
| FVC, L | 2.84 ± 0.96 | 3.19 ± 0.90 | <0.001 |
| FVC%, % predicted | 74.34 ± 19.82 | 81.73 ± 16.77 | <0.001 |
| FEV1/FVC | 81.88 ± 9.70 | 81.15 ± 9.52 | <0.001 |
| RV, L | 1.36 ± 0.73 | 1.43 ± 0.86 | 0.123 |
| TLC, L | 4.58 ± 1.35 | 5.82 ± 1.27 | 0.017 |
| DLCO, mL/min/mmHg | 10.71 ± 7.43 | 10.17 ± 8.15 | 0.002 |
| DLCO%, % predicted | 55.02 ± 19.40 | 61.13 ± 20.98 | <0.001 |
| 6MWD, meters | 156.41 ± 123.83 | 304.32 ± 135.67 | <0.001 |
| CAT | 26.68 ± 3.25 | 9.51 ± 4.66 | <0.001 |
| Barthel | 67.96 ± 29.68 | 94.34 ± 13.10 | <0.001 |
PaO2, arterial oxygen tension; PaCO2, arterial carbon dioxide tension; pH, power of hydrogen; FEV1, forced expiratory volume in 1 s; FVC, forced vital capacity; RV, residual volume; TLC, total lung capacity; DLCO, diffusing lung capacity for carbon monoxide; 6MWD, 6-min walking distance; CAT, COPD Assessment Test. Data are presented as mean ± standard deviation unless otherwise indicated.
Table 3. Evaluation metrics for each algorithm.
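| Algorithm | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUROC (%) |
|---|---|---|---|---|
| RF | 83.7 | 84.0 | 91.8 | 94.5 |
| ADA-B | 81.4 | 71.0 | 92.7 | 88.5 |
| GB | 79.1 | 71.0 | 87.3 | 84.6 |
| KNN | 80.2 | 74.2 | 89.1 | 93.4 |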
AUROC, area under the receiver operating characteristic curve; RF, random forest; ADA-B, adaptive boosting; GB, gradient boosting; KNN, k-nearest neighbors.
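As a minimal sketch of how metrics of this kind can be obtained for a three-class problem, the Python snippet below derives accuracy and macro-averaged sensitivity and specificity from a confusion matrix, treating each class one-vs-rest. The matrix values and the function name `macro_metrics` are hypothetical, and macro-averaging is an assumption: the table does not state how per-class values were aggregated.

```python
from typing import List, Tuple


def macro_metrics(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (accuracy, macro sensitivity, macro specificity).

    cm[i][j] is the number of samples of true class i predicted as class j.
    Assumes every class has at least one true sample and one negative sample.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n_classes))

    sensitivities, specificities = [], []
    for k in range(n_classes):
        tp = cm[k][k]                                       # class k predicted as k
        fn = sum(cm[k]) - tp                                # class k predicted as another class
        fp = sum(cm[i][k] for i in range(n_classes)) - tp   # other classes predicted as k
        tn = total - tp - fn - fp                           # everything else
        sensitivities.append(tp / (tp + fn))
        specificities.append(tn / (tn + fp))

    return (
        correct / total,
        sum(sensitivities) / n_classes,
        sum(specificities) / n_classes,
    )


# Hypothetical 3-class confusion matrix (NOT the study's data).
example_cm = [
    [8, 2, 0],
    [1, 8, 1],
    [0, 2, 8],
]
accuracy, sensitivity, specificity = macro_metrics(example_cm)
print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

Other aggregation choices (for example, weighting each class by its size) would give different values on imbalanced classes, which is why the averaging scheme is worth stating alongside such a table.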
Table 4. Information gain (IG) of the 10 most important features chosen for modeling, normalized and expressed as a percentage.
| Feature | IG |
|---|---|
| 6MWD, meters | 10.62% |
| DLCO%, % predicted | 6.25% |
| FVC, L | 5.85% |
| DLCO, mL/min/mmHg | 5.09% |
| FEV1, L | 4.68% |
| PaO2, mmHg | 4.67% |
| TLC, L | 4.59% |
| CAT | 4.57% |
| Age, years | 4.53% |
| FVC%, % predicted | 4.41% |
6MWD, 6-min walking distance; DLCO, diffusing lung capacity for carbon monoxide; FVC, forced vital capacity; FEV1, forced expiratory volume in 1 s; PaO2, arterial oxygen tension; TLC, total lung capacity.
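To illustrate what an information gain ranking measures, the following Python sketch computes the IG of a discretized feature with respect to a class label as the reduction in Shannon entropy, then rescales a set of IG scores so that they sum to 100%. The toy data, the helper names `entropy`, `information_gain`, and `normalize_to_percent`, and the normalization by the sum of scores are assumptions for illustration; the table does not specify the discretization or normalization actually used.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """IG = H(labels) - H(labels | feature), for a categorical/binned feature."""
    n = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def normalize_to_percent(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores so they sum to 100 (assumes at least one non-zero score)."""
    total = sum(scores.values())
    return {name: 100.0 * s / total for name, s in scores.items()}


# Toy data (NOT the study's data): two classes, two binned features.
labels = [0, 0, 1, 1]
informative = ["low", "low", "high", "high"]  # separates the classes perfectly
uninformative = ["low", "high", "low", "high"]  # carries no class information

ig_informative = information_gain(informative, labels)
ig_uninformative = information_gain(uninformative, labels)
percent = normalize_to_percent(
    {"informative": ig_informative, "uninformative": ig_uninformative}
)
print(percent)
```

Continuous features such as 6MWD would need to be binned before a computation like this, and the choice of bins affects the resulting ranking.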
Table 5. Comparisons among the three classes of improvement following PR, according to the 10 most important features.
| Features | Group 0 (n = 64) | Group 1 (n = 95) | Group 2 (n = 30) | p-Value |
|---|---|---|---|---|
| 6MWD, meters | 193.13 ± 131.77 | 171.20 ± 90.16 | 31.10 ± 56.69 | <0.001 |
| DLCO%, % predicted | 55.70 ± 15.63 | 56.27 ± 12.82 | 49.97 ± 11.58 | 0.230 |
| FVC, L | 2.97 ± 0.62 | 2.92 ± 0.85 | 2.42 ± 0.65 | 0.001 |
| DLCO, mL/min/mmHg | 11.73 ± 6.77 | 10.97 ± 5.19 | 10.47 ± 4.11 | 0.682 |
| FEV1, L | 2.44 ± 0.52 | 2.34 ± 0.67 | 2.03 ± 0.58 | 0.003 |
| PaO2, mmHg | 75.19 ± 13.12 | 72.89 ± 12.33 | 65.03 ± 13.32 | 0.005 |
| TLC, L | 4.51 ± 0.85 | 4.71 ± 1.02 | 4.53 ± 1.11 | 0.736 |
| CAT | 26.92 ± 2.53 | 26.61 ± 1.36 | 27.00 ± 3.53 | 0.163 |
| Age, years | 62.56 ± 12.84 | 62.88 ± 8.62 | 63.97 ± 9.18 | 0.972 |
| FVC%, % predicted | 76.81 ± 15.56 | 76.57 ± 14.83 | 64.33 ± 14.22 | <0.001 |
6MWD, 6-min walking distance; DLCO, diffusing lung capacity for carbon monoxide; FVC, forced vital capacity; FEV1, forced expiratory volume in 1 s; PaO2, arterial oxygen tension; TLC, total lung capacity.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Adamo, S.; Ambrosino, P.; Ricciardi, C.; Accardo, M.; Mosella, M.; Cesarelli, M.; d’Addio, G.; Maniscalco, M. A Machine Learning Approach to Predict the Rehabilitation Outcome in Convalescent COVID-19 Patients. J. Pers. Med. 2022, 12, 328. https://doi.org/10.3390/jpm12030328
