Article

Fuzzy Decision Tree Based Method in Decision-Making of COVID-19 Patients’ Treatment

1
Department of Informatics, University of Žilina, Univerzitná 8215/1, 01026 Žilina, Slovakia
2
Department of Otolaryngology and Head and Neck Surgery, Guy’s & St Thomas’ NHS Foundation Trust, Great Maze Pond, London SE1 9RT, UK
*
Author to whom correspondence should be addressed.
Mathematics 2021, 9(24), 3282; https://doi.org/10.3390/math9243282
Submission received: 24 November 2021 / Revised: 8 December 2021 / Accepted: 15 December 2021 / Published: 17 December 2021
(This article belongs to the Section Mathematics and Computer Science)

Abstract

This paper develops and discusses a new method for decision-making on the timing of tracheostomy in COVID-19 patients. Tracheostomy is performed in critically ill coronavirus disease (COVID-19) patients. Its timing is important for an anticipated prolonged ventilatory wean once levels of respiratory support are favorable. The analysis of this timing has been implemented using a classification method. One of the principal requirements for the developed classifiers was good interpretability of the result. Therefore, the proposed classifiers are decision tree based, because such classifiers are highly interpretable. The possible uncertainty of the initial data is taken into account by applying fuzzy classifiers. Two fuzzy classifiers, a Fuzzy Decision Tree (FDT) and a Fuzzy Random Forest (FRF), have been developed for decision-making in tracheostomy timing. The evaluation of the proposed classifiers and their comparison with other classifiers show the efficiency of the proposed classifiers. FDT has the best characteristics in comparison with the other classifiers.

1. Introduction

Any medical decision, be it a diagnosis or an assessment of a patient’s response to treatment, is a difficult task. Clinicians make decisions based on experience and knowledge, which are sometimes intuitive and non-formalized. A medical decision takes into account information about body temperature, blood pressure, key biomarkers, treatment history, behavioral preferences, environmental factors, demographics, genetic composition, and other useful information. This information is observed and organized from the initially collected data. Data can be structured, unstructured, or semi-structured and of different types, such as nominal, numerical, Boolean, date-time, image, text, video, or sound. Computer-aided procedures can facilitate medical decision-making, and they are usually implemented using classification techniques and methods. The classification methods used for medical decision-making should be able to complete missing values and to cope with losses of information and data during the transformation process. Such classification methods are usually developed using Machine Learning approaches [1,2,3,4].
Machine Learning methods are widely used in medicine for storing and processing patient data in the form of electronic health records [5,6], diagnostics [7,8,9], treatment support [3,4,10,11], and a personalized approach to the patient (personalized medicine) [2,10].
Most of these methods are based on classification. Classifiers of different types are widely used in many medical decision-making applications, for example: Bayes classification [12,13], Decision Tree [3,9,14,15], Artificial (Convolutional) Neural Network and Deep Learning [5,16,17,18], Linear Regression [19], and Support Vector Machine (SVM) [15]. According to [20], these classification methods can be considered either data-driven, learned from empirical observations, or knowledge-based, induced from prior insights. The most often used data-driven classification methods are based on Neural Networks or Deep Learning. Classification methods based on Decision Tree, Bayes classification, and SVM approaches are typically used to develop knowledge-based classifiers. Deep Learning approaches have become the most commonly applied in recent studies for developing classification methods, including medical applications. These methods are very effective when large quantities of data are available [21,22,23]. However, collecting such data for learning can be difficult. At the same time, the authors of [20] stress that “clinicians need complete computational transparency and reproducibility of formed decision” by a classifier or other decision support system. In these cases, knowledge-based methods for classifier development are more suitable.
A Decision Tree is one of the best ways to visualize a decision [6,24]. A Decision Tree can be interpreted in the form of ordinary human reasoning, in which the next step in the decision-making process depends on the analysis of information from previous steps. A Decision Tree is a hierarchical classifier that is very simple to understand and interpret. Another advantage of a Decision Tree is that it requires neither normalization of the input data nor special pre-processing to make the attributes independent. Nevertheless, decision interpretability is the more important advantage of a Decision Tree for the problem considered here.
There are several types of Decision Tree based classifiers. Decision Tree, Random Forest, Fuzzy Decision Tree (FDT), and Fuzzy Random Forest (FRF) are the most often used classifiers of this group [24,25,26]. All these classifiers can be built from data in which input attributes have missing values, i.e., from incompletely specified data. However, problems with the original data can also be associated with other uncertainties, such as vagueness and ambiguity, because the data for analysis (classification) in medicine are not only numerical but can also be linguistic/symbolic [27,28,29]. Ambiguity arises when a term or concept can be interpreted in more than one way. One context in medical practice in which ambiguity is typical is the medical consultation, where the interaction between clinician and patient and their utterances can be ambiguous [27,30]. Vagueness is inherent in linguistic data if the boundaries of a word’s meaning are not well defined (for example, the word “high”), or in numeric data if the values are near a boundary value (for example, if the boundary value is 35 and the values are 34.9 and 35.1). Five causes of vagueness in medicine are identified in [29], including, for example, vague terminology and the handling of vague data. Data vagueness can be taken into account in the development of a decision-making system by using methods based on fuzzy logic [28,29,31].
Fuzzy logic is widely used in medical applications, including classification methods [25,32,33]. This logic allows both numerical and linguistic data to be described and analyzed. The application of fuzzy logic in medicine makes it possible to implement uncertain or approximate reasoning in the form of computer-aided procedures and products. The authors of study [18] showed that the application of fuzzy logic can be an effective strategy in the diagnosis of COVID-19, because it is difficult to propose unambiguous guidelines for diagnosing and treating the disease during a pandemic, and fuzzy logic is able to represent reasoning that involves some uncertainty. There are studies on the use of fuzzy classifiers in COVID-19 detection and in decision-making in diagnosis and treatment [8,17,18,34,35,36,37]. The most often used classifier in these studies is the (convolutional) neural network, which is applied to the classification of X-ray [17,35,36,37] or CT [8,34] images. However, this classifier lacks the interpretability that is important in medical applications. In paper [18] the authors propose a new Hybrid Diagnose Strategy which, in addition to ranking the selected features for the classification, makes it possible to form decision rules for classification. In this paper we propose to consider other classifiers with good interpretability: FDT and FRF. These classifiers are developed to facilitate decision-making on the timing of tracheostomy for prolonged respiratory wean in critically ill COVID-19 patients. In the study [38] this problem was considered on the basis of an analysis of the relationship between the duration of mechanical ventilation prior to tracheostomy insertion and in-hospital mortality. A classification method was used to solve this problem: a decision tree was induced with the C4.5 method and used for the prediction of the tracheostomy time. In this paper this method is developed further by the application of fuzzy classifiers (in particular, FDT and FRF).
The application of a fuzzy classifier instead of a crisp classifier can increase the accuracy of the decision. The studies [39,40] on the classification and analysis of EEG signals showed that using a fuzzy classifier (FDT) instead of a crisp classifier improves classification accuracy. The method based on fuzzy classification was proposed in [39] for signal classification. The specifics of the input data (a signal) require special procedures at the signal preprocessing step, namely feature extraction and dimensionality reduction. These procedures were implemented using Welch’s Method and the Principal Component Analysis method, respectively. In this study the initial data are not signal data and have a different structure and different properties. Therefore, a new method for decision-making on the timing of tracheostomy in COVID-19 patients should be developed, which takes into consideration the specifics of COVID-19 patient data.
This paper is organized as follows. Section 2 describes the problem and the data for the analysis. Section 3 discusses the methods for FDT and FRF induction. These classifiers are developed from real records of COVID-19 patients and take into consideration all specifics of these data. Section 4 introduces the set of metrics for evaluating the proposed classifiers. Section 5 focuses on evaluating the accuracy of the induced FDT and FRF and on comparing these classifiers with others.

2. Data Collection

The data for this study were collected from real records of COVID-19 patients in the ICU at Guy’s and St Thomas’ National Health Service (NHS) Foundation Trust. The dataset consists of 177 anonymized patient records with laboratory-confirmed COVID-19 disease. These data are described by 29 input attributes, which are divided into two groups: attributes formed from baseline characteristics and attributes obtained during 14 days of hospitalization. The objective of the analysis was the prediction of a patient’s survival. Additional information about the analyzed data is given in the paper [38], where the medical aspects of these data are presented in detail.
Baseline characteristics are interpreted as input attributes that can be obtained immediately after the admission of a patient or by analysis of the patient’s health records. The attributes obtained immediately are age, gender, ethnicity, Body Mass Index (BMI), and the Acute Physiology and Chronic Health Evaluation II (APACHE II) score. The baseline characteristics taken from the patient’s health records are confirmations of diabetes mellitus, hypertension, ischemic heart disease, chronic obstructive pulmonary disease, asthma, and chronic kidney disease. One more input attribute among the baseline characteristics is thromboembolism (pulmonary, venous, or multiple). The attributes formed from baseline characteristics are categorical (except age and BMI, which are numerical).
The second group of input attributes is collected from vital signs, markers of acute respiratory failure, and serum-based biomarkers of disease severity. These attributes were measured repeatedly at different time points (after 24 h of clinical care and on days 7, 10, and 14). If a patient died or was disconnected from mechanical ventilation, the last measured values were used. These attributes are named PEEP (Positive End-Expiratory Pressure), FiO2 (Fraction of Inspired Oxygen), PaO2 (Partial Pressure of Oxygen), PF Ratio, CRP (C-reactive Protein), Ferritin, D_Dimer, Temperature, Vasopressors, RRT (Renal Replacement Therapy), and ECMO (Extracorporeal Membrane Oxygenation). The attributes of the second group are numerical.
The input attributes for the considered problem are categorical and numerical. The categorical data about a patient can be ambiguous, for example because information in health records cannot always be interpreted unambiguously. The measured numerical data can be vague, for example because of measuring device error. Therefore, fuzzy classification is more suitable for the considered problem of facilitating decision-making on the optimal timing of tracheostomy for prolonged respiratory wean in critically ill COVID-19 patients.

3. Fuzzy Classifier to Facilitate Decision-Making in Treatment of Patients with Coronavirus Pneumonia

The diagnosis of COVID-19 is a relevant problem that is investigated intensively with Machine Learning approaches. Classification methods underlie the decision-making procedures for COVID-19 diagnosis [8]. The method for classifier induction is the principal step in the development of a decision-making procedure. In some studies of COVID-19, fuzzy classifiers are developed. For example, in [18] COVID-19 patients were detected with a fuzzy inference engine. Other studies also use the fuzzy approach [17,18,34,35,37]. These studies indicate that fuzzy classification can give better results than crisp classification methods. However, investigations of fuzzy classification for decision-making support in COVID-19 treatment are not sufficient [38,40,41].
We propose fuzzy classification based on FDT for decision-making on the optimal timing of tracheostomy insertion in COVID-19 patients as one of the possible fuzzy classification-based methods. This investigation was started in [38], where the decision-making procedure for patients’ treatment was based on a decision tree (the C4.5 algorithm was used for classifier induction).

3.1. Fuzzy Classifier Induction

The use of a fuzzy classifier requires the initial data to be represented in the form of fuzzy data [39]. Data fuzzification allows the uncertainty of the initial data to be taken into account. Fuzzy data not only provide a confident description of the initial data but also offer results in the form most consistent with conventional reasoning [42]. If the initial data are not fuzzy, they must be transformed by a special preprocessing procedure, which is implemented as fuzzification. Preprocessing (fuzzification) is important for extracting more detailed information from the data and for taking into account the possible ambiguity, vagueness, and impurity of the initial data. Therefore, preprocessing in the form of initial data fuzzification is included in the method for fuzzy classifier design (Figure 1). The output of the fuzzification is fuzzy data, which are interpreted as input attributes for the classification.
A similar concept of fuzzy classifier application was used in study [39] for EEG signal classification. However, the preprocessing of a signal for a fuzzy classifier has a different structure and includes additional procedures of feature extraction (based on the Welch transformation) and dimensionality reduction (based on the Principal Component Analysis method).
Fuzzification and the fuzzy classifier can be implemented using different approaches. In this investigation the Fuzzy c-means (FCM) algorithm is used for initial data fuzzification. FDT and FRF are proposed for the classification as decision tree based classifiers with good interpretability and visibility of the result.

3.2. Fuzzification of Initial Data

We propose to use a fuzzy classifier to solve the task of tracheostomy timing. For this purpose, the initial numeric data have to be transformed into fuzzy data. This allows the uncertainty of the initial data to be taken into account. Such a transformation process is called fuzzification. Fuzzification replaces a numeric attribute $X_i$ with a fuzzy attribute $A_i$ ($i = 1, \ldots, n$), which is described by $m_i$ ($m_i \ge 2$) linguistic terms. These linguistic terms are described as fuzzy sets. Suppose we have $L$ instances of the numerical attribute, given as a vector of real values $(x_1, x_2, \ldots, x_l, \ldots, x_L)$. Fuzzification transforms the $l$-th numeric value of this vector ($l = 1, \ldots, L$) into membership degrees of the $m_i$ linguistic terms. As a result, the value of an instance of the fuzzy attribute is determined by the values of the membership functions of each of the $m_i$ linguistic terms. The $j$-th linguistic term of $A_i$ is represented by the fuzzy set $A_{i,j}$ ($j = 1, \ldots, m_i$). The fuzzy set $A_{i,j}$ with respect to $X_i$ is defined by a membership function $\mu_{A_{i,j}}(x): X_i \to [0, 1]$. The membership function gives a membership degree for each $x$ ($x \in X_i$), which defines how strongly the element $x$ is a member of the fuzzy set $A_{i,j}$. Formally, the fuzzy set $A_{i,j}$ is defined as a set of ordered pairs $A_{i,j} = \{(x, \mu_{A_{i,j}}(x)),\ x \in X_i\}$. The following three rules are accepted in this notation:
(a)
$\mu_{A_{i,j}}(x) = 0$ if and only if $x$ is not a member of the set $A_{i,j}$;
(b)
$0 < \mu_{A_{i,j}}(x) < 1$ if and only if $x$ is a partial (not full) member of the set $A_{i,j}$;
(c)
$\mu_{A_{i,j}}(x) = 1$ if and only if $x$ is a full member of the set $A_{i,j}$.
Fuzzification of initial data can be conducted using various methods. Fuzzy c-means (FCM) clustering is one of the promising fuzzification approaches [43]. The FCM algorithm assigns each instance to several clusters with some partition degrees. The partition degree $u_{k,j}$ serves as the fuzzy membership function. These clusters are interpreted as the linguistic terms of the fuzzy attribute $A_i$, and the partition degrees $u_{k,j}$ are interpreted as the membership degrees of the instances to the linguistic terms $A_{i,j}$ of this attribute.
As a result, fuzzification yields data that can be used by fuzzy classifiers.
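The fuzzification step can be illustrated with a minimal sketch. It assumes a one-dimensional FCM with the standard membership and centroid updates, a fuzzifier of 2, evenly spread initial centroids, and made-up temperature readings. The function names and all numeric values are assumptions for illustration, not the settings used in this study.

```python
def memberships_for(x, centroids, fuzzifier=2.0):
    """Membership degrees of one value x to every cluster (they sum to 1)."""
    dists = [abs(x - c) for c in centroids]
    if 0.0 in dists:
        # x coincides with a centroid: it is a full member of that cluster only.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (fuzzifier - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuzzy_c_means_1d(values, n_clusters=3, fuzzifier=2.0, max_iter=100, tol=1e-9):
    """Return (centroids, memberships) for a list of numeric values."""
    lo, hi = min(values), max(values)
    # Assumed initialisation: centroids spread evenly over the value range.
    centroids = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    for _ in range(max_iter):
        u = [memberships_for(x, centroids, fuzzifier) for x in values]
        updated = []
        for k in range(n_clusters):
            weights = [row[k] ** fuzzifier for row in u]
            updated.append(sum(w * x for w, x in zip(weights, values)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            break
    return centroids, [memberships_for(x, centroids, fuzzifier) for x in values]


# Hypothetical temperature readings (degrees Celsius), not taken from the dataset.
temperatures = [36.4, 36.6, 36.8, 37.9, 38.1, 38.3, 39.6, 39.8, 40.0]
centroids, memberships = fuzzy_c_means_1d(temperatures, n_clusters=3)
```

Each row of `memberships` plays the role of the membership degrees of one instance to the linguistic terms of the fuzzy attribute, and each centroid can serve as the label of the corresponding term.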

3.3. Fuzzy Decision Tree Induction

Fuzzy decision trees (FDT) are the extension and development of traditional decision trees by fuzzy logic. The essential practical difference between FDT and DT can be seen in the classification of a new instance. In both types of these trees, a classified instance passes down the tree until the instance comes to the leaf. In the traditional decision tree, the classification result is determined by one leaf only. The new instance can pass down the tree by multiple branches with corresponding membership degrees in the case of FDT. Therefore, the classification results by FDT are calculated by several leaves.
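How an instance can reach several leaves at once is sketched below. The tree layout, the attribute and term names, the leaf confidences, and the way branch memberships are combined (product along the path, weighted sum over leaves) are all assumptions made for illustration; the induction and classification details of the actual method are given in the cited references.

```python
def classify(node, instance, weight=1.0, totals=None):
    """Accumulate class membership degrees over every leaf the instance reaches."""
    if totals is None:
        totals = {}
    if "classes" in node:
        # Leaf: contribute its class confidences, scaled by the path weight.
        for label, confidence in node["classes"].items():
            totals[label] = totals.get(label, 0.0) + weight * confidence
        return totals
    term_memberships = instance[node["attribute"]]
    for term, child in node["branches"].items():
        branch_weight = weight * term_memberships.get(term, 0.0)
        if branch_weight > 0.0:
            classify(child, instance, branch_weight, totals)
    return totals


# Hypothetical two-level tree; the numbers are illustrative only.
tree = {
    "attribute": "CRP",
    "branches": {
        "low": {"classes": {"Survived": 0.9, "Died": 0.1}},
        "high": {
            "attribute": "Age",
            "branches": {
                "young": {"classes": {"Survived": 0.7, "Died": 0.3}},
                "old": {"classes": {"Survived": 0.2, "Died": 0.8}},
            },
        },
    },
}

# A fuzzified instance: membership degrees to the terms of each attribute.
patient = {
    "CRP": {"low": 0.3, "high": 0.7},
    "Age": {"young": 0.4, "old": 0.6},
}

result = classify(tree, patient)
```

In a crisp decision tree exactly one of the three leaves would determine the result; here all three contribute in proportion to the membership degrees along their branches.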
There are several algorithms for building FDTs [44,45,46]. An information measure based on Cumulative Mutual Information (CMI) is used in our FDT algorithm. The CMI in the target attribute $B$, based on knowledge about the input attribute $A_{i_q}$ and the sequence of values $U_{q-1}$ of the previous input attributes, was introduced in [46] as follows:
$$I(B; U_{q-1}, A_{i_q}) = \sum_{j_q=1}^{m_{i_q}} \sum_{j=1}^{m_b} M(B_j \times U_{q-1} \times A_{i_q,j_q}) \times \left( \log_2 M(B_j \times U_{q-1} \times A_{i_q,j_q}) + \log_2 M(U_{q-1}) - \log_2 M(B_j \times U_{q-1}) - \log_2 M(U_{q-1} \times A_{i_q,j_q}) \right) \tag{1}$$
where $U_{q-1} = \{A_{i_1,j_1} \times \cdots \times A_{i_{q-1},j_{q-1}}\}$ is the fuzzy set defined by the sequence of fuzzy terms $A_{i_1,j_1} \times \cdots \times A_{i_{q-1},j_{q-1}}$ of the selected attributes $A_{i_1}, \ldots, A_{i_{q-1}}$ from the root to the $q$-th node; $M(B_j \times U_{q-1} \times A_{i_q,j_q})$ is a measure of cardinality of the fuzzy set $B_j \times U_{q-1} \times A_{i_q,j_q}$.
The criterion that selects the splitting attribute for a node of the FDT is defined as follows:
$$i_q = \operatorname{argmax}\left( I(B; U_{q-1}, A_{i_q}) \,/\, H(A_{i_q} \mid U_{q-1}) \right), \tag{2}$$
where the function argmax returns the attribute index $i_q$ with the maximal value; $H(A_{i_q} \mid U_{q-1})$ is the cumulative conditional entropy. This entropy is defined between the fuzzy attribute $A_{i_q}$ and the sequence of selected attribute terms $U_{q-1}$ as follows:
$$H(A_{i_q} \mid U_{q-1}) = \sum_{j=1}^{m_{i_q}} M(A_{i_q,j} \times U_{q-1}) \times \left( \log_2 M(U_{q-1}) - \log_2 M(A_{i_q,j} \times U_{q-1}) \right) \tag{3}$$
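The splitting criterion can be sketched in a few lines. The sketch assumes that the cardinality measure $M$ is the sigma-count (the sum over instances of the product of membership degrees), which this section does not state explicitly, and it treats terms with zero cardinality as contributing nothing. The function names and the toy data are hypothetical.

```python
from math import log2


def cardinality(*columns):
    """Assumed cardinality: sum over instances of the product of memberships."""
    total = 0.0
    for row in zip(*columns):
        product = 1.0
        for mu in row:
            product *= mu
        total += product
    return total


def cumulative_mutual_information(B, U, A):
    """CMI of target B and candidate attribute A, given the path memberships U."""
    m_u = cardinality(U)
    cmi = 0.0
    for a_col in A.values():
        m_ua = cardinality(U, a_col)
        for b_col in B.values():
            m_bu = cardinality(b_col, U)
            m_bua = cardinality(b_col, U, a_col)
            if m_bua > 0.0:  # skip empty intersections (0 * log 0 taken as 0)
                cmi += m_bua * (log2(m_bua) + log2(m_u) - log2(m_bu) - log2(m_ua))
    return cmi


def cumulative_conditional_entropy(U, A):
    """Cumulative conditional entropy of attribute A given the path U."""
    m_u = cardinality(U)
    entropy = 0.0
    for a_col in A.values():
        m_au = cardinality(a_col, U)
        if m_au > 0.0:
            entropy += m_au * (log2(m_u) - log2(m_au))
    return entropy


def select_split(B, U, candidates):
    """Return the name of the candidate attribute with the largest CMI / entropy."""
    def score(name):
        h = cumulative_conditional_entropy(U, candidates[name])
        return cumulative_mutual_information(B, U, candidates[name]) / h if h > 0.0 else 0.0
    return max(candidates, key=score)


# Toy data for four instances at the root, where every path membership is 1.
U = [1.0, 1.0, 1.0, 1.0]
B = {"Survived": [1.0, 1.0, 0.0, 0.0], "Died": [0.0, 0.0, 1.0, 1.0]}
candidates = {
    "X1": {"low": [1.0, 1.0, 0.0, 0.0], "high": [0.0, 0.0, 1.0, 1.0]},  # informative
    "X2": {"low": [1.0, 0.0, 1.0, 0.0], "high": [0.0, 1.0, 0.0, 1.0]},  # uninformative
}
best = select_split(B, U, candidates)
```

On this toy data the attribute `X1`, whose terms coincide with the classes, obtains the larger ratio and is selected.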
Dividing the CMI (1) by the entropy (3) in criterion (2) eliminates an important drawback of CMI: the CMI criterion tends to prefer a splitting attribute with a large set of linguistic values. Using the entropy (3) solves this problem by taking into account the number of branches that would be created after the split. To avoid overfitting to the initial data, the FDT induction is stopped in two cases:
(a)
If the confidence degree $b_j$ of the analyzed node is bigger than an a priori chosen parameter $\beta$. This confidence degree $b_j$ reflects the confidence of the decision that the target attribute belongs to the $j$-th class. This degree can be calculated as:
$$b_j = \frac{M(B_j \times U_{q-1} \times A_{i_q,j_q})}{M(U_{q-1} \times A_{i_q,j_q})};$$
(b)
If the frequency $f(U_q)$ of the branch defined by the sequence of fuzzy terms $U_q = \{A_{i_1,j_1} \times \cdots \times A_{i_q,j_q}\}$ is less than or equal to an a priori chosen parameter $\alpha$. The frequency $f(U_q)$ can be calculated as follows:
$$f(U_q) = M(U_{q-1} \times A_{i_q,j_q}) / L,$$
where $L$ denotes the number of instances in the dataset.
Two pre-pruning parameters, $\alpha$ and $\beta$, are used in this algorithm for FDT building. According to these parameters, the algorithm can stop FDT induction in a branch. If the frequency of the branch does not exceed the value of $\alpha$, FDT building is stopped in this branch. The second parameter, $\beta$, expresses a sufficient confidence degree for the classes in the node. If at least one of the confidences $b_j$ for the $j$-th class in the node is bigger than the parameter $\beta$, then FDT building in this node is terminated and the node becomes a leaf.
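The two stopping rules can be written as a small check. This is a sketch only: the function name, the argument layout, and the threshold values 0.05 and 0.9 are hypothetical, and the cardinalities are assumed to have been computed beforehand.

```python
def should_stop(class_cardinalities, branch_cardinality, n_instances, alpha, beta):
    """Return (stop, reason) for a branch of the tree.

    class_cardinalities[j] is the cardinality of class j within the branch,
    branch_cardinality is the cardinality of the branch itself.
    """
    # Rule (b): the branch covers too small a share of the data.
    frequency = branch_cardinality / n_instances
    if frequency <= alpha:
        return True, "frequency"
    # Rule (a): at least one class is already predicted with enough confidence.
    confidences = [m / branch_cardinality for m in class_cardinalities]
    if max(confidences) > beta:
        return True, "confidence"
    return False, None


# Hypothetical thresholds and cardinalities for a dataset of 100 instances.
confident_branch = should_stop([9.5, 0.5], 10.0, 100, alpha=0.05, beta=0.9)
rare_branch = should_stop([2.0, 2.0], 4.0, 100, alpha=0.05, beta=0.9)
open_branch = should_stop([6.0, 4.0], 10.0, 100, alpha=0.05, beta=0.9)
```

The frequency rule is checked first so that a branch with zero cardinality never reaches the division in the confidence rule.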
The FDT obtained for the prediction of COVID-19 positive patients is shown in Figure 2. This FDT consists of 22 leaves. The detailed steps and examples of the FDT building are in [46,47]. The FDT in Figure 2 consists of two kinds of nodes: leaves and decision nodes. Each leaf and decision node has three rows. In the decision nodes, the first row reports the name of the associated splitting input attribute. In the leaves, the first row reports the target class. The following two rows have identical meanings for both kinds of nodes. The second row shows the frequency (percentage of covered instances) of the branch which comes to the node. The third row reports the membership degrees to two target classes (the first number—Survived, second number—Died). The labels associated with branches of the tree are:
(a)
For numeric attributes: the values of centroids obtained after fuzzification by FCM. Branches with the label NaN cover missing values.
(b)
For linguistic attributes, these labels represent names of linguistic terms.

3.4. Fuzzy Random Forest

Fuzzy Random Forest (FRF) is a new variant of random forest that is based on bagging [48] and random attribute selection. FRF extends traditional random forest algorithms with fuzzy logic. The number of trees and the number of randomly selected attributes for each split should be defined as input parameters. The splitting criterion of the trees in the forest is based on CMI. FRF consists of a defined number of FDTs. Each of these FDTs provides a classification result in the form of membership degrees to the individual classes. The decision of the forest is obtained by combining the decisions of the individual FDTs (Figure 3). This combination is achieved by summing the membership degrees for each class obtained by each tree in the forest. The resulting membership degrees are obtained by dividing the summed memberships by the number of trees in the forest.
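The combination rule described above amounts to averaging the per-class membership degrees over the trees. A minimal sketch, with a hypothetical function name and made-up tree outputs:

```python
def forest_decision(tree_outputs):
    """Average the class membership degrees returned by the individual trees."""
    n_trees = len(tree_outputs)
    summed = {}
    for output in tree_outputs:
        for label, degree in output.items():
            summed[label] = summed.get(label, 0.0) + degree
    return {label: total / n_trees for label, total in summed.items()}


# Hypothetical outputs of three fuzzy decision trees for one patient.
outputs = [
    {"Survived": 0.8, "Died": 0.2},
    {"Survived": 0.6, "Died": 0.4},
    {"Survived": 0.4, "Died": 0.6},
]
decision = forest_decision(outputs)
predicted_class = max(decision, key=decision.get)
```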
The number of FDTs in the forest should be defined as an input parameter. We used an iterative procedure to find the smallest number of FDTs that gives the largest classification accuracy. This procedure starts the FRF with three FDTs. At the start of each iteration, the number of FDTs is incremented by 1, and the accuracy of the new FRF is evaluated. This procedure continues until the classification accuracy stops rising.
To evaluate the proposed method, we compared it with well-known fuzzy and crisp classification algorithms. The next section presents the evaluation metrics used to compare the classification algorithms.
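The iterative search for the forest size can be sketched as follows. The callback `evaluate_accuracy`, the safety cap `max_trees`, and the accuracy values in the example are assumptions; in practice the callback would train an FRF with the given number of trees and return its measured accuracy.

```python
def find_forest_size(evaluate_accuracy, start=3, max_trees=200):
    """Grow the forest one tree at a time until accuracy stops rising."""
    best_n = start
    best_accuracy = evaluate_accuracy(start)
    n = start
    while n < max_trees:  # max_trees is an assumed safety cap
        n += 1
        accuracy = evaluate_accuracy(n)
        if accuracy <= best_accuracy:
            break  # accuracy no longer rises: keep the previous forest size
        best_n, best_accuracy = n, accuracy
    return best_n, best_accuracy


# Made-up accuracy curve standing in for repeated FRF training and evaluation.
curve = {3: 0.70, 4: 0.74, 5: 0.78, 6: 0.78, 7: 0.80}
size, accuracy = find_forest_size(lambda n: curve[n])
```

Note that a stopping rule of this kind ends at the first plateau, so a later improvement (such as the value for seven trees in the made-up curve) is not explored.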

4. The Evaluation of Classification

A prediction in our task has two possible outcomes. Therefore, the evaluation of the proposed method can be based on a confusion matrix for binary classification. In predictive analysis, this matrix has two rows and two columns that contain four counts: the number of True Positives (TP, correctly classified as positive), False Positives (FP, incorrectly classified as positive), False Negatives (FN, incorrectly classified as negative), and True Negatives (TN, correctly classified as negative) [49].
In our case, True and False should be interpreted as a correct and an incorrect prognosis for a new instance. Positive and Negative correspond to the situations patient Died and patient Survived, respectively. So, a situation in which a patient has received a prognosis of death belongs to class TP if the patient really died and to class FP if the patient survived. Similarly, a situation in which a patient has received a prognosis of survival belongs to class TN if the patient really survived and to class FN if the patient died.
These four situations cover all possible prediction results. Their use allows a more detailed analysis than a calculation of the simple proportion of correct predictions. To evaluate the proposed method, we used the following metrics, all computable from the confusion matrix: Accuracy, Specificity, Sensitivity, Balanced accuracy, Precision, F1-score, Matthews correlation coefficient, Youden’s J statistic, Negative Predictive Value, and Diagnostic Odds Ratio.
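The mapping of outcomes to the four cells, with Died as the positive class, can be sketched as below. The function name and the eight example outcomes are hypothetical.

```python
def confusion_counts(actual, predicted, positive="Died"):
    """Count TP, FP, FN and TN, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted death, patient died
        elif p == positive:
            fp += 1  # predicted death, patient survived
        elif a == positive:
            fn += 1  # predicted survival, patient died
        else:
            tn += 1  # predicted survival, patient survived
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Made-up outcomes for eight patients.
actual = ["Died", "Died", "Died", "Survived", "Survived", "Survived", "Survived", "Survived"]
predicted = ["Died", "Died", "Survived", "Died", "Survived", "Survived", "Survived", "Survived"]
counts = confusion_counts(actual, predicted)
```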

4.1. Accuracy

Accuracy is regarded as the basic and simplest metric for evaluating prediction (classification) performance. It is the ratio of correctly predicted instances to all predicted instances:
$$Acc = \frac{TP + TN}{TP + TN + FP + FN}.$$
The value of accuracy is between 0 and 1. The perfect accuracy is represented with value 1. The value equal to 0 represents the worst possible accuracy of prediction when each instance is predicted to the wrong class. This metric suffers from biased results in the case of unbalanced data (the number of instances in classes is different). Therefore, it is often combined with other metrics.

4.2. Specificity

Specificity is the proportion of patients who actually survived whose initial prognosis was correct. In other words, it is the share of surviving patients who received an optimistic prognosis (Survived). Specificity is closely related to the algorithm’s ability to correctly identify surviving patients. Mathematically, this can be written as:
$$Spec = \frac{TN}{TN + FP}.$$
The value of specificity is between 0 and 1. Perfect specificity is represented by 1. In this case, every surviving patient had the correct initial prognosis Survived. The worst value of specificity is 0. This value corresponds to the situation in which every surviving patient received an incorrect initial prognosis.
Prediction algorithms with high specificity are useful for evaluating the outlook of the disease. With such an algorithm, a prediction that the patient will die signifies a high probability of that outcome.

4.3. Sensitivity

Sensitivity, or Recall, is the probability of a correct initial prognosis of patient death. Sensitivity refers to the ability of the prediction to correctly detect the death of patients who are in a critical stage. Mathematically, this is expressed as:
$$Sens = \frac{TP}{TP + FN}.$$
The value of sensitivity is between 0 and 1. Perfect sensitivity is represented by 1. In this case, every patient who died had a correct a priori prognosis of that outcome. The worst value of sensitivity is 0. This value corresponds to the situation in which every patient who died had been classified a priori as Survived.
A prognosis “Survived” obtained from a prediction algorithm with high sensitivity is useful when the outlook of the disease is optimistic. Such algorithms rarely miss patients with a poor outlook, so a Survived prognosis is seldom followed by death. A prediction algorithm with 100% sensitivity will recognize all patients with a non-optimistic outlook in advance.

4.4. Balanced Accuracy

Accuracy is considered as the fundamental metric in classifier evaluation. However, this metric is not suitable when the classes are imbalanced, i.e., one of the two classes appears a lot more often than the other. This happens often in many cases such as anomaly detection. In such a situation, a Balanced Accuracy (Bacc) is more suitable. This metric can be calculated as the arithmetic mean of specificity and sensitivity:
$$Bacc = \frac{Sens + Spec}{2}.$$
The values of balanced accuracy are set in interval [0, 1]. The bigger value of this metric indicates that the classifier can assign instances to the target class with bigger accuracy with respect to the size of the classes. Balanced accuracy is a metric that one can use when evaluating how good a binary classifier is.
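The difference between accuracy and balanced accuracy on imbalanced data can be seen in a short worked example. The counts are made up: 90 survivors, 10 deaths, and a trivial classifier that predicts Survived for everyone. The function names are hypothetical.

```python
def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + tn + fp + fn)


def specificity(tn, fp):
    return tn / (tn + fp)


def sensitivity(tp, fn):
    return tp / (tp + fn)


def balanced_accuracy(tp, fp, fn, tn):
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2.0


# Trivial classifier on made-up imbalanced data: everyone is predicted to survive.
tp, fp, fn, tn = 0, 0, 10, 90

acc = accuracy(tp, fp, fn, tn)            # looks good despite missing every death
sens = sensitivity(tp, fn)                # no death is detected
spec = specificity(tn, fp)                # every survivor is identified
bacc = balanced_accuracy(tp, fp, fn, tn)  # exposes the chance-level behaviour
```

Accuracy is 0.9 here even though no death is predicted, whereas balanced accuracy is 0.5.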

4.5. Precision

Precision is the probability that an instance predicted as Died has really died. Therefore, precision is the proportion of Died prognoses that are correct among all Died prognoses. It can be calculated as:
$$Prec = \frac{TP}{TP + FP}.$$
The value of precision is between 0 and 1. A bigger value represents a bigger probability that an instance predicted as Died has really died.

4.6. F1-Score

The F-score is a commonly used metric from information retrieval. This metric is the harmonic mean of sensitivity and precision:
$$F_{sc} = \frac{2\,TP}{2\,TP + FP + FN}.$$
The F-score, like sensitivity and precision, considers only the predictions of death, with sensitivity being the probability that an actual death is predicted, and precision being the probability that a prediction of Died is correct. The F-score equates these probabilities under the effective assumption that actual deaths and death predictions should have the same distribution and prevalence.
The highest possible value of the F-score is 1.0, indicating perfect precision and sensitivity. The lowest possible value is 0, reached if either the precision or the sensitivity is zero.
These three metrics (Sensitivity, Precision, and F1-score) completely ignore instances with correctly made Survived prognoses (True Negatives). The Matthews correlation coefficient and Youden’s index are simple and efficient classification metrics that are more relevant if Survived prognoses are also to be analyzed.
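A short check, on made-up counts, that the count-based F-score formula agrees with the harmonic mean of precision and sensitivity, and that the number of True Negatives does not affect any of the three metrics. The function names are hypothetical.

```python
def precision(tp, fp):
    return tp / (tp + fp)


def sensitivity(tp, fn):
    return tp / (tp + fn)


def f_score(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)


def harmonic_mean(a, b):
    return 2 * a * b / (a + b)


# Made-up counts; TN does not appear in any of the formulas above.
tp, fp, fn = 6, 2, 4

prec = precision(tp, fp)    # 6 / 8
sens = sensitivity(tp, fn)  # 6 / 10
f1 = f_score(tp, fp, fn)    # 12 / 18
```

Because none of the functions takes TN as an argument, adding any number of correctly predicted survivors leaves all three values unchanged.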

4.7. Matthews Correlation Coefficient

The Matthews correlation coefficient (MCC) is a correlation coefficient between the observed and predicted classifications:
$$MCC = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}.$$
This coefficient takes into account correct and incorrect predictions of death and survival. MCC is generally regarded as a balanced measure that can be used even if the classes are of very different sizes. The values obtained by this metric are between −1 and +1. The value of MCC equal to +1 represents a perfect prediction; 0 is no better than a random prediction, and −1 indicates total disagreement between prediction and observation. MCC is widely used in the field of Bioinformatics and Machine Learning.
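The three reference values of MCC can be reproduced on made-up counts. Returning 0 when the denominator vanishes is a common convention assumed here, not something stated in this paper; the function name is hypothetical.

```python
from math import sqrt


def matthews_corrcoef(tp, fp, fn, tn):
    """MCC from confusion-matrix counts; 0.0 is returned for a zero denominator."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0.0:
        return 0.0  # assumed convention for a degenerate confusion matrix
    return (tp * tn - fp * fn) / denominator


perfect = matthews_corrcoef(tp=5, fp=0, fn=0, tn=5)       # every prognosis correct
inverted = matthews_corrcoef(tp=0, fp=5, fn=5, tn=0)      # every prognosis wrong
trivial = matthews_corrcoef(tp=0, fp=0, fn=10, tn=90)     # always predicts Survived
typical = matthews_corrcoef(tp=6, fp=2, fn=4, tn=8)       # mixed result
```

The trivial always-Survived classifier, which reaches an accuracy of 0.9 on these counts, gets an MCC of 0 under the assumed convention.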

4.8. Youden’s J Statistic

Youden’s J statistic (YJs) is the probability of an informed decision (as opposed to a random guess) and takes into account all predictions. This metric is summarizing the performance of a prediction algorithm and is calculated by the next rule:
$$ YJs = Sens + Spec - 1 $$
In general, the value of this metric ranges from −1 to 1; for any algorithm that is at least as good as chance it lies between 0 and 1. The value is 0 when the prediction algorithm gives the died prognosis in the same proportion for patients who died and for patients who survived. In this case, the prediction algorithm is useless. A value of 1 indicates that there are no incorrect predictions, i.e., the prediction algorithm is perfect. The metric gives equal weight to incorrect predictions of both kinds, died and survived. So, all prediction algorithms with the same value of the metric give the same proportion of total mispredicted results.
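A short sketch makes the two reference cases concrete. The function name `youden_j` is hypothetical; the first call uses the FDT sensitivity and specificity reported in Table 1, and the second describes an assumed degenerate classifier that always predicts died.

```python
def youden_j(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: Sens + Spec - 1."""
    return sensitivity + specificity - 1.0


# FDT values as reported in Table 1 (Sens = 0.921, Spec = 0.563).
j_fdt = round(youden_j(0.921, 0.563), 3)

# A classifier that always predicts "died" finds every death (Sens = 1)
# but never recognises a survivor (Spec = 0), so it carries no information.
j_always_died = youden_j(1.0, 0.0)
```

The first value agrees with the YJs entry for FDT in Table 1 (0.484); the second is exactly 0.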

4.9. Positive and Negative Predictive Values

The Positive (PPV) and Negative Predictive Values (NPV) are the proportions of died and survived prognoses, respectively, that are correct. The PPV and NPV describe the performance of a prediction algorithm. A high value can be interpreted as indicating the accuracy of the algorithm. The PPV and NPV are not intrinsic to the algorithm alone; they also depend on the prevalence [50]. Both PPV and NPV can be derived using Bayes’ theorem. PPV is also called precision, which was described earlier in this paper. The NPV is defined as:
$$ NPV = \frac{TN}{TN + FN} . $$
In our case, a True Negative is the event that the prediction algorithm predicts a patient’s survival and this prognosis is correct. A False Negative is the event that the algorithm predicts a patient’s survival but the patient died. With a perfect algorithm, one which returns no false survival prognosis, the value of the NPV is 1; with an algorithm whose survival prognoses are all incorrect, the NPV is 0.
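The dependence of the predictive values on prevalence can be made explicit. As a sketch, writing $p$ for the prevalence of the died class (a symbol introduced here for illustration, not used elsewhere in the paper), Bayes’ theorem expresses both values through sensitivity and specificity:

```latex
PPV = \frac{Sens \cdot p}{Sens \cdot p + (1 - Spec)(1 - p)}, \qquad
NPV = \frac{Spec \cdot (1 - p)}{Spec \cdot (1 - p) + (1 - Sens) \cdot p}.
```

For fixed $Sens$ and $Spec$, the PPV grows and the NPV falls as $p$ increases, so the same algorithm can show different predictive values on datasets with different proportions of died patients.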

4.10. Diagnostic Odds Ratio

The Diagnostic Odds Ratio (DOR) is the ratio of the odds of a died prediction for patients who died to the odds of a died prediction for patients who survived:
$$ DOR = \frac{TP / FN}{FP / TN} . $$
Thus, this measure includes information about both sensitivity and specificity and tends to be reasonably constant. The rationale for the diagnostic odds ratio application is that it is a single indicator of prognostic algorithm performance (like accuracy and Youden’s J statistic), but which is independent of prevalence (unlike accuracy) and is presented as an odds ratio, which is familiar to medical practitioners.
The DOR depends significantly on the sensitivity and specificity of an algorithm. A prediction algorithm with high specificity and sensitivity, and therefore low rates of False Positives and False Negatives, has a high DOR. With the same sensitivity of the prediction algorithm, the DOR increases with the increase in the algorithm’s specificity. The DOR does not depend on the number of instances [51].
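The independence from prevalence can be checked numerically. The sketch below uses hypothetical counts and hypothetical function names; it relies only on the identities TP/FN = Sens/(1 − Sens) and FP/TN = (1 − Spec)/Spec, which follow from the definitions of sensitivity and specificity.

```python
def dor_from_counts(tp: int, fn: int, fp: int, tn: int) -> float:
    """Diagnostic odds ratio from counts (undefined if fn, fp or tn is zero)."""
    return (tp / fn) / (fp / tn)


def dor_from_rates(sensitivity: float, specificity: float) -> float:
    """The same ratio written in terms of sensitivity and specificity."""
    return (sensitivity / (1 - sensitivity)) / ((1 - specificity) / specificity)


# Hypothetical counts: Sens = 90/100 = 0.9, Spec = 30/50 = 0.6.
base = dor_from_counts(tp=90, fn=10, fp=20, tn=30)

# Ten times as many died patients with the same sensitivity:
# the prevalence changes, the DOR does not.
more_deaths = dor_from_counts(tp=900, fn=100, fp=20, tn=30)
```

Both calls return the same value, which also matches `dor_from_rates(0.9, 0.6)` up to floating-point rounding.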
The metrics described in this section were used to evaluate the proposed method for predicting the survival of patients with COVID-19 and to compare this method with existing ones. The description and results of the evaluation and comparison are presented in the following section.

5. Accuracy Analysis of Inducted Fuzzy Classifiers

The main novelty of the suggested fuzzy method for survival prediction of COVID-19 patients is the addition of fuzzification as a preprocessing procedure. The advantage of fuzzified data is most apparent when an exact decision cannot be made in a borderline situation. For example, suppose we need to decide whether a person is young or old, and the person is 40 years old. We cannot say that this person is totally young or totally old. Fuzzy logic can assign this person to both sets, young and old, with some membership degrees.
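The age example can be sketched with a pair of piecewise-linear membership functions. The breakpoints of 30 and 50 years and the function names are assumptions made purely for illustration; the fuzzification model used in the paper is derived from the training data and is not reproduced here.

```python
def membership_young(age: float, full: float = 30.0, zero: float = 50.0) -> float:
    """Degree to which `age` belongs to the fuzzy set 'young'.

    Fully young up to `full`, not young at all from `zero` on,
    linear in between. Both breakpoints are illustrative assumptions.
    """
    if age <= full:
        return 1.0
    if age >= zero:
        return 0.0
    return (zero - age) / (zero - full)


def membership_old(age: float) -> float:
    """Complementary fuzzy set 'old' (memberships sum to 1 in this sketch)."""
    return 1.0 - membership_young(age)


age = 40
degrees = {"young": membership_young(age), "old": membership_old(age)}
```

With these assumed breakpoints, a 40-year-old belongs to both sets with degree 0.5, whereas a crisp threshold would force the person into exactly one set.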
The effectiveness of the proposed method is assessed by comparison with other existing prediction algorithms. The comparative analysis is based on the dataset introduced in [38] for survival prediction of COVID-19 positive patients. The data are described by 29 input attributes and one target attribute (Survival). The dataset consists of 177 instances. The prediction was evaluated by cross validation, a resampling technique used to evaluate decision models on a finite dataset. This technique has one parameter: the number of groups into which the dataset is split. These groups are used for creating the training and testing datasets. The first group is used to validate the trained model and the remaining groups are used to train it. When the first validation is finished, the second group is used for validation and the first and the remaining groups are used for training. This is repeated until every group has been used for validation; the cross validation is then finished, and the results of all validations are averaged. We set the number of groups equal to the number of instances. Therefore, each testing dataset consisted of one sample and each training dataset of 176 samples. So, we trained each model 177 times and each time classified the one instance that was not included in the training dataset. This setting is known in the literature as leave-one-out cross validation.
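The procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors’ implementation: `train_majority` is a deliberately trivial stand-in learner, and the five-instance dataset is invented.

```python
from collections import Counter
from typing import Callable, List, Sequence


def train_majority(features: Sequence, labels: Sequence[str]) -> Callable:
    """Stand-in learner: always predicts the most frequent training label."""
    most_common_label = Counter(labels).most_common(1)[0][0]
    return lambda instance: most_common_label


def leave_one_out_accuracy(features: List, labels: List[str], train: Callable) -> float:
    """Train on all instances but one, test on the held-out one, repeat for each."""
    correct = 0
    for i in range(len(labels)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train(train_x, train_y)        # the model never sees instance i
        if model(features[i]) == labels[i]:
            correct += 1
    return correct / len(labels)


# Invented toy data: five instances, one attribute each.
features = [[0], [1], [2], [3], [4]]
labels = ["died", "died", "died", "survived", "died"]
accuracy = leave_one_out_accuracy(features, labels, train_majority)
```

On a dataset of 177 instances the loop would run 177 times, each time with 176 training instances. The toy run also shows why accuracy alone can mislead: the stand-in learner reaches 0.8 without ever predicting survived.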
The models used in our analysis have several input parameters. The values of these parameters were established experimentally. We iteratively ran leave-one-out cross validation for each model with a given configuration of input parameters. After each iteration, the value of one parameter was changed and the cross validation was conducted again. This was repeated until all necessary combinations of input parameters had been used. The number of iterations differed between algorithms. For example, Naïve Bayes required only two iterations because it has a single input parameter (the use of Laplace correction), while the deep learning algorithm was run several hundred times because it has dozens of parameters. FDT has two input parameters, α and β. We analyzed α from 0.0 to 0.2 (in steps of 0.001) and β from 0.75 to 1.0 (in steps of 0.001). For each combination of these parameters, leave-one-out cross validation was performed, and the configuration with the best classification accuracy was selected and shown in the comparison (Figure 4). The parameters of every other algorithm included in the comparison were estimated in a similar way.
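The grid over α and β can be enumerated as follows. The sketch only builds the grid and selects the best point; `toy_score` is a placeholder with an arbitrarily chosen peak and does not represent the leave-one-out accuracy of any real FDT.

```python
from itertools import product
from typing import Callable, List, Tuple


def parameter_grid() -> List[Tuple[float, float]]:
    """All (alpha, beta) pairs with step 0.001, built from integers to avoid float drift."""
    alphas = [i / 1000 for i in range(0, 201)]      # 0.000, 0.001, ..., 0.200
    betas = [i / 1000 for i in range(750, 1001)]    # 0.750, 0.751, ..., 1.000
    return list(product(alphas, betas))


def select_best(grid: List[Tuple[float, float]],
                evaluate: Callable[[float, float], float]) -> Tuple[float, float]:
    """Return the configuration with the highest score."""
    return max(grid, key=lambda pair: evaluate(*pair))


def toy_score(alpha: float, beta: float) -> float:
    # Placeholder only: in the real experiment this would be the accuracy
    # obtained by leave-one-out cross validation for the given parameters.
    return -abs(alpha - 0.005) - abs(beta - 0.95)


grid = parameter_grid()
best_alpha, best_beta = select_best(grid, toy_score)
```

The grid contains 201 × 251 = 50,451 configurations, each of which requires a full leave-one-out run, which explains why the search is costly even for a model with only two parameters.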
The experiments were carried out for crisp and fuzzy classifiers. The crisp classifiers included in this comparison are Naïve Bayes (NB) [13], Decision Tree (C4.5) [3], Artificial Neural Network (ANN) [16], k-nearest neighbors (kNN) [52], Linear Regression (LR) [19], Support Vector Machine (SVM) [52], and Random Forest (RF) [53]. The experiments with crisp classifiers were performed in RapidMiner. The fuzzy classifiers analyzed in this paper use a data preprocessing step called fuzzification. The fuzzification model was created from the training data only. Later, this model was used to fuzzify the testing data. This allows an unbiased comparison of fuzzy and crisp classifiers. The comparative analysis included the following fuzzy classifiers: Fuzzy Naïve Bayes classifier (FNB) [53], Fuzzy Multi-Layer Perceptron (FMLP) [54], FDT according to the algorithm in [46], and Fuzzy Random Forest (FRF) [54]. The last two algorithms were proposed by the authors of this paper. The fuzzy classifiers were implemented in MATLAB.
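The separation between fitting the fuzzification model and applying it can be sketched as below. The two-term linear fuzzifier, its parameters, and the sample values are assumptions for illustration only; the paper’s fuzzification procedure is not specified in this section.

```python
from typing import Sequence, Tuple


def fit_fuzzifier(train_values: Sequence[float]) -> Tuple[float, float]:
    """Learn the fuzzifier parameters (here: value range) from training data only."""
    return min(train_values), max(train_values)


def fuzzify(value: float, params: Tuple[float, float]) -> Tuple[float, float]:
    """Return membership degrees in the fuzzy terms ('low', 'high')."""
    low, high = params
    if high == low:
        return 0.5, 0.5
    # Values outside the training range are clipped rather than refitted.
    position = min(1.0, max(0.0, (value - low) / (high - low)))
    return 1.0 - position, position


train_values = [20.0, 40.0, 60.0]      # invented training attribute values
params = fit_fuzzifier(train_values)   # fitted once, without the test instance

inside = fuzzify(30.0, params)         # test value within the training range
outside = fuzzify(70.0, params)        # test value beyond the training range
```

Because `params` is fixed before any test instance is seen, the held-out instance cannot influence its own fuzzification.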
The evaluation metrics were computed for each chosen classifier. The results are shown in Table 1. The best values are marked in bold. The upper part of the table reports the prediction results of the crisp classifiers, while the lower part shows the results of the four fuzzy classifiers.
The analysis indicates the efficiency of the fuzzy-based method for survival prediction of COVID-19 patients. This is illustrated by the comparison of crisp and fuzzy classifiers built on the same initial data. According to Table 1, the method based on FDT has the best classification accuracy (0.867), specificity (0.563), and precision and F1-score (both 0.921), and it is also the best according to most of the other metrics. Note that the efficiency of classification should be judged according to all of the metrics in Table 1. For example, k-NN has some zero metrics, which were computed for the k-NN configuration with the best accuracy. In this case, k-NN tends to classify almost all instances into one class, as evidenced by Spec = 0 and Sens = 0.989. Therefore, the metrics YJs and NPV are zero.
Moreover, in the case of FDT, the decision-making process can be analyzed easily. This can be very important, especially in medicine, where doctors cannot base decisions on black boxes. The FDT allows visual analysis of the obtained results, so it is possible to see what the decision model is doing during classification. Therefore, FDT, with this accuracy and its interpretability, can be considered a very strong tool in survival prediction. Note also that deep learning is affected by many input parameters (number of hidden layers, neurons per layer, activation functions, learning rate, optimizers, etc.). Finding ideal values of these parameters is not a simple process.

6. Discussion

This paper proposes a new fuzzy method for the survival prediction of COVID-19 positive patients. The method extends conventional classification with a data preprocessing procedure followed by fuzzy classification (Figure 1). The preprocessing transforms the initial data into fuzzy attributes for the classification; thus, it is possible to take into account the possible (hidden) ambiguity of the initial data. The classification of these attributes is implemented by a fuzzy classifier. Since one of the conditions for the developed method was good interpretability of the result, we used decision tree based classifiers, in particular, FDT and FRF. These classifiers are used in decision-making of timing of tracheostomy in COVID-19 patients. We created the fuzzy classifiers based on CMI. The CMI based method of FDT induction has been considered in detail in [46].
We compared the induced classifiers (FDT and FRF) with other well-known crisp and fuzzy classifiers (Table 1). We used a dataset collected from real records of COVID-19 patients treated in the ICU at Guy’s and St Thomas’ National Health Service (NHS) Foundation Trust [38]. This dataset has 177 records of COVID-19 patients and 29 input attributes. It can be concluded that the FDT is the best according to the set of metrics for assessing classifiers (Table 1). Note that the considered fuzzy classifiers (FNB, FMLP, FDT, and FRF) achieve the best classification and prediction results in comparison with the crisp classifiers according to the metrics in Table 1. This allows us to conclude that introducing the preprocessing (fuzzification) of the initial data and using a fuzzy classifier improves the prediction of the timing of tracheostomy in COVID-19 patients.
FDT, being decision tree based, offers good verbalization and interpretation in comparison with, for example, a neural network [44,45]. A decision tree is a hierarchical structure that reflects the options for making decisions in each specific situation, as well as the possible results of each action. This approach is especially useful when a series of sequential decisions must be made, or when multiple outcomes may arise at each stage of the decision-making process. FDT and decision trees allow analyzing the root cause of each outcome and tracing the path from the end state back to the initial state, in chronological order and with the relationship of events. In addition, the FDT is easily transformed into decision rules for the elaboration of decision-making of timing of tracheostomy in COVID-19 patients [47]. The common structure of a decision rule is If <condition> Then <consequent>. The number of decision rules equals the number of FDT leaves, so each leaf of the FDT corresponds to one decision rule. One possible transformation of an FDT into fuzzy decision rules has been considered in detail in [47].
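The correspondence between leaves and rules can be sketched with a small tree. The nested-dictionary representation, the attribute names `A1` and `A2`, and the tree itself are hypothetical; this is neither the FDT in Figure 2 nor the transformation described in [47], and it omits the membership degrees a fuzzy rule would carry.

```python
from typing import Dict, List, Tuple


def extract_rules(node: Dict, conditions: Tuple[str, ...] = ()) -> List[str]:
    """Collect one If-Then rule per leaf by walking every root-to-leaf path."""
    if "leaf" in node:
        premise = " and ".join(conditions) if conditions else "true"
        return [f"If {premise} Then {node['leaf']}"]
    rules: List[str] = []
    for term, child in node["branches"].items():
        rules.extend(extract_rules(child, conditions + (f"{node['attribute']} is {term}",)))
    return rules


def count_leaves(node: Dict) -> int:
    """Number of leaves in the tree."""
    if "leaf" in node:
        return 1
    return sum(count_leaves(child) for child in node["branches"].values())


# Hypothetical tree over two fuzzified attributes with terms 'low' and 'high'.
toy_tree = {
    "attribute": "A1",
    "branches": {
        "low": {"leaf": "Survived"},
        "high": {
            "attribute": "A2",
            "branches": {"low": {"leaf": "Survived"}, "high": {"leaf": "Died"}},
        },
    },
}

rules = extract_rules(toy_tree)
```

The toy tree has three leaves and yields three rules, for instance `If A1 is high and A2 is high Then Died`.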
The presented result has been obtained for a specific problem and dataset. Therefore, the method should be validated on other datasets to confirm the benefit of using a fuzzy classifier instead of a crisp classifier in problems in which the initial data at first glance have no uncertainty.

7. Conclusions

The proposed method (Figure 1) has been validated in decision-making of timing of tracheostomy in COVID-19 patients. However, this method can be used in other decision-making problems where a prediction for a new situation must be obtained on the basis of collected data. The distinguishing feature of the proposed method is the use of a fuzzy classifier for any data, including crisp data. The experiments illustrate that adding fuzzification to the data preprocessing and using a fuzzy classifier increase the efficiency of the decision-making procedure: this follows from the comparison of the evaluation results for FDT and the decision tree induced by the C4.5 algorithm (see Table 1).
The application of this method to other decision-making problems in the healthcare domain will be implemented in future research.

Author Contributions

Conceptualization, J.R., E.Z. and V.L.; methodology, J.R., E.Z. and V.L.; software, J.R. and M.K.; validation, M.K. and D.M.; formal analysis, M.K.; investigation, J.R., E.Z., V.L., M.K., P.S. and D.M.; resources, P.S.; data curation, P.S. and D.M.; writing—original draft preparation, E.Z., P.S. and D.M.; writing—review and editing, M.K. and P.S.; visualization, J.R.; supervision, E.Z. and V.L.; project administration, J.R.; funding acquisition, V.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Slovak research and development agency, grant number APVV PP-COVID-20-0013 “Development of methods of healthcare system risk and reliability evaluation under coronavirus outbreak” and the integrated infrastructure operational program co-financed by the European regional development fund, grant number IMTS: 313011V446, “Integrative strategy in development of personalized medicine of selected malignant tumours and its impact on quality of life”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

In the experimental investigation the data from the paper A. Takhar, P. Surda, I. Ahmad, et al., “Timing of Tracheostomy for Prolonged Respiratory Wean in Critically Ill Coronavirus Disease 2019 Patients: A Machine Learning Approach,” Critical Care Explorations, vol. 2, no. 1, e0279, November 2020, doi: 10.1097/CCE.0000000000000279 has been used.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Parimbelli, E.; Marini, S.; Sacchi, L.; Bellazzi, R. Patient similarity for precision medicine: A systematic review. J. Biomed. Inform. 2018, 83, 87–96.
  2. Ye, J.; Yao, L.; Shen, J.; Janarthanam, R.; Luo, Y. Predicting mortality in critically ill patients with diabetes using machine learning and clinical notes. BMC Med. Inform. Decis. Mak. 2020, 20, 295.
  3. Feng, Y.N.; Xu, Z.H.; Liu, J.T.; Sun, X.L.; Wang, D.Q.; Yu, Y. Intelligent prediction of RBC demand in trauma patients using decision tree methods. Mil. Med. Res. 2021, 8, 33.
  4. Arab, S.; Rezaee, K.; Moghaddam, G. A novel fuzzy expert system design to assist with peptic ulcer disease diagnosis. Cogent Eng. 2021, 8, 1861730.
  5. Lee, T.C.; Shah, N.U.; Haack, A.; Baxter, S.L. Clinical Implementation of Predictive Models Embedded within Electronic Health Record Systems: A Systematic Review. Informatics 2020, 7, 25.
  6. Latif, J.; Xiao, C.; Tu, S.; Rehman, S.U.; Imran, A.; Bilal, A. Implementation and Use of Disease Diagnosis Systems for Electronic Medical Records Based on Machine Learning: A Complete Review. IEEE Access 2020, 8, 150489–150513.
  7. Rostamzadeh, N.; Abdullah, S.S.; Sedig, K.; Garg, A.X.; McArthur, E. VERONICA: Visual Analytics for Identifying Feature Groups in Disease Classification. Information 2021, 12, 344.
  8. Suri, J.S.; Puvvula, A.; Biswas, M.; Majhail, M.; Saba, L.; Faa, G.; Singh, I.M.; Oberleitner, R.; Turk, M.; Chadha, P.S.; et al. COVID-19 pathways for brain and heart injury in comorbidity patients: A role of medical imaging and artificial intelligence-based COVID severity classification: A review. Comput. Biol. Med. 2020, 124, 103960.
  9. Li, H.; Wu, T.T.; Yang, D.L.; Guo, Y.S.; Liu, P.C.; Chen, Y.; Xiao, L.P. Decision tree model for predicting in-hospital cardiac arrest among patients admitted with acute coronary syndrome. Clin. Cardiol. 2019, 42, 1087–1093.
  10. Yu, G.; Chen, Z.; Wu, J.; Tan, Y. Medical decision support system for cancer treatment in precision medicine in developing countries. Expert Syst. Appl. 2021, 186, 115725.
  11. Isci, S.; Kalender, D.S.Y.; Bayraktar, F.; Yaman, A. Machine Learning Models for Classification of Cushing’s Syndrome Using Retrospective Data. IEEE J. Biomed. Health Inform. 2021, 25, 3153–3162.
  12. Arora, P.; Boyne, D.; Slater, J.; Gupta, A.; Brenner, D.R.; Druzdzel, M.J. Bayesian Networks for Risk Prediction Using Real-World Data: A Tool for Precision Medicine. Value Health 2019, 22, 439–445.
  13. Sweetlin, E.J.; Ponraj, D.N. Classification of metabric clinical dataset using Naive Bayes classifier. Int. J. Innov. Technol. Explor. Eng. 2019, 8, 4834–4837.
  14. Li, J.; Tian, Y.; Zhu, Y.; Zhou, T.; Li, J.; Ding, K.; Li, J. A multicenter random forest model for effective prognosis prediction in collaborative clinical research network. Artif. Intell. Med. 2020, 103, 101814.
  15. Govindarajan, P.; Soundarapandian, R.K.; Gandomi, A.H.; Patan, R.; Jayaraman, P.; Manikandan, R. Classification of stroke disease using machine learning algorithms. Neural Comput. Appl. 2020, 32, 817–828.
  16. Nikkonen, S.; Korkalainen, H.; Leino, A.; Myllymaa, S.; Duce, B.; Leppanen, T.; Toyras, J. Automatic Respiratory Event Scoring in Obstructive Sleep Apnea Using a Long Short-Term Memory Neural Network. IEEE J. Biomed. Health Inform. 2021, 25, 2917–2927.
  17. Dey, S.; Bhattacharya, R.; Malakar, S.; Mirjalili, S.; Sarkar, R. Choquet fuzzy integral-based classifier ensemble technique for COVID-19 detection. Comput. Biol. Med. 2021, 1, 104585.
  18. Shaban, W.M.; Rabie, A.H.; Saleh, A.I.; Abo-Elsoud, M.A. A new COVID-19 Patients Detection Strategy (CPDS) based on hybrid feature selection and enhanced KNN classifier. Knowl. Based Syst. 2020, 205, 106270.
  19. Dubey, A.K.; Narang, S.; Kumar, A.; Sasubilli, S.M.; García-Díaz, V. Performance Estimation of Machine Learning Algorithms in the Factor Analysis of COVID-19 Dataset. Comput. Mater. Contin. 2021, 66, 1921–1936.
  20. Meyer, A.; Cypko, M.A.; Eickhoff, C.; Falk, V.; Emmert, M.Y. Artificial intelligence-assisted care in medicine: A revolution or yet another blunt weapon? Potentials, challenges, and the future of implementing artificial intelligence (AI) for clinical care. Eur. Heart J. 2019, 40, 3286–3289.
  21. Khan, S.H.; Sohail, A.; Khan, A.; Hassan, M.; Lee, Y.S.; Alam, J.; Basit, A.; Zubair, S. COVID-19 detection in chest X-ray images using deep boosted hybrid learning. Comput. Biol. Med. 2021, 137, 104816.
  22. Tavolara, T.E.; Gurcan, M.N.; Segal, S.; Niazi, M.K.K. Identification of difficult to intubate patients from frontal face images using an ensemble of deep learning models. Comput. Biol. Med. 2021, 136, 104737.
  23. Yap, M.H.; Hachiuma, R.; Alavi, A.; Brüngel, R.; Cassidy, B.; Goyal, M.; Zhu, H.; Rückert, J.; Olshansky, M.; Huang, X.; et al. Deep learning in diabetic foot ulcers detection: A comprehensive evaluation. Comput. Biol. Med. 2021, 135, 104596.
  24. Abdulrahman, S.A.; Khalifa, W.; Roushdy, M.; Salem, A.-B.M. Comparative study for 8 computational intelligence algorithms for human identification. Comput. Sci. Rev. 2020, 36, 100237.
  25. Alabi, R.O.; Youssef, O.; Pirinen, M.; Elmusrati, M.; Mäkitie, A.A.; Leivo, I.; Almangush, A. Machine learning in oral squamous cell carcinoma: Current status, clinical concerns and prospects for future—A systematic review. Artif. Intell. Med. 2021, 115, 102060.
  26. Sosnowski, Z.A.; Gadomer, L. Fuzzy trees and forests—Review. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2019, 6, 1316.
  27. Codish, S.; Shiffman, R. A model of ambiguity and vagueness in clinical practice guideline recommendations. In Proceedings of the American Medical Informatics Association Annual Symposium, Washington, DC, USA, 22–26 October 2005; pp. 146–150.
  28. Lukersmith, S.; Taylor, J.; Salvador-Carulla, L. Vagueness and Ambiguity in Communication of Case Management: A Content Analysis in the Australian National Disability Insurance Scheme. Int. J. Integr. Care 2021, 21, 17.
  29. Hofmann, B. Vagueness in Medicine: On Disciplinary Indistinctness, Fuzzy Phenomena, Vague Concepts, Uncertain Knowledge, and Fact-Value-Interaction. Axiomathes 2021, 31, 1–18.
  30. Chorev, N.E. Data ambiguity and clinical decision making: A qualitative case study of the use of predictive information technologies in a personalized cancer clinical trial. Health Inform. J. 2019, 25, 500–510.
  31. Seising, R. From vagueness in medical thought to the foundations of fuzzy reasoning in medical diagnosis. Artif. Intell. Med. 2006, 38, 237–256.
  32. Bocklisch, F.; Bocklisch, S.F.; Beggiato, M.; Krems, J.F. Adaptive fuzzy pattern classification for the online detection of driver lane change intention. Neurocomputing 2017, 262, 148–158.
  33. Jimenez, F.; Martinez, C.; Marzano, E.; Palma, J.; Sanchez, G.; Sciavicco, G. Multiobjective Evolutionary Feature Selection for Fuzzy Classification. IEEE Trans. Fuzzy Syst. 2019, 27, 1085–1099.
  34. Lee, E.H.; Zheng, J.; Colak, E.; Mohammadzadeh, M.; Houshmand, G.; Bevins, N.; Kitamura, F.; Altinmakas, E.; Reis, E.P.; Kim, J.-K.; et al. Deep COVID DeteCT: An international experience on COVID-19 lung detection and prognosis using chest CT. NPJ Digit. Med. 2021, 4, 11.
  35. Jin, W.; Dong, S.; Dong, C.; Ye, X. Hybrid ensemble model for differential diagnosis between COVID-19 and common viral pneumonia by chest X-ray radiograph. Comput. Biol. Med. 2021, 131, 104252.
  36. Brunese, L.; Mercaldo, F.; Reginelli, A.; Santone, A. Explainable Deep Learning for Pulmonary Disease and Coronavirus COVID-19 Detection from X-rays. Comput. Methods Programs Biomed. 2020, 196, 105608.
  37. Marques, G.; Agarwal, D.; Diez, I. Automated Medical Diagnosis of COVID-19 through EfficientNet Convolutional Neural Network. Appl. Soft Comput. 2020, 96, 106691.
  38. Takhar, A.; Surda, P.; Ahmad, I.; Amin, N.; Arora, A.; Camporota, L.; Denniston, P.; El-Boghdadly, K.; Kvassay, M.; Macekova, D.; et al. Timing of Tracheostomy for Prolonged Respiratory Wean in Critically Ill Coronavirus Disease 2019 Patients: A Machine Learning Approach. Crit. Care Explor. 2020, 2, e0279.
  39. Rabcan, J.; Levashenko, V.; Zaitseva, E.; Kvassay, M. Review of Methods for EEG Signal Classification and Development of New Fuzzy Classification-Based Approach. IEEE Access 2020, 8, 189720–189734.
  40. Batur Sir, G.D.; Sir, E. Pain Treatment Evaluation in COVID-19 Patients with Hesitant Fuzzy Linguistic Multicriteria Decision-Making. J. Healthc. Eng. 2021, 2021, 8831114.
  41. Levashenko, V.; Rabcan, J.; Zaitseva, E. Reliability Evaluation of the Factors That Influenced COVID-19 Patients’ Condition. Appl. Sci. 2021, 11, 2589.
  42. Shaban, W.M.; Rabie, A.H.; Saleh, A.I.; Abo-Elsoud, M.A. Detecting COVID-19 Patients based on Fuzzy Inference Engine and Deep Neural Network. Appl. Soft Comput. 2021, 99, 106906.
  43. Pedrycz, W. Knowledge-Based Clustering: From Data to Information Granules; Wiley: Hoboken, NJ, USA, 2005.
  44. Olaru, C.; Wehenkel, L. A complete fuzzy decision tree technique. Fuzzy Sets Syst. 2003, 138, 221–254.
  45. Jin, C.; Li, F.; Li, Y. A generalized fuzzy ID3 algorithm using generalized information entropy. Knowl.-Based Syst. 2014, 64, 13–21.
  46. Androulidakis, I.; Levashenko, V.; Zaitseva, E. An Empirical Study on Green Practices of Mobile Phone Users. Wirel. Netw. 2016, 22, 2203–2220.
  47. Zaitseva, E.; Levashenko, V. Construction of a Reliability Structure Function based on Uncertain data. IEEE Tran Reliab. 2016, 65, 1710–1723.
  48. Bonissone, P.; Cadenas, J.M.; Garrido, M.C.; Díaz-Valladares, R.A. A Fuzzy Random Forest. Int. J. Approx. Reason. 2010, 51, 729–747.
  49. Kaur, A.; Kaur, I. An Empirical Evaluation of Classification Algorithms for Fault Prediction in Open Source Projects. J. King Saud Univ.-Comput. Inf. Sci. 2018, 30, 2–17.
  50. Eusebi, P. Diagnostic Accuracy Measures. Cerebrovasc. Dis. 2013, 36, 267–272.
  51. Glas, A.S.; Lijmer, J.G.; Prins, M.H.; Bonsel, G.J.; Bossuyt, P.M.M. The diagnostic odds ratio: A single indicator of test performance. J. Clin. Epidemiol. 2003, 56, 1129–1135.
  52. Niwariya, M.; Rajput, A.; Jaloree, S. Data Mining Approach for Diabetes Prediction using BPSO, SVM, KNN and naïve Bayes classifiers. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 286–293.
  53. Bustamante, C.; Garrido, L.; Soto, R. Comparing Fuzzy Naive Bayes and Gaussian Naive Bayes for Decision Making in RoboCup 3D. In Proceedings of the Mexican International Conference on Artificial Intelligence, Apizaco, Mexico, 13–17 November 2006; pp. 237–247.
  54. De Carvalho, L.; Nassar, S.M.; De Azevedo, F.M. A Neuro-Fuzzy System to Support in the Diagnostic of Epileptic Events and Non-Epileptic Events Using Different Fuzzy Arithmetical Operations. Arq. Neuro-Psiquiatr. 2008, 66, 179–183.
Figure 1. The principal steps of fuzzy classifier induction for decision-making procedure for survival prediction of COVID-19 patients.
Figure 2. The FDT for survival prediction of COVID-19 patients. The threshold parameters α and β are defined as α = 0.005 and β = 0.950.
Figure 3. Strategy of classification (prediction) by FRF.
Figure 4. The dependency between values of pruning parameters α and β and accuracy of the classification.
Table 1. The comparison of different prediction (classification) algorithms.
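| Type | Classifier | Acc | Spec | Sens | Bacc | Prec | F1 Score | MCC | YJs | NPV | DOR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Crisp | NB | 0.816 | 0.143 | 0.921 | 0.532 | 0.872 | 0.896 | 0.078 | 0.064 | 0.222 | 1.952 |
| Crisp | C4.5 | 0.835 | 0.357 | 0.91 | 0.634 | 0.9 | 0.905 | 0.276 | 0.267 | 0.385 | 1.416 |
| Crisp | ANN | 0.845 | 0.357 | 0.921 | 0.639 | 0.901 | 0.911 | 0.297 | 0.278 | 0.278 | 6.508 |
| Crisp | k-NN | 0.854 | 0.000 | 0.989 | 0.494 | 0.863 | 0.921 | −0.039 | 0.000 | 0.000 | 5.857 |
| Crisp | LR | 0.835 | 0.357 | 0.91 | 0.634 | 0.9 | 0.905 | 0.276 | 0.267 | 0.385 | 1.416 |
| Crisp | SVM | 0.853 | 0.313 | 0.953 | 0.633 | 0.882 | 0.916 | 0.341 | 0.266 | 0.556 | 9.318 |
| Crisp | RF | 0.837 | 0.333 | 0.921 | 0.627 | 0.891 | 0.891 | 0.280 | 0.255 | 0.417 | 5.857 |
| Fuzzy | FNB | 0.816 | 0.143 | 0.921 | 0.532 | 0.872 | 0.896 | 0.078 | 0.064 | 0.222 | 1.952 |
| Fuzzy | FMLP | 0.857 | 0.563 | 0.91 | 0.736 | 0.92 | 0.915 | 0.461 | 0.473 | 0.529 | 13.018 |
| Fuzzy | FDT | 0.867 | 0.563 | 0.921 | 0.867 | 0.921 | 0.921 | 0.484 | 0.484 | 0.563 | 15.061 |
| Fuzzy | FRF | 0.848 | 0.563 | 0.899 | 0.731 | 0.92 | 0.909 | 0.44 | 0.461 | 0.5 | 11.429 |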
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Rabcan, J.; Zaitseva, E.; Levashenko, V.; Kvassay, M.; Surda, P.; Macekova, D. Fuzzy Decision Tree Based Method in Decision-Making of COVID-19 Patients’ Treatment. Mathematics 2021, 9, 3282. https://doi.org/10.3390/math9243282

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.