Introduction

The arrival of COVID-19 shook the world. The virus was first detected in Wuhan, in the Hubei province of China, in December 2019, where it was initially noted as a cluster of pneumonia of unknown origin; it then spread rapidly and took the shape of a deadly pandemic. Transmission was initially assumed to occur only among animals, but human-to-human transmission was soon confirmed, and from then on social distancing became our new normal. The virus is similar in nature to the earlier SARS-CoV and MERS-CoV outbreaks, though SARS-CoV-2 spreads faster than either while having a lower mortality rate. The typical symptoms seen in patients affected by the virus are fever, cough, fatigue, sore throat, muscle pain, and shortness of breath.

Rapid testing has been recommended across all nations to curb the spread of the virus; the major aim is to separate the infected population from the susceptible population. Testing kits are the need of the hour, and one of the most common tests to detect the presence of nCoV is the reverse transcription–polymerase chain reaction (RT–PCR) test. However, the RT–PCR test is extremely costly, and many laboratories do not have access to the kits required to carry out the test; as a result, the pace of rapid testing is slowing. X-ray machines, on the other hand, are widely available across India, so X-ray imaging can be a potential solution for rapid testing if merged with our proposed model.

In this paper, we present an ensemble-based transfer learning model to classify X-ray images into 2 major classes: COVID-19 positive and COVID-19 negative. Patches start to appear 0 to 2 days from the onset of the disease, and heavier patches develop with the passing of time. We have proposed a Euclidean average weights method to initialize the weights of the pre-trained models, viz. ResNet50, VGG16, VGG19, Xception, and InceptionV3, while building the ensemble model so that it classifies the X-ray images with greater accuracy. Initially, all the models are given the same importance and are assigned \((\mathrm{1,1},\mathrm{1,1},1)\) in the vector space for the ensemble. Minimizing the false-negative count was one of the major aims while designing the classifier, and the confusion matrix shows that the proposed ensemble model does this well. The pre-trained models were chosen to have lower numbers of parameters, keeping the accuracy factor in mind, to make our model computationally fast and economical.

The images passed through the classifier have been heavily pre-processed to provide the best image quality and maximize prediction accuracy. First, all the images were scaled to \((224 \times 224)\), as the pre-trained models are trained on inputs of that dimension. Training data were scarce, the data set consisting of around 784 images; we therefore augmented the data using the DCGAN architecture. The new synthetic images created by the DCGAN were subsequently added as training images, diminishing the class-imbalance problem for the classifier. Pre-processing continued with noise removal using a Gaussian filter model, followed by shadow removal and image enhancement using Canny edge detection and histogram equalization. The lung portion is then segmented to analyze the match with higher sensitivity.

Along with that, we have also presented a symptom analysis model. In this section the symptoms of the patients, viz. fever, tiredness, dry cough, sore throat, no symptoms, muscle pain, age, nasal congestion, running nose, and gender, are taken into account to classify them into low, medium, and high severity. The classifier is optimized using the genetic algorithm to get the fittest offspring from a series of generations; the optimized model showed better results than the unoptimized one. We iterated for 10 generations, with 20 offspring per generation taken into account for measuring the fitness factor. The proposed model can yield a groundbreaking result, as we can easily predict the severity of patients from the symptoms they show across the day.

The novelty of the paper lies in its bi-fold modeling. On one hand, we have proposed a novel ensemble-based transfer learning method whose architecture surpasses the traditional models on accuracy metrics. On the other hand, we have proposed a classification model that can estimate the severity of a patient from the onset of symptoms; this classifier has additionally been optimized using the genetic algorithm to tune its accuracy.

The paper is structured as follows: "Literature Review" contains the literature review of the work, followed by the ensemble-based proposed model in "Proposed Approach of Ensemble-Based Transfer Learning Model Using Euclidean Weighted Average"; symptom classification analysis using the genetic algorithm is proposed in "Symptom Analysis Using Hybrid Approach of Genetic Algorithm and Classifying Algorithm"; and finally concluding remarks and the future scope of study are given in "Concluding Remark and Future Scope of Study".

Literature Review

A significant amount of work has been carried out in this field to help the frontline workers to some extent. Radiologists reported major findings: Kong et al. [1] observed opaque inferior airspaces in lung CT images. Yoon et al. [2] observed nodular opacity in the left lower lung region on CT images. Vascular dilation and several patches, with some irregular opacities in the lung region, were seen by Zhao et al. [3]. Li and Xia [4] observed GGO and patches across the lungs of COVID-19 affected patients and also reported signs of air bronchogram; vascular expansion was likewise seen in CT images and became a common trait among patients. Zhu et al. [5] reported that approximately 33% of chest CT scans show rounded patches. In one work, a convolutional neural network (CNN) model was developed to classify CT images as COVID-19 positive or negative [6]. Another model was also created to classify images and achieved an accuracy of around 89% [7]. Recent works have also classified images using CNN architectures: Hemdan et al. [8] proposed a CNN architecture consisting of 8 convolution layers, and Wang and Wong [9] proposed another CNN model that yielded an accuracy of around 92.54% in classifying 3 classes of pneumonia, viz. none, non-COVID patches, and COVID patches. Ioannis et al. [2] proposed a model trained on 224 images and obtained an accuracy of around 93.48% on their test data set. Sethy and Behera [10] proposed a method to classify images using the SVM algorithm. Apart from these, there are several deep learning works that propose models using various transfer learning methods [11,12,13,14,15,16,17].

Feature engineering is an important aspect of machine learning and artificial intelligence. Selecting a subset of the original feature set to increase performance is the key factor, and the genetic algorithm helps to select the best attributes with high accuracy [18]. In a genetic algorithm, each chromosome is considered a solution; the algorithm operates on a population and selects the best possible genes from it. John et al. [19] applied the approach in the medical field for optimizing chemotherapy and illustrated how an artificial immune system can be implemented. Many features may not be highly correlated with the target, so the selected feature set is not fixed; feature selection using this algorithm therefore plays an important role and gets preference over sequential forward and backward selection. H. Kim et al. [20] used it for non-linear optimization problems. Feature selection can highly improve a model [21, 22], and the removal of redundant features also helps to enhance the accuracy of a classification model. Hussein et al. illustrated feature weighting and selection in their work and applied it to character and pattern recognition [23]. Yang et al. [24] tested the feasibility of this algorithm in a neural network to improve accuracy. Saidi et al. noted that when the data are too large, it is very difficult to apply this algorithm directly; for big data, parallel selection using the genetic algorithm along with map-reduce tools was therefore used to overcome the problem. Santosh [25] proposed artificial intelligence-based solutions for predicting different outbreaks of the epidemic across the world, which can help to identify COVID-19 outbreaks as well as forecast their nature of spread. Bhapkar et al. [26] proposed mathematical modeling to forecast the severity of the pandemic by predictive modeling of death tolls; as both the recovery and mortality rates change over time, the authors used progressive recovery and mortality rates in their predictive modeling. Dey et al. [27] presented a comparative study of the human mind under a prolonged lockdown period and how people reacted to such a time window. The results (obtained after analyzing collected tweets) showed that under lockdown people paved the way for their passions at home, and with the lifting of the lockdown they slowly moved from their passions back to their professions; analyzing the emotions in the tweets showed most were neutral, with a good share in the worried category. Mukherjee et al. [28] proposed a CNN architecture that can be trained on both CT scan images and CXR images, giving the model much more training data to learn from; their tailored DNN yielded 96.28% accuracy with a negligible false-negative percentage. Santosh et al. [29] proposed a dynamic model that can take complex parameters into consideration while predicting the epidemic outcome, rather than relying only on SEIR modeling, where many parameters are overlooked; the author also focused on a data-driven model that can auto-tune parameter values from the most recent parametric values automatically. Das et al. [30] proposed a custom CNN, namely the truncated Inception net, to segregate COVID-19 positive CXRs from other non-COVID cases; the proposed architecture achieved an accuracy of 99.96% in classifying COVID-19 positive cases. Mukherjee et al. [31] proposed a light-weight CNN architecture to classify CXR samples into COVID positive cases; the architecture was designed with far fewer parameters than other models and achieved an accuracy of 99.69%.

Prior work on CNN classifier models has either stacked several convolution layers and built the model from scratch or used independent transfer learning models for classification, testing a few models, comparing their accuracies, and choosing the best. Most of this work is hampered by a lack of input images and class imbalance at the convolution layers; in our work, we overcame this problem by introducing the DCGAN architecture as a data augmentation tool. In addition, the models showed a tendency to overfit, and a single transfer learning method is simply not enough for reliable predictions in such a sensitive setting, where the major aim is to minimize the false-negative count. The world is going through a pandemic, and as the new strains of COVID-19 are still not fully understood, the search space is very difficult to characterize; here the genetic algorithm is an effective way to compute conditions based on different symptoms. By boosting and the grid search method, the best parameters are identified, and thus the hyperparameters of the classifier model are tuned. By applying GA operators to the conditions, the highly correlated factors, considered here as genes, are identified, and the classifier model is then developed to give the best accuracy and reduce the number of false-negative cases; unregulated false negatives could lead to the worst outcomes.

Proposed Approach of Ensemble-Based Transfer Learning Model Using Euclidean Weighted Average

In the current pandemic situation, radiological lung imaging of patients plays a key role as the initial screening test. X-ray prediction can be treated as a preliminary test for COVID-19, with patients undergoing the RT–PCR test only if the X-ray prediction flags them. The patches start developing from day 2 and become significant by day 7, as shown in Fig. 1a, b.

Fig. 1

a Lung CT image on day 2, showing a patchy pattern and ill-defined alveolar condition. b Lung CT image on day 7, showing notable patches

Data Set and Data Pre-processing

In this study, we have considered data sets taken from authentic open-source organizations; the images are shared as open source by various radiologists. One of the sources is Cohen JP, and the data set is also compiled from a few other sources: the Italian Society of Medical and Interventional Radiology (SIRM) COVID-19 DATABASE, the Novel Corona Virus 2019 Data set developed by Joseph Paul Cohen, and images extracted from different publications [32]. We have taken 784 training images and 278 validation images, and to knock out the class imbalance problem we have augmented images using the DCGAN architecture. All the training images are scaled to \((224 \times 224)\), as the pre-trained models are trained on inputs of that dimension [33, 34].
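
To make the rescaling step concrete, the following is a minimal sketch using OpenCV and NumPy; the directory layout and file pattern are assumptions for illustration, not the authors' actual pipeline.

```python
import cv2
import numpy as np
from pathlib import Path

def load_and_resize(image_dir, size=(224, 224)):
    """Load all images in image_dir and rescale them to the input
    dimensions expected by the ImageNet pre-trained backbones."""
    images = []
    for path in sorted(Path(image_dir).glob("*.png")):
        img = cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, size, interpolation=cv2.INTER_AREA)
        images.append(img.astype("float32") / 255.0)  # scale pixels to [0, 1]
    return np.stack(images)

# Hypothetical usage: X_train = load_and_resize("data/train")
```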

Noise Removal

The CT images came with some minimal noise, and to increase model accuracy we have tried to minimize it; this is one of the steps by which image quality is enhanced. We have used the average filtering method to reduce the noise. The transformation is shown in Fig. 2.

Fig. 2

CT image after noise removal
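
A minimal sketch of the average (box) filter with OpenCV; the 3 × 3 kernel size and the file name are assumptions, as the paper does not state them.

```python
import cv2

img = cv2.imread("ct_slice.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
# Box filter: each pixel becomes the mean of its 3 x 3 neighbourhood,
# suppressing high-frequency acquisition noise at the cost of slight blur.
denoised = cv2.blur(img, (3, 3))
cv2.imwrite("ct_slice_denoised.png", denoised)
```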

Shadow Removal

The images often contain shadows due to the position and availability of illumination. In this step, we have tried to segregate the shadow component from the original images using Canny edge detection (Fig. 3) [13, 35].

Fig. 3

CT image after shadow removal
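
The edge-detection step can be sketched as follows; the hysteresis thresholds are assumptions, and since the paper does not detail how the edge map is then used to segregate the shadow component, the sketch stops at edge detection.

```python
import cv2

img = cv2.imread("ct_slice_denoised.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
# Canny produces a binary edge map; the two thresholds drive the
# hysteresis step that keeps weak edges connected to strong ones.
edges = cv2.Canny(img, threshold1=50, threshold2=150)
```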

Image Enhancement

In this pre-processing step, we aim to adjust and tune the color features of the image. Here we have used an adaptive histogram equalizer to enhance the picture and contrast quality of the image by processing the histogram curve (Fig. 4).

Fig. 4

CT image after applying histogram equalizer
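
A sketch of adaptive histogram equalization using OpenCV's CLAHE; the clip limit and tile size are assumed values, not the paper's settings.

```python
import cv2

img = cv2.imread("ct_slice.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
# CLAHE equalizes the histogram tile by tile; clipLimit caps local
# contrast amplification so noise is not over-boosted.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)
```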

Data Augmentation Using DCGAN Architecture

In the current work, the situation is extremely dynamic, changing every day, and the data set is quite small for accurate prediction of the CT image class. Thus, we have augmented the data to minimize the class-imbalance problem and reduce model overfitting.

GAN works on the zero-sum principle and has 2 blocks to generate the synthetic image namely generator and discriminator.

  • Generator: the generator, or generative neural network, is mainly responsible for creating synthetic images that go undetected. It generates images without being trained on the features of the input data set, i.e., without learning the semantics of the input images. The loss term for the generator is given in the following equation:

    $$ E_{{z \sim p_{z} \left( z \right) }} [\log \left( {1 - D\left( {G\left( z \right)} \right)} \right)] . $$
    (1)
  • Discriminator: the discriminator neural network learns to classify whether a given sample comes from the same data distribution or not (Fig. 5). The major goal of the discriminator network is to detect fake content in the set: it is a classifier network that decides whether an image is real or not. The loss term for the discriminator is given in the following equation:

    $$ E_{{x \sim P_{r} \left( x \right) }} [\log D\left( x \right)]. $$
    (2)
Fig. 5

DCGAN architecture [12]

The combined loss function for the entire model is shown in Eq. 3; at the optimal discriminator it simplifies through Eq. 4 to Eq. 5:

$$ L\left( {G, D_{o} \left( x \right)} \right) = \mathop \smallint \limits_{x}^{.} (P_{r} \left( x \right)\log \left( {D_{o} \left( x \right)} \right) + P_{g} \left( x \right)\log \left( {1 - D_{o} \left( x \right)} \right)) {\text{d}}x, $$
(3)
$$ L\left( {G, D_{o} \left( x \right)} \right) = \log \frac{1}{2} \mathop \smallint \limits_{x}^{.} P_{r} \left( x \right) {\text{d}}x + \log \frac{1}{2} \mathop \smallint \limits_{x}^{.} P_{g} \left( x \right) {\text{d}}x, $$
(4)
$$ L\left( {G, D_{o} \left( x \right)} \right) = - 2\log 2. $$
(5)
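
As a concrete reference for the generator/discriminator pairing described above, here is a minimal Keras sketch. The layer counts, filter sizes, latent dimension, and the 28 × 28 output resolution are illustrative assumptions; the paper's synthetic images ultimately target the classifier's \((224 \times 224)\) input.

```python
from tensorflow.keras import layers, models

LATENT_DIM = 100  # assumption; a typical DCGAN choice

def build_generator():
    # Project the noise vector, then upsample with strided transposed
    # convolutions; tanh keeps pixel outputs in [-1, 1].
    return models.Sequential([
        layers.Dense(7 * 7 * 128, input_dim=LATENT_DIM),
        layers.LeakyReLU(0.2),
        layers.Reshape((7, 7, 128)),
        layers.Conv2DTranspose(64, 4, strides=2, padding="same"),
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Conv2DTranspose(1, 4, strides=2, padding="same",
                               activation="tanh"),  # 28x28x1 synthetic image
    ])

def build_discriminator():
    # Mirror of the generator: strided convolutions downsample, and a
    # sigmoid scores the sample as real (1) or synthetic (0).
    return models.Sequential([
        layers.Conv2D(64, 4, strides=2, padding="same",
                      input_shape=(28, 28, 1)),
        layers.LeakyReLU(0.2),
        layers.Conv2D(128, 4, strides=2, padding="same"),
        layers.LeakyReLU(0.2),
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),
    ])
```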

The images formed using DCGAN architecture are shown in Table 1.

Table 1 Synthetic CT lung images created using the DCGAN architecture for 50 and 300 epochs.

Proposed Algorithm

The pre-trained state-of-the-art CNN classifier models were first tested on classifying the images into 2 major classes, namely COVID and non-COVID. The pre-trained models used were ResNet50, VGG16, VGG19, Xception, and InceptionV3. A combined model generally gives an upper hand in terms of accuracy and prediction; thus, in this section we propose a weighted Euclidean average method (WEAM) for higher model accuracy.

At first, all the models are used independently, and afterward they are combined into an ensemble model, which is expected to provide a more robust prediction. We chose the ResNet50 model as it is one of the contemporary models offering high accuracy with fewer parameters; this model in particular is easy to train and converges fast. The other architectures chosen here give high accuracy with a considerably low number of parameters, making the ensemble faster. The InceptionV3 model has 11 inception layers in its architecture.

The salient feature of our proposed model is the weighted average network, whose governing equation is based on the weighted Euclidean average method. Say our VGG19 model works better than the other models, i.e., it has a lower validation error than the rest; this implies that it assigned weights to the classes better than the other models did [36].

Assume the validation accuracy of the \(i\)th model is \(q_{i}\).

Therefore, the validation accuracy error is \(\left( {100 - q_{i} } \right)\).

Thus, we define a new weighting factor:

$$ d_{i} = 100 - q_{i} . $$
(6)

All the models are initially assigned the value \(\left( {1,1,1,1,1} \right)\), signifying that all the models hold equal weightage and importance:

$$ D = \sum (d_{i}^{2} + 0.01d_{i + 1}^{2} + 0.01d_{i + 2}^{2} + 0.01d_{i + 3}^{2} + 0.01d_{i + 4}^{2} ), $$
(7)
$$ p_{i} = \frac{{d_{i}^{2} }}{D}. $$
(8)

Therefore, the weight for the \(i\)th network is given as

$$ t_{i} = 1 - \sqrt {\frac{{\left( {1 - d_{i}^{2} } \right)^{2} + \left( {1 - d_{i + 1}^{2} } \right)^{2} + \left( {1 - d_{i + 2}^{2} } \right)^{2} + \left( {1 - d_{i + 3}^{2} } \right)^{2} + \left( {1 - d_{i + 4}^{2} } \right)^{2} }}{5}} , $$
(9)
$$ w_{i} = \frac{{\frac{1}{{t_{i}^{2} }}}}{{\sum \frac{1}{{t_{i}^{2} }}}}. $$
(10)

Now, let the output probabilities for class 0 and class 1 take the form \(\left[ {x_{0} , x_{1} } \right]\), and let the predicted output probabilities of the 5 networks be \(\left[ {x_{01} , x_{11} } \right]\), \(\left[ {x_{02} , x_{12} } \right]\), \(\left[ {x_{03} , x_{13} } \right]\), \(\left[ {x_{04} , x_{14} } \right]\), \(\left[ {x_{05} , x_{15} } \right]\) for models 1, 2, 3, 4, 5, respectively.

Let the weights of the respective models be \(w_{1} , w_{2} ,w_{3} , w_{4} , w_{5}\), calculated from Eq. 10. The weighted average prediction \(A\) is then calculated using Eq. 11 [37]:

$$ A = \left[ \frac{{w_{1} x_{01} + w_{2} x_{02} + w_{3} x_{03} + w_{4} x_{04} + w_{5} x_{05} }}{{w_{1} + w_{2} + w_{3} + w_{4} + w_{5} }},\; \frac{{w_{1} x_{11} + w_{2} x_{12} + w_{3} x_{13} + w_{4} x_{14} + w_{5} x_{15} }}{{w_{1} + w_{2} + w_{3} + w_{4} + w_{5} }} \right]. $$
(11)
[Algorithm listing: ensemble classification using the Euclidean weighted average]
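
Since the algorithm listing does not reproduce here, the following NumPy sketch shows how Eqs. 6, 9, 10, and 11 combine the five networks' outputs. It assumes accuracies expressed as fractions in \((0, 1)\) and reads the cyclic indices of Eq. 9 as running over all five models; both conventions are assumptions where the paper's notation is ambiguous.

```python
import numpy as np

def weam_predict(val_acc, probs):
    """Euclidean weighted average ensemble (Eqs. 6, 9, 10, 11).
    val_acc: shape (5,) per-model validation accuracies in (0, 1);
    probs:   shape (5, 2) per-model [P(class 0), P(class 1)]."""
    d = 1.0 - np.asarray(val_acc, dtype=float)            # Eq. 6: validation error
    # Eq. 9: t_i from the squared deviations (1 - d_j^2)^2 of the five
    # networks, with indices i..i+4 taken cyclically.
    t = np.array([1.0 - np.sqrt(np.mean((1.0 - np.roll(d, -i) ** 2) ** 2))
                  for i in range(len(d))])
    w = (1.0 / t ** 2) / np.sum(1.0 / t ** 2)             # Eq. 10
    return w @ np.asarray(probs)                          # Eq. 11: per-class weighted mean

# Hypothetical accuracies and per-model class probabilities:
acc = [0.97, 0.95, 0.96, 0.94, 0.93]
p = [[0.10, 0.90], [0.20, 0.80], [0.15, 0.85], [0.30, 0.70], [0.25, 0.75]]
print(weam_predict(acc, p))  # ensembled [P(negative), P(positive)]
```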

Results and Discussion

All the models were trained for 50 epochs with an early stopping callback applied so that training could exit before the model overfits (patience = 8 epochs). We used the Adam optimizer for a faster rate of convergence, with parameters \(\alpha = 0.0001, \beta_{1} = 0.9, \beta_{2} = 0.889, \epsilon = 1 \times 10^{ - 8}\).
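
In Keras terms, the training configuration reads as follows; restoring the best weights on stop is our assumption, while the optimizer values match the stated hyperparameters.

```python
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

# Adam with the parameters quoted above (note the paper's beta_2 = 0.889).
optimizer = Adam(learning_rate=1e-4, beta_1=0.9, beta_2=0.889, epsilon=1e-8)

# Exit before overfitting: stop when validation loss stalls for 8 epochs.
early_stop = EarlyStopping(monitor="val_loss", patience=8,
                           restore_best_weights=True)

# model.compile(optimizer=optimizer, loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=50, callbacks=[early_stop])
```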

The training curves for the model are shown in Figs. 6, 7, 8, 9, 10, and 11, respectively [37, 38].

Fig. 6

Training loss curve for ResNet50 architecture

Fig. 7

Training loss curve for VGG16 architecture

Fig. 8

Training loss curve for VGG19 architecture

Fig. 9

Training loss curve for Xception architecture

Fig. 10

Training loss curve for InceptionV3 architecture

Fig. 11

Training loss curve for proposed architecture

Evaluation Metrics

The proposed model is evaluated using the standard evaluation metrics, namely classification accuracy, sensitivity, and F1 score, calculated using Eqs. 12, 13, and 14, respectively [39, 40]:

$$ {\text{Classification accuracy }} = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}}, $$
(12)
$$ {\text{Sensitivity}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}}, $$
(13)
$$ {\text{F1 Score}} = \frac{{2 \times {\text{sensitivity}} \times {\text{precision}}}}{{{\text{sensitivity}} + {\text{precision}}}}. $$
(14)

Here, TP represents true positives, TN true negatives, FN false negatives, and FP false positives. All the values are calculated from the confusion matrix of our proposed algorithm.
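
The metrics follow directly from the confusion-matrix counts; a small helper makes the arithmetic explicit (the counts in the usage line are hypothetical).

```python
def metrics_from_confusion(tp, tn, fp, fn):
    """Eqs. 12-14 computed from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)               # recall on the positive class
    precision = tp / (tp + fp)
    f1 = 2 * sensitivity * precision / (sensitivity + precision)
    return accuracy, sensitivity, f1

print(metrics_from_confusion(tp=130, tn=140, fp=4, fn=4))  # hypothetical counts
```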

The comparative results and metrics are shown in Table 2.

Table 2 Results of the pre-trained models and our proposed algorithm.

Symptom Analysis Using Hybrid Approach of Genetic Algorithm and Classifying Algorithm

In this section, we analyze the symptoms of nCoV-affected patients using a hybrid model of the genetic algorithm and classifier algorithms. The aim is to find the level of severity among the patients. In the data set, collected from an authentic open-source platform, there are 4 levels of severity, viz. none, low, medium, and high. The input features incorporated in this study are fever, tiredness, dry cough, sore throat, no symptoms, muscle pain, age, nasal congestion, running nose, and gender.

At first, we measured the accuracy of the classifier model applied to the raw data set; we then optimized the feature set using the genetic algorithm, passed the optimized data set through the classifier, and found that the accuracy of the latter overshadowed the former.

Data Set Description

The data set is COVID-19 Symptoms Checker, which records different symptoms. It is open-source data with 316,800 rows and 27 columns, consisting of anonymized data from infected persons who tested positive and were admitted to hospital. Some of the important features in the data set are 'breathing problem', 'fever', 'dry cough', 'sore throat', 'running nose', 'asthma', 'chronic lung disease', 'headache', 'tiredness', etc. The data set supports predicting whether someone has COVID-19 depending on symptoms; other features include the experience of symptoms such as nasal congestion and runny nose, severity, etc.

The data set also captures COVID-19 exposure. For visualization and for understanding the risk, the raw data were pre-processed: columns such as date of test and date of visit were reduced to the exposure period, the diagnosis period, the days elapsed since showing symptoms, and the days taken to visit the hospital. This is illustrated in Fig. 12.

Fig. 12

Days elapsed from showing symptoms to hospital admission

Bayesian Modeling and Feature Visualization for GA Feature Selection

In the last decades, implementations of the Bayesian hierarchical model (BHM) have gained a lot of ground in artificial intelligence. With it, uncertainty in predictions can be quantified, which is very helpful for building a risk-analysis framework in the healthcare sector. A Bayesian network is an organized representation of the probabilistic relationships between inputs and outputs and encodes conditional independencies. We have modeled the severity, condition, and risk of the patient, using the severity class as our target variable so that the most serious patients can be assisted first. To assess the relationships of the variables with the target variable, we plot the Bayesian graph, which is acyclic in nature (Fig. 13).

Fig. 13

Bayesian plot for feature relation

Classifying Algorithm

Classification is one of the key formulations in this category. In classification, the main target is to categorize the data into different classes based on features:

$$ \left( {\text{features }} \right) \to {\text{target class}}. $$
(15)

The model is trained on a training sample, say \(I\), where \(I = \left\{ {\left( {x^{1}, c^{1} } \right),\left( {x^{2}, c^{2} } \right), \ldots } \right\}\), x is the input vector, and c is the specified class (Fig. 14). The key features \(f_{i} \in x\) that influence categorization are known as the best features and are selected based on some score [41].

Fig. 14
Framework of feature selection. Source: Jiliang Tang, Salem Alelyani and Huan Liu, "Feature Selection for Classification: A Review"

To build a good classifier, understanding the redundant features and extracting and selecting the best ones is a very important part of the process. After the removal of redundant or irrelevant features, the accuracy of the model increases and the computation time decreases. Feature selection is therefore a method in which a subset

$$ Z_{m} = \left\{ {x_{i_{1}} ,x_{i_{2}} , \ldots ,x_{i_{m}} } \right\}, \quad m < n, $$

is selected from the input vector \( X = \left\{ {x_{1} ,x_{2} , \ldots ,x_{n} } \right\}\) (n being the total number of input features) to optimize the objective function. Feature selection methods are mainly divided into three categories: filters, wrappers, and embedded selectors.

Support Vector Machine (SVM)

The support vector machine (SVM) is a supervised machine learning algorithm applicable to both regression and classification. It is a non-probabilistic, discriminative, binary linear classifier: it constructs hyperplanes in a high-dimensional space, and the output or prediction is made based on the optimal hyperplane. A kernel function \(k\left( {x,y} \right)\) is defined; in the higher-dimensional space, the hyperplane can be considered as the set of points whose scalar product with a fixed vector is constant. Therefore, for the higher dimension, \(k\left( {x,y} \right)\) modifies to \(k\left( {x_{i} ,y} \right)\), and so for the hyperplane we can say that

$$ \mathop \sum \limits_{i = 0}^{n} \beta_{i} \cdot k\left( {x_{i} ,y} \right) = C. $$
(16)

A higher value of k is preferable, and the functional margin has to be maintained. For n points, \(w \cdot x - b = 0\) is the equation of the hyperplane, where w is the normal vector and \(\frac{b}{{\left\| w \right\|}}\) is the offset of the hyperplane [42].
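
A hedged scikit-learn sketch of the SVM classifier; the RBF kernel and the starting hyperparameters are placeholders, later tuned by the grid search described below.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Feature scaling matters because the RBF kernel is distance-based.
svm_clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
# svm_clf.fit(X_train, y_train); y_pred = svm_clf.predict(X_test)
```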

Random Forest

The Random Forest model works on counts of impurity, and each split occurs in the direction where the impurity is least. Impurity is measured by 2 factors, namely the Gini index and entropy. Entropy is the amount of information needed to correctly describe a sample: a homogeneous sample gives 0 entropy, while a maximally heterogeneous one yields 1. It is calculated as represented in Eq. 17. Similarly, the Gini index measures inequality in the sample: a Gini index of 0 denotes that the sample is perfectly homogeneous, and 1 shows maximal inequality. It is calculated as represented in Eq. 18:

$$ {\text{Entropy}} = - \mathop \sum \limits_{i = 1}^{n} p_{i} \times \log \left( {p_{i} } \right), $$
(17)
$$ {\text{Gini index}} = 1 - \mathop \sum \limits_{i = 1}^{n} p_{i}^{2} . $$
(18)
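
The two impurity measures can be computed directly from the class proportions; base-2 logarithms are assumed so that a balanced binary sample has entropy 1, matching the text.

```python
import numpy as np

def entropy(p):
    """Eq. 17: -sum p_i * log2(p_i) over class proportions p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                       # treat 0 * log 0 as 0
    return -np.sum(p * np.log2(p))

def gini(p):
    """Eq. 18: 1 - sum p_i^2."""
    p = np.asarray(p, dtype=float)
    return 1.0 - np.sum(p ** 2)

print(entropy([0.5, 0.5]), gini([1.0, 0.0]))  # 1.0 (heterogeneous), 0.0 (pure)
```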

Naïve Bayes

The Naïve Bayes algorithm gives the probability that a point \(x\) belongs to a specified class \(C\). The probability is calculated from the conditional probability model \(P(x_{i} |C_{K} )\). Classification involves assigning the point to a class; the comparison can be made using Eq. 19 [43]:

$$ p\left( {C_{a} } \right)\mathop \prod \limits_{i = 1}^{n} { }p\left( {x_{i} {|}C_{a} } \right) > p\left( {C_{b} } \right)\mathop \prod \limits_{i = 1}^{n} { }p\left( {x_{i} {|}C_{b} } \right) = > p\left( {x_{1} ,x_{2} \ldots ,x_{n} {|}C_{a} } \right) > p\left( {x_{1} ,x_{2} \ldots ,x_{n} {|}C_{b} } \right) . $$
(19)

Thus, the class can be found by mathematical notation given in Eq. 20:

$$ C = \arg \max_{k} p\left( {C_{k} } \right)\mathop \prod \limits_{i = 1}^{n} p\left( {x_{i} |C_{k} } \right), \quad k \in \left\{ {1,2, \ldots ,K} \right\}. $$
(20)
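
In scikit-learn, Eq. 20's argmax is what `predict` evaluates. Gaussian Naïve Bayes is shown for brevity and is an assumption; BernoulliNB may suit the mostly binary symptom flags better.

```python
from sklearn.naive_bayes import GaussianNB

# Fits p(C_k) and per-feature p(x_i | C_k), then predicts the class
# maximizing p(C_k) * prod_i p(x_i | C_k), i.e., Eq. 20.
nb_clf = GaussianNB()
# nb_clf.fit(X_train, y_train); severity_pred = nb_clf.predict(X_test)
```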

Genetic Algorithm

A key step is optimizing the objective function and evaluating the fitness/accuracy of the model, and the genetic algorithm, an evolutionary, heuristic, domain-independent algorithm, is widely used for feature selection and optimization. The algorithm is inspired by Darwin's theory of natural selection and genetics and is well suited for multi-featured optimization.

The algorithm first initializes a randomly generated population. It then selects optimal individuals by measuring the objective function and evaluating the fitness function, which accounts for each individual's fitness in the environment, and iterates the process until the convergence condition is established. In other words, each individual, a chromosome from the population, represents one solution to the problem; the solutions are evaluated using the fitness function, and the best fitted is retained.

Initialization and Chromosome Encoding

The first step is to define the population; the algorithm then transforms populations of chromosomes (Fig. 15). Chromosomes portray solutions to the problem in string format; a locus, or specified position, is known as a gene, and the alphabet at that place is the allele. In GA, we prefer binary encoding over {0, 1}, as sketched after Fig. 15.

Fig. 15

Showing key components [22]
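
A minimal encoding sketch: each chromosome is a binary mask over the candidate features, matching the paper's 7 parameters and 20 offspring per generation; the seed is an assumption for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(42)      # seed chosen for reproducibility (assumption)
POP_SIZE, N_FEATURES = 20, 7         # 20 offspring/generation, 7 candidate features

# Gene i = 1 keeps feature i in the subset evaluated by the classifier.
population = rng.integers(0, 2, size=(POP_SIZE, N_FEATURES))
```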

Fitness Function and Selection of Features

The nature of each chromosome is assessed, and according to the criteria it is checked whether it can be considered a solution. Selection is the key building block of this algorithm: it guides the evolution of the chromosomes, after which recombination is performed. The higher a chromosome's rank/score/value, the greater its chance of being selected. Some of the methods are Roulette Wheel, Random Stochastic Selection, Truncation Selection, etc.

Recombination

The 2 major parts of recombination are crossover and mutation. Crossover is the exchange of genes between the chromosomes of 2 parents, and mutation covers the possible alterations (swapping and others) of alleles. The results are then fitted into the successor population (Fig. 16).

Fig. 16

a Represent crossover and b represent mutation

Evolution

The GA iterates until the stopping criteria are reached. After recombination, a new generation is created, and it undergoes the same process to evolve (Table 3). A widely used evolutionary technique called replacement-with-elitism is applied: almost the whole population is replaced in the successor population, but the scheme ensures that the highest-fitness individual does not get lost in the next generation [44].
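
Putting selection, recombination, and elitist replacement together, the loop below is a minimal sketch; the one-point crossover, mutation rate, and single-elite replacement are assumptions beyond the paper's stated 10 generations, and fitness values are assumed positive (e.g., accuracies).

```python
import numpy as np

def evolve(population, fitness_fn, generations=10, p_mut=0.05, seed=0):
    """Roulette-wheel selection, one-point crossover, bit-flip mutation,
    and replacement-with-elitism over a binary population."""
    rng = np.random.default_rng(seed)
    pop = population.copy()
    for _ in range(generations):
        fit = np.array([fitness_fn(c) for c in pop])
        elite = pop[np.argmax(fit)].copy()           # elitism: best survives intact
        probs = fit / fit.sum()                      # roulette-wheel probabilities
        children = [elite]
        while len(children) < len(pop):
            i, j = rng.choice(len(pop), size=2, p=probs)
            cut = rng.integers(1, pop.shape[1])      # one-point crossover
            child = np.concatenate([pop[i, :cut], pop[j, cut:]])
            flip = rng.random(child.shape) < p_mut   # bit-flip mutation
            children.append(np.where(flip, 1 - child, child))
        pop = np.array(children)
    fit = np.array([fitness_fn(c) for c in pop])
    return pop[np.argmax(fit)]                       # fittest chromosome
```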

Table 3 Genetic algorithm result for feature optimization.

Proposed Algorithm

[Algorithm listing: hybrid genetic algorithm and classifier model for symptom severity classification]
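
Since the listing does not reproduce here, the sketch below shows the core of the hybrid: the GA chromosome masks features, and its fitness is the cross-validated accuracy of a classifier on that subset. Random Forest and 5-fold CV are assumptions; SVM or Naïve Bayes could be substituted.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def make_fitness(X, y):
    """Fitness of a chromosome = CV accuracy on the selected features."""
    def fitness(chromosome):
        mask = chromosome.astype(bool)
        if not mask.any():                    # empty subsets get minimal fitness
            return 1e-6
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        return cross_val_score(clf, X[:, mask], y, cv=5).mean()
    return fitness

# Ties into the GA loop above: best_mask = evolve(population, make_fitness(X, y))
```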

Results and Discussion

We have implemented a hybrid model: first we ran the genetic algorithm-based feature optimization, iterating for 10 generations (Figs. 17, 18). There are 7 parameters, or independent variables; thus, the starting generation varied from \([0,0,0,0,0,0,0]\) to \([1,1,1,1,1,1,1]\). The optimized parameter set for the features is given below [45].

Fig. 17
Genetic algorithm result for the 10-generation iteration. Source: created by the authors, based on the data set

Fig. 18
Genetic algorithm curve across different generations. Source: created by the authors, based on the data set

The features tabulated in Table 4 are those which contribute most to COVID-19 positive cases. Along with feature selection, another important way to increase model performance is to optimize the hyperparameters of the algorithms, and for that we have used the grid search algorithm.

Table 4 Features deduced after the data set has gone through the genetic algorithm.

Grid search takes n equally spaced points in each interval \([a_i, b_i]\), including \(a_i\) and \(b_i\); with m parameters this yields \(n^m\) possible grid points, each of which is evaluated. The pairs of points are then compared, and the one with the highest value is returned. To overcome overfitting of the grid search, stratified cross-validation is implemented, where the data are divided into k folds. Let \(a = (a_1, a_2, \ldots, a_m)\) be the vector of lower bounds and \(b = (b_1, b_2, \ldots, b_m)\) the vector of upper bounds for each component of \(\nu = (\nu_1, \nu_2, \ldots, \nu_m)\), where the target is to maximize the objective value p.
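
In scikit-learn terms, the stratified grid search looks as follows; the grid values are illustrative placeholders, not the exact grid of Table 5.

```python
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Stratified folds keep class proportions, guarding the search against
# overfitting to one lucky split of the imbalanced severity classes.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid,
                      cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
                      scoring="accuracy")
# search.fit(X_selected, y); print(search.best_params_, search.best_score_)
```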

The parametric values which are fed into the Grid Search model to get the optimized value are tabulated in Table 5.

Table 5 Grid search model parameters

Table 6 gives the results of our model: the optimized values of the hyperparameters, along with a comparative result between the model optimized using genetic algorithm feature selection and the classifier model acting without any feature selection.

Table 6 Accuracy of models with optimized and non-optimized parameters.

Concluding Remark and Future Scope of Study

In this study, we have considered a bi-fold study: classifying images using the proposed weighted transfer learning methodology, with the Euclidean weighted average used as the selection equation in the proposed model. Our proposed model gave a high accuracy of 98.67% and outperformed the traditional models in terms of both validation accuracy and sensitivity. The class imbalance problem was handled by adding more images through data augmentation with the DCGAN architecture, and all the images were pre-processed by removing shadows, noise, etc. for higher classification accuracy.

In the second part, we proposed a dual-stage classifier to classify patients into low, medium, and high risk categories based on the symptoms they show. To optimize the result we used the genetic algorithm, and the classifier algorithms used were SVM, Naïve Bayes, and the Random Forest classifier. The optimized model achieved a much higher accuracy of 88.96%, as expected. Based on this model, we can assess and analyze the risk associated with a patient showing nCoV symptoms.

The work can be extended by introducing a voice classifier model by which a patient can be classified from the sound of his or her cough. Moreover, other nature-inspired algorithms could be incorporated, such as PSO, Artificial Ant Colony, etc. [46, 47].