1 Introduction

The first case of the virus was reported in the city of Wuhan, China, in November 2019. About 11,000,000 people live in this city, which is connected to numerous other urban communities of China. The outbreak of atypical, person-to-person transmissible pneumonia caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) has led to a worldwide pandemic. There had been in excess of 2,600,000 confirmed cases of coronavirus disease (COVID-19) around the globe as of April 23, 2020. According to the WHO, 16–21% of individuals with the infection have become seriously ill, with a death rate of 2–3%. Chinese researchers named the novel virus the 2019 novel coronavirus (2019-nCoV) (Singhal 2020; Wang et al. 2006; Ge et al. 2013; Sharfstein et al. 2020; Shereen et al. 2020). By comparison, SARS-CoV infected in excess of 8000 people across 26 nations, with a fatality rate of 9%. According to the Worldometer coronavirus report (https://www.worldometers.info/coronavirus/), as of July 18, 2020, 215 countries and 2 naval ships had been affected by COVID-19, and around 14,068,506 positive cases, 595,114 deaths and 8,376,004 recoveries had been recorded around the globe. COVID-19 outbreak statistics are presented in Table 1. These figures change day by day because COVID-19 continues to spread rapidly.

Table 1 COVID-19 outbreak statistics

Kermany et al. (2018a) used a ConvNet model on X-ray images and reported a training accuracy of 95.31% and a validation accuracy of 93.73%. Hamimi (2016) discussed MERS-CoV, whose chest X-ray and CT findings resemble the manifestations of pneumonia. Xie et al. (2006) used data-mining techniques to distinguish SARS from typical pneumonia based on X-ray images. Li et al. (2018) used two deep learning models, DenseNet-121 and DenseNet-RNN, to analyze the diseases in ChestXRay14; DenseNet-121 scored 74.5% and DenseNet-RNN scored 75.1% in recognizing pneumonia. Rajpurkar et al. (2017) introduced a 121-layer model that identifies one of the 14 diseases and separates the pneumonia class from the others with 76.8% accuracy; this model also provides a heatmap for possible localization based on the prediction made by the convolutional neural network. Further studies have applied machine learning and deep learning algorithms to analyze X-ray and CT images (Basu et al. 2020; Pavithra et al. 2015; Ozkaya et al. 2020; Santos and Melin 2020; Tolga et al. 2020; Ramírez et al. 2019; Miramontes et al. 2018; Melin et al. 2018; Kermany et al. 2018a, b; Ayan and Ünver 2019; Varshni et al. 2019; Wang et al. 2017; Toğaçar et al. 2019; Jaiswal et al. 2019; Sirazitdinov et al. 2019; Behzadi-khormouji et al. 2020; Stephen et al. 2019; Xu et al. 2020; Shan et al. 2020).

In this study, we have used the convolutional neural network (CNN) method for the classification of pneumonia, based on versions of VGG-16 and Inception_V2 together with a decision tree model, on an X-ray and CT scan image dataset. The dataset contains 360 images, of which 295 are images of COVID-19 patients, 16 are images of SARS and 18 are images of streptococcus. Each image first goes through image processing (noise reduction), in which a feature detection kernel is applied to create feature maps. The feature maps are then combined into vectorized feature maps, which represent the images with reduced noise. These vectorized feature maps are fed to VGG-16, Inception_V2 and the decision tree to perform classification. In the last layer, the output layer reports the result as COVID-19, normal or pneumonic.

This paper is organized as follows: Sect. 2 discusses the methods, materials and proposed models. Section 3 presents the experimental results and discussion. Finally, Sect. 4 concludes the research work.

2 Methods and materials

2.1 Deep learning

Deep learning is a sub-branch of the field of artificial intelligence that has come into wide use in recent years. Deep learning architectures have been widely used for the diagnosis of pneumonia since 2016 (Kermany et al. 2018a, b). The models explored in this work are VGG16, Inception_V2 and a decision tree; we picked these three techniques because of the strong results and high accuracies they offer.

2.2 Convolution neural networks

Convolutional neural networks are the most widely used deep learning models for analyzing visual imagery. They use the concept of neural networks to detect features of images. The basic building blocks of a CNN are the convolution layer, pooling layer, ReLU layer, fully connected layer and loss layer. CNNs also provide an efficient form of regularization against overfitting: rather than adding a penalty on weight magnitudes to the loss function, they exploit the hierarchical structure of the data and assemble simpler patterns into more complex ones.
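As a rough illustration of two of the building blocks listed above, the following minimal Python sketch applies ReLU and 2 × 2 max pooling to a small feature map. The function names `relu` and `max_pool_2x2`, the pooling window size and the example values are assumptions made for illustration; they are not taken from the models used in this study.

```python
def relu(feature_map):
    """Element-wise ReLU: negative activations are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Non-overlapping 2 x 2 max pooling (assumes even height and width)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Hypothetical 4 x 4 feature map (illustrative values only).
fmap = [
    [1.0, -2.0, 3.0, 0.5],
    [-1.0, 4.0, -0.5, 2.0],
    [0.0, -3.0, -1.0, -2.0],
    [2.0, 1.0, -4.0, -0.1],
]
pooled = max_pool_2x2(relu(fmap))
print(pooled)  # [[4.0, 3.0], [2.0, 0.0]]
```

Pooling halves each spatial dimension here, which is how a CNN progressively reduces the size of its feature maps.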

2.3 Mathematical formula

Convolution is a special type of linear operation, and every CNN must contain at least one convolutional layer. The convolution used in deep learning, however, differs slightly from the one used in mathematics. In deep learning, CNNs have different layers such as the convolution layer, pooling layer, ReLU layer, fully connected layer and softmax layer. The classification takes place in the fully connected and softmax layers. In a CNN, the most important layer is the convolution layer.
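To make the role of the softmax layer concrete, the sketch below converts raw class scores into probabilities and picks the most likely class. The function name `softmax`, the scores and the ordering of the class names are hypothetical and serve only as an illustration.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores from the last fully connected layer.
class_names = ["COVID-19", "normal", "pneumonia"]
probs = softmax([2.0, 0.5, 1.0])
predicted = class_names[probs.index(max(probs))]
print(predicted)  # COVID-19
```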

$$ G \, \left( {m, \, n} \right) \, = \, \left( {f*h} \right) \, \left( {m, \, n} \right) \, = \mathop \sum \limits_{j} \mathop \sum \limits_{k} h\left( {j,k} \right)f\left[ {\left( {m - j} \right),\left( {n - k} \right)} \right] $$
(1)

where f = image, h = kernel, and m, n = row and column indices of the result matrix G.

The above equation represents how the feature detector shifts according to the input.
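Equation (1) can be sketched directly in Python. The function name `convolve2d`, the zero padding outside the image and the example values are assumptions of this sketch, not details of the models used in this study.

```python
def convolve2d(image, kernel):
    """Full 2-D convolution: G(m, n) = sum_j sum_k h(j, k) * f(m - j, n - k).

    Pixels outside the image are treated as zero (an assumption of this sketch).
    """
    img_h, img_w = len(image), len(image[0])
    ker_h, ker_w = len(kernel), len(kernel[0])
    out_h, out_w = img_h + ker_h - 1, img_w + ker_w - 1
    result = [[0.0] * out_w for _ in range(out_h)]
    for m in range(out_h):
        for n in range(out_w):
            total = 0.0
            for j in range(ker_h):
                for k in range(ker_w):
                    row, col = m - j, n - k
                    # Skip terms that fall outside the image.
                    if 0 <= row < img_h and 0 <= col < img_w:
                        total += kernel[j][k] * image[row][col]
            result[m][n] = total
    return result


# Hypothetical 2 x 2 image and kernel.
image = [[1.0, 2.0], [3.0, 4.0]]
kernel = [[1.0, 0.0], [0.0, -1.0]]
feature_map = convolve2d(image, kernel)
print(feature_map)
```

The index pattern `m - j`, `n - k` flips the kernel, as in Eq. (1). Many deep learning libraries instead compute cross-correlation, which slides the kernel without flipping it; this is the slight difference between the mathematical and the deep learning usage of the term.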

Convolution function:

$$ \left( {f*h} \right)\left( t \right) \triangleq \int_{ - \infty }^{\infty } f\left( \tau \right)h\left( {t - \tau } \right)d\tau = \int_{ - \infty }^{\infty } f\left( {t - \tau } \right)h\left( \tau \right)d\tau $$
(2)

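where t is the time variable and τ is the variable of integration. When t takes only integer values, f and h are sequences and the integral is replaced by a sum, as in the discrete form of Eq. (1).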
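The equality of the two integrals in Eq. (2) follows from a change of variables. A brief derivation, with the substitution variable u introduced here only for illustration, is:

$$ \begin{aligned} \left( {f*h} \right)\left( t \right) &= \int_{ - \infty }^{\infty } f\left( \tau \right)h\left( {t - \tau } \right)d\tau && \left( {u = t - \tau ,\; d\tau = - du} \right) \\ &= \int_{\infty }^{ - \infty } f\left( {t - u} \right)h\left( u \right)\left( { - du} \right) \\ &= \int_{ - \infty }^{\infty } f\left( {t - u} \right)h\left( u \right)du \end{aligned} $$

Renaming u back to τ gives the second integral, so convolution is commutative.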

Notation: A common engineering convention is:

$$ f\left( t \right)*h\left( t \right) \triangleq \int_{ - \infty }^{\infty } f\left( \tau \right)h\left( {t - \tau } \right)d\tau $$
(3)

2.4 Proposed CNN architecture

In this model (Fig. 1), an image from the COVID-19 X-ray dataset is taken as input. It first goes through image processing (noise reduction), in which a feature detection kernel is applied to create feature maps. The feature maps are then combined into vectorized feature maps, which represent the images with reduced noise. These vectorized feature maps are fed to models such as VGG-16, Inception_V2 and the decision tree to perform classification. The initial weights of the architecture are pretrained on the ImageNet dataset, so the model is fine-tuned to adapt these weights to our data. In the last layer, the output layer reports the result as COVID-19, normal or pneumonic.

Fig. 1 Proposed CNN architecture

2.5 Deep learning architecture

2.5.1 VGG16

VGG16 is considered to be one of the best CNN architectures to date (Simonyan and Zisserman 2019).

2.5.2 Inception_V2

The Inception network is an advanced and complex architecture that has gone through many revisions to increase both speed and accuracy (Szegedy et al. 2014).

2.5.3 Decision Tree

A decision tree is an algorithm that uses a tree-like structure or model of decisions and their possible consequences, including resource costs and utility (Yang et al. 2018).
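For classification, a tree is grown by repeatedly choosing the split that best separates the classes. The sketch below finds a single threshold split on one feature using Gini impurity. Gini impurity is one common splitting criterion and is an assumption here, since the criterion used in this study is not stated; the names `gini` and `best_threshold` and the data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity."""
    best = (None, float("inf"))
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical feature (e.g. one pixel intensity) and class labels.
intensities = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = ["normal", "normal", "normal", "covid", "covid", "covid"]
threshold, impurity = best_threshold(intensities, labels)
print(threshold, impurity)  # 0.3 0.0
```

A full decision tree applies this search recursively to each resulting subset and over all features.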

2.6 Evaluation criteria

Seven metrics were used to assess the proposed method: precision, recall, F1 score, support, accuracy, micro-average and weighted average (Blum and Chawla 2001; Bhandary et al. 2020). Table 2 represents the confusion matrix.

$$ {\text{Precision}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FP}}}} $$
$$ {\text{Recall}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}} $$
$$ F{\text{-Score}} = \frac{{2 \times {\text{TP}}}}{{2 \times {\text{TP}} + {\text{FP}} + {\text{FN}}}} $$
$$ {\text{Support}} = \frac{{{\text{Frequency}} \left( {X, Y} \right)}}{N} $$
$$ {\text{Accuracy}} = \frac{{\left( {{\text{TP}} + {\text{TN}}} \right)}}{{\left( {{\text{TP}} + {\text{TN}} + {\text{FN}} + {\text{FP}}} \right)}} $$
$$ {\text{Micro Average}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}} $$

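where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives, respectively, X = dependent variable, Y = independent variable and N = number of images.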
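As a worked example of the confusion-matrix-based metrics, the following sketch computes precision, recall, the standard F-score, 2·TP/(2·TP + FP + FN), and accuracy from the four counts. The function name `classification_metrics` and the counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute precision, recall, F-score and accuracy from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Hypothetical confusion-matrix counts (illustrative only).
metrics = classification_metrics(tp=15, tn=19, fp=1, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts, precision, recall and F-score are all 15/16 = 0.9375, and accuracy is 34/36 ≈ 0.9444.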

Table 2 Confusion matrix

2.7 COVID-19 X-ray images dataset

In this paper, chest X-ray and CT scan images of 360 patients have been acquired from an open-source database (https://github.com/ieee8023/covid-chestxray-dataset), of which 295 are images of COVID-19 patients, 16 are images of SARS and 18 are images of streptococcus. The repository comprises chest X-ray/CT images mostly of patients with acute respiratory distress syndrome (ARDS), COVID-19, E. coli, streptococcus, pneumocystis, pneumonia and severe acute respiratory syndrome (SARS). All images were resized to 224 × 224 pixels. Figure 2 shows a sample of the dataset images used in the experiments.

Fig. 2 Sample chest X-rays dataset images

2.7.1 Training and classification dataset

After the data have gone through the preparation steps of preprocessing, splitting and data augmentation, they are ready to be fed to the CNN architecture. The architecture then processes the data further and converts them into a form readable by the perceptrons.
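The splitting step can be sketched as a stratified train/test split, which keeps the class proportions the same in both subsets. The function name `stratified_split`, the 20% test fraction, the random seed, the file names and the class sizes are all assumptions made for illustration; the split actually used in this study is not stated.

```python
import random


def stratified_split(samples, labels, test_fraction=0.2, seed=0):
    """Split samples into train and test sets, preserving class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    train, test = [], []
    for label, items in by_class.items():
        rng.shuffle(items)
        # Keep at least one test sample per class.
        n_test = max(1, round(len(items) * test_fraction))
        test += [(s, label) for s in items[:n_test]]
        train += [(s, label) for s in items[n_test:]]
    return train, test


# Hypothetical file names; the class sizes are placeholders.
files = [f"img_{i:03d}.png" for i in range(50)]
classes = ["covid"] * 30 + ["normal"] * 20
train_set, test_set = stratified_split(files, classes)
print(len(train_set), len(test_set))  # 40 10
```

Stratification matters for small, imbalanced medical datasets, where a purely random split could leave a rare class almost absent from the test set.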

3 Experimental results and discussion

3.1 VGG-16

3.1.1 Normal versus COVID-19 classification report

Binary classification between COVID-19 patients and healthy people has been performed using the VGG-16 model, which performs extremely well. The results are listed in Table 3.

Table 3 Normal versus COVID-19 classification

For the first binary classification, i.e., normal versus COVID-19, precision, recall and F1 score are 100% for both classes, as are the micro-average and weighted average. The model reaches 100% accuracy without overfitting. The support is 16 for the COVID-19 class, 20 for the normal class and 36 for each of accuracy, micro-average and weighted average.

The good performance of the model can be seen not only from these values but also from the curves in Fig. 3. Initially, the training loss is far greater than the validation loss, which indicates under-fitting; over further epochs the two losses become almost equal, which is the ideal condition for a good model. The training and validation accuracies are also almost equal, which shows that there is no overfitting, and both follow an upward trend, which shows that the model improves with each epoch.

Fig. 3 Normal versus COVID-19 loss and accuracy on COVID-19 dataset

3.1.2 Pneumonia versus COVID-19 classification report

X-ray and CT scan images of people affected by pneumonia have been compared with those of people suffering from COVID-19. Here, binary classification between COVID-19 patients and pneumonia-affected people is performed using the VGG-16 model, which performs well. The results are presented in Table 4.

Table 4 Pneumonia versus COVID-19 classification

For the second binary classification, i.e., pneumonia versus COVID-19, precision, recall and F1 score are 94% and 95% for the two classes, respectively, and the micro-average and weighted average are 94%. The model reaches 94% accuracy without overfitting. The support is 16 for the COVID-19 class, 20 for the pneumonia class and 36 for each of accuracy, micro-average and weighted average.

Figure 4 shows the pneumonia versus COVID-19 loss and accuracy. Initially the model overfits heavily, but over further epochs the curves come closer to each other, which shows reduced overfitting. The accuracy curves are almost equal to each other, which shows that the model is trained quite well.

Fig. 4 Pneumonia versus COVID-19 loss and accuracy

3.1.3 Normal versus pneumonia versus COVID-19 classification report

In this section, X-ray images of people affected by pneumonia are compared with those of people suffering from COVID-19 and of healthy people. Multiclass classification has been performed among COVID-19 patients, pneumonia-affected people and healthy people using the VGG-16 model.

Because this is a ternary classification problem, it is quite difficult for the model to learn the features and differentiate the images exactly. Nevertheless, the model does so quite well, and in fact better than all other models. The results obtained with this model are presented in Table 5. For the COVID-19 class, precision, recall, F1 score and support are 100%, 94%, 97% and 17, respectively. For normal people, the same parameters are 84%, 94%, 89% and 17, respectively. For people affected by pneumonia, precision, recall and F1 score are 90%, 86% and 88%, respectively. The overall accuracy is 91%. For the micro-average, precision, recall, F1 score and support are 92%, 92%, 91% and 56, respectively, and for the weighted average, these parameters are 91%, 91%, 91% and 56, respectively.

Table 5 Normal versus pneumonia versus COVID-19 classification

Figure 5 shows the normal versus pneumonia versus COVID-19 loss and accuracy. Here the validation loss is much higher than the training loss, which shows that the model overfits. However, the training and validation accuracies are close to each other, which shows that although the model overfits on the training set, it performs well during testing. In fact, it is the best model that could be achieved.

Fig. 5 Normal versus pneumonia versus COVID-19 loss and accuracy

3.2 Inception-V2 Model

3.2.1 Pneumonia versus COVID-19 classification report

Inception_V2 is a complex and advanced CNN architecture. Its main drawback is time complexity: because the algorithm is so time-consuming, it is not feasible to perform multiclass classification with the Inception_V2 architecture on conventional systems. Therefore, the dataset is converted into two classes by combining pneumonia patients and healthy people into a single class, “Normal”, which represents people not suffering from COVID-19.

The final classification results are presented in Table 6. Class_0 represents COVID-19 negative, and Class_1 represents COVID-19 positive. For COVID-19 negative people, precision, recall, F1 score and support are 85%, 80%, 82% and 83, respectively. For COVID-19 positive patients, these parameters are 69%, 76%, 72% and 49, respectively. For the micro-average, the parameters are 77%, 78%, 77% and 132, respectively, and for the weighted average they are 79%, 78%, 78% and 132, respectively. The accuracy of this model is 78%.

Table 6 Pneumonia versus COVID-19 classification

Considering all these values and the time required to execute this model, we can conclude that Inception_V2 performs considerably worse than the VGG-16 model.

Figure 6 consists of two plots. In the first plot, the training accuracy is far greater than the validation accuracy, which clearly indicates extreme overfitting. The second plot tells the same story: the validation loss is far greater than the training loss, which again indicates extreme overfitting.

Fig. 6 Inception-V2 model pneumonia versus COVID-19 accuracy and loss

3.3 Decision tree model

3.3.1 Pneumonia versus COVID-19 classification report for decision tree model

A decision tree is a simple machine learning algorithm for classification. It is not specifically designed for classifying image data, so the data must be preprocessed before the model is fitted. In the preprocessing step, we converted the images to grayscale and represented each image by its pixel values.
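The preprocessing described above can be sketched as a grayscale conversion followed by flattening into a feature vector. The function names `to_grayscale` and `flatten`, the ITU-R BT.601 luma weights and the example image are assumptions of this sketch; the exact conversion used in this study is not stated.

```python
def to_grayscale(rgb_image):
    """Convert an RGB image (rows of (r, g, b) tuples) to grayscale.

    Uses the ITU-R BT.601 luma weights; this choice is an assumption.
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]


def flatten(gray_image):
    """Flatten a 2-D grayscale image into a 1-D feature vector, row by row."""
    return [pixel for row in gray_image for pixel in row]


# Hypothetical 2 x 2 RGB image with 8-bit channel values.
rgb = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
features = flatten(to_grayscale(rgb))
print(len(features))  # 4
```

Applied to a 224 × 224 image, this would yield a vector of 224 × 224 = 50,176 pixel values, each of which the decision tree treats as a separate feature.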

Table 7 presents the classification report for the decision tree model. For class_0, i.e., COVID-19 negative people, precision, recall, F1 score and support are 58%, 70%, 64% and 50, respectively. For COVID-19 positive patients, these values are 62%, 50%, 56% and 50, respectively. For the micro-average and weighted average, the first three parameters are 60% and the support is 100.
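The averaged rows can be checked against the per-class values. The sketch below assumes that the averaged rows are the unweighted mean and the support-weighted mean of the per-class scores; the function names `macro_average` and `weighted_average` are introduced here for illustration.

```python
def macro_average(values):
    """Unweighted mean of per-class scores."""
    return sum(values) / len(values)


def weighted_average(values, supports):
    """Mean of per-class scores weighted by class support."""
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)


# Per-class values reported for the decision tree (class_0, class_1), in percent.
precision = [58, 62]
recall = [70, 50]
f1_score = [64, 56]
support = [50, 50]

for name, scores in [("precision", precision), ("recall", recall), ("F1", f1_score)]:
    print(name, macro_average(scores), weighted_average(scores, support))
```

Because both classes have the same support of 50, the two averages coincide at 60% for all three metrics, in agreement with the reported values.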

Table 7 Classification report for decision tree model

3.4 Discussion

In this work, the main objective was to classify the X-ray images of people suffering from pneumonia, people suffering from COVID-19 and normal people. A convolutional neural network algorithm and a decision tree classification algorithm have been used. After all the necessary preprocessing, we used two CNN architectures, i.e., VGG-16 and Inception_V2. After building and executing all the models, we found that the VGG-16 model outperforms all other models by a large margin. We first compared the accuracies of all the models and found that VGG-16 is the best among them. We then compared the losses of all the models, and VGG-16 again gives the best results. We have also performed binary classification between every pair of the classes pneumonia, COVID-19 and normal. The accuracy of the Inception_V2 model is quite low compared with that of the VGG-16 model, and it shows large overfitting. Table 8 presents the performance metrics, including testing accuracy, for the different scenarios and deep transfer models. In the first scenario, which contains two classes, VGG-16 achieved the highest precision and F1 score, which supports the choice of VGG-16 as the deep transfer model. In the second scenario, all deep transfer learning models achieved similarly high precision, recall and F1 score, and VGG-16 was again chosen because it achieved the highest validation accuracy, 91%.

Table 8 Formal comparison for different scenarios

4 Conclusion

Early diagnosis of COVID-19 patients is indispensable to prevent the spread of the disease to others. In this study, the main objective was to detect COVID-19 patients using chest X-ray and CT scan images collected from different sources, which include images of COVID-19 patients, patients suffering from pneumonia and healthy people. Since the main focus is on detecting the X-rays of COVID-19 patients, we used the least computationally intensive deep learning architectures and still obtained highly satisfactory results. On this dataset, the VGG-16 model shows the highest accuracy, followed by Inception_V2 and then the simple decision tree model used for comparison. The VGG-16 model reaches an accuracy of 91% with a support of 56, whereas the Inception_V2 model shows an accuracy of 78% with a support of 132 and the decision tree model shows an accuracy of 60% with a support of 100. The fine-tuned version of VGG-16 shows highly satisfactory results, with training and validation accuracy rising to more than 91%, unlike the Inception_V2 and decision tree models, which show low performance with accuracies of 78% and 60%, respectively. In the future, we would like to study patient data in order to estimate how healthcare services are being used and how their use varies among different patients. Various deep learning approaches may be applied to these data to obtain interesting results.