Abstract

At present, COVID-19 is a severe infection leading to serious complications. The target site of the SARS-CoV-2 infection is the respiratory tract leading to pneumonia and lung lesions. At present, the severity of the infection is assessed using lung CT images. However, due to the high caseload, it is difficult for radiologists to analyze and stage a large number of CT images every day. Hence, an automated, computer-assisted technique for staging SARS-CoV-2 infection is required. In this work, a comparison of deep learning techniques for the classification and staging of different COVID-19 lung CT images is performed. Four deep transfer learning models, namely, ResNet101, ResNet50, ResNet18, and SqueezeNet, are considered. Initially, the lung CT images were preprocessed and given as inputs to the deep learning models. Further, the models were trained, and the classification of four different stages of the infection was performed using each of the models considered. Finally, the performance metrics of the models were compared to select the best model for staging the infection. Results demonstrate that the ResNet50 model exhibits a higher testing accuracy of 96.9% when compared to ResNet18 (91.9%), ResNet101 (91.7%), and SqueezeNet (88.9%). Also, the ResNet50 model provides a higher sensitivity (96.6%), specificity (98.9%), PPV (99.6%), NPV (98.9%), and F1-score (96.2%) when compared to the other models. This work appears to be of high clinical relevance since an efficient automated framework is required as a staging and prognostic tool to analyze lung CT images.

1. Introduction

In recent years, SARS-type respiratory infections have become one of the most prevalent human health conditions [14]. Such infections affect the lower and upper respiratory tract, leading to pneumonia and acute respiratory distress syndrome (ARDS) [5]. Among these lung disorders, one of the most rapidly spreading communicable diseases is COVID-19 [6, 7]. A large part of the world's population has been infected by SARS-CoV-2, irrespective of age and gender [8]. At the beginning of 2022, several countries, such as China and Spain, faced a fifth wave of COVID-19 even though their populations had received two doses of vaccine [9, 10]. The virus creates severe lesions in the lungs, affecting their functionality and leading to a decrease in oxygen saturation. As reported by WHO, SARS-CoV-2 has five variants of concern, namely, Alpha, Beta, Gamma, Delta, and Omicron (https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/). The first symptoms of COVID-19 infection are fatigue, cold, fever, and shortness of breath [11].

An essential step in the fight against COVID-19 is the effective screening of infected patients so that they can receive immediate treatment and care and can be isolated to mitigate the spread of the virus [12, 13]. The primary screening method for detecting COVID-19 cases is reverse transcriptase-polymerase chain reaction (RT-PCR) testing, which can detect SARS-CoV-2 RNA from the saliva and mucocele samples [14]. In addition, computed tomography (CT) imaging is conducted to analyze lung health and stage the infection. Compared to chest X-rays, the CT scan is considered more advantageous as it captures a 3D view of the lungs, which helps the radiologist identify the exact location of the lesions [15]. Hence, the CT scan has been widely utilized as a diagnostic and prognostic tool during the COVID-19 pandemic [2]. However, due to the high caseload during the pandemic, there is a critical need for automating the analysis and staging of COVID-19 lung CT images. For a better understanding of the CT images of affected lungs, thin sections of CT scans were analyzed and categorized based on the affected regions in the lungs by monitoring the predominant patterns observed in the CT scans. The patterns observed in the CT scans are classified as follows:
(i) Ground glass opacification (GGO): this pattern gives an opaque (cloudy/hazy) appearance to the lungs, but the underlying vessels in the lungs remain visible
(ii) Crazy-paving pattern: this pattern is similar to GGO, with additional thickening of the interlobular and intralobular septa
(iii) Focal consolidation: this is the uniform opacification (cloudy appearance) of the parenchyma (the regions involved in gas exchange, where the alveoli are present) that obscures the underlying vessels
(iv) Linear opacities: this is a disordered arrangement of coarse, curvilinear, or fine reticulation (formation of mesh/net-like patterns) in the subpleural region

The most frequently observed distribution patterns of pulmonary lesions on CT in COVID-19 patients are peripheral (subpleural region), central (lung hilum, predominantly the central two-thirds of the lung), and diffuse (subpleural and central regions). These lesions were categorized as affecting a single lobe, multiple lobes unilaterally, or multiple lobes bilaterally. Other minor patterns observed are cavitation, air bronchogram, pleural effusion, pericardial effusion, pneumothorax, bronchiectasis, and mediastinal lymphadenopathy [16].

Among these patterns, the main CT pattern observed in COVID-19 pneumonia cases is GGO, with the lower lobes and the peripheral and subpleural regions of the lungs most affected. It is also noted that the minor patterns were not observed in the lung scans of COVID-19 patients [17]. All the abovementioned patterns were evaluated by assigning a severity score based on the degree of involvement in each of the five lung lobes. The scores of the individual lobes were added to give a total in the range of 0 to 20, which is referred to as the lobe score.
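The scoring arithmetic described above can be sketched as follows. This is only an illustration: the per-lobe range of 0 to 4 is an assumption inferred from five lobes summing to a maximum of 20, and the lobe names and function name are hypothetical.

```python
# Minimal sketch of the lobe-score arithmetic. The per-lobe range (0-4) is an
# ASSUMPTION inferred from "five lobes" and "total range 0 to 20".
LOBES = ("right_upper", "right_middle", "right_lower", "left_upper", "left_lower")
MAX_PER_LOBE = 4  # assumed upper bound for a single lobe


def total_lobe_score(lobe_scores):
    """Sum the five per-lobe severity scores into a total between 0 and 20."""
    missing = [lobe for lobe in LOBES if lobe not in lobe_scores]
    if missing:
        raise ValueError(f"missing lobe scores: {missing}")
    for lobe in LOBES:
        if not 0 <= lobe_scores[lobe] <= MAX_PER_LOBE:
            raise ValueError(f"score for {lobe} must be in 0..{MAX_PER_LOBE}")
    return sum(lobe_scores[lobe] for lobe in LOBES)


# Hypothetical patient with predominantly lower-lobe involvement.
example = {"right_upper": 0, "right_middle": 1, "right_lower": 3,
           "left_upper": 0, "left_lower": 2}
example_total = total_lobe_score(example)
```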

Also, the CT severity score correlates with the duration of symptoms observed in COVID-19 patients. The early stage (0 to 2 days) has a low CT severity score, the intermediate stage (3 to 7 days) has a higher score than the early stage, and the late stage (8 to 14 days) has a higher score than the intermediate stage [17]. Antifibrotic drugs such as pirfenidone and nintedanib were administered to SARS-CoV-2 patients and showed 50% effectiveness by enhancing lung function. These two drugs are not immunosuppressive and are commercially available in oral forms. However, an inhaled formulation for COVID-19 patients is under evaluation [18]. In addition to antifibrotic drugs, the antiviral drug remdesivir and the anti-inflammatory drug dexamethasone were considered effective treatments for COVID-19 pneumonia.

It is also reported that remdesivir is effective in early COVID-19 cases, whereas dexamethasone is effective in later or more advanced stages of COVID-19. Apart from these drugs, an additional therapeutic, baricitinib (a Janus kinase inhibitor), was also recommended to attenuate the symptoms associated with COVID-19 pneumonia. It is also noteworthy that the combination of the abovementioned drugs showed greater efficacy and a shorter period of recovery from COVID-19 pneumonia. Another potential drug, tocilizumab (the first marketed IL-6 blocking antibody), was considered for reducing the mortality rate in COVID-19 pneumonia. It was specifically administered to severe or critical patients with extensive bilateral lung lesions, showed remarkable effectiveness in reducing mortality rates, and is considered safe in clinical practice [19].

In recent years, various deep learning techniques have been introduced and applied in various fields, such as computer vision, speech analysis, natural language processing, and medicine [20, 21]. A significant advantage of deep learning methods is that complex features can be learned directly from the raw data. Deep learning techniques play an important role in medical image processing, computer-aided diagnosis, image interpretation, image fusion, image registration, image segmentation, and image-guided therapy. In addition, it can help physicians as a tool for diagnosis and risk assessment.

The main contribution of the proposed work is a comparison of various deep learning techniques for the classification and staging of COVID-19 lung CT images. Here, an attempt has been made to stage the levels of ground glass opacities in COVID-19 lung images using deep learning techniques. The CT images were preprocessed, and four transfer learning-based pretrained convolutional neural network models, namely, ResNet101, ResNet50, ResNet18, and SqueezeNet, were used for the classification and staging of these CT images with different ground glass opacities. Further, performance parameters such as accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the curve (AUC) were calculated and compared.

2. Literature Review

Ismael and Şengür [1] have worked on detecting COVID-19 lesions from chest X-ray images using various deep learning techniques. The authors stated that by using these deep learning techniques, the accuracy score obtained is high compared to shallow networks. The authors concluded that feature extraction using deep learning approaches is efficient compared to traditional techniques. Barstugan et al. [2] have presented the classification of normal and COVID-19 X-ray images using machine learning techniques. The authors concluded that an accuracy of 99.68% was obtained for two-class classifications. Hall et al. [3] have reported the classification of COVID-19 and non-COVID-19 chest X-ray images using a pretrained convolution neural network. The authors demonstrate a classification accuracy of 91.24% for the two-class classification with the limited dataset. Further, the authors concluded that the performance could be improved by increasing the number of training images. Oh et al. [6] reported feature extraction for COVID-19 chest X-ray images for a limited dataset. The authors developed a patch-based convolution neural network and obtained an accuracy of 88.7%. The authors concluded that the results which were obtained were strongly interrelated with the clinical findings.

Ahuja et al. [7] developed a transfer learning-based automated model for two-class classification of COVID-19 lung CT images into normal and abnormal using deep convolutional neural networks. The authors compared various techniques using performance metrics and proposed that a pretrained ResNet-18 network with modified parameters provides the highest accuracy for two-class classification. Shorten et al. [12] conducted a study on various applications of deep learning techniques for the detection of COVID-19 infection. The authors reviewed how different types of data can be given as input to deep neural networks and how the learning problems are constructed.

Jain et al. [13] compared different deep learning models for identifying COVID-19-infected chest X-rays. The authors concluded that the Xception model achieved higher accuracy than the other models. Sujath et al. [14] developed vector autoregression, multilayer perceptron, and linear regression models to predict the spread of COVID-19 in India. The authors concluded that the multilayer perceptron model has a better prediction rate compared to the other models. Baskaran et al. [15] developed a deep learning-based system to classify normal and pneumonia X-ray images. The authors stated that the model was able to classify the images and that its overall accuracy is higher compared to other techniques.

Elzeki et al. [20] developed a Chest X-ray COVID Network (CXRVN) for the classification of chest X-rays. The authors have compared the developed network with the existing pretrained networks, namely, GoogleNet, ResNet, and AlexNet, and concluded that the performance parameters, such as accuracy of 94.5% obtained using the developed model, show better performance compared to the other networks. Zhang et al. [21] developed a model for anomaly detection on chest X-ray images using deep learning techniques, and the authors stated that the model is efficient in performing a reliable screening of chest X-ray images. Ghaderzadeh and Asadi [22] conducted a review on the application of various deep learning techniques being utilized for radiographic modalities. The authors concluded that parameters such as false positive rates and negative errors are reduced with the application of deep learning techniques.

It is observed that most of the researchers have considered COVID-19 and non-COVID-19 chest X-rays and CT images for classification using deep learning and traditional artificial intelligence techniques. In this work, an attempt has been made to classify normal Lung CT slices and various stages of COVID-19-infected CT slices using deep learning techniques. Also, a comparison of four deep learning techniques based on their performance metrics has been implemented.

3. Methodology

3.1. Image Acquisition

In this work, a total of 1200 lung CT images were considered from the standard COVID-19 database (https://mosmed.ai/datasets/covid191110/). The dataset consists of four classes, namely, normal CT images and three abnormal classes of CT images with ground glass opacification involving less than 25%, 25–50%, and 50–75% of the lung parenchyma. Since the availability of medical images is limited, data augmentation has been carried out, so that 300 images are considered in each class. The dataset is divided into 70% for training and 30% for testing. Table 1 presents the dataset details utilized in the proposed work. The number of input images is the same for each class, and hence there was no class imbalance.
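The 70/30 division can be illustrated with a short sketch. It assumes a per-class (stratified) shuffle-and-split, which is consistent with the balanced classes described above but is not stated explicitly; the file names, seed, and function name are hypothetical.

```python
import random

CLASSES = ("ct0", "ct1", "ct2", "ct3")  # normal, GGO <25%, 25-50%, 50-75%
IMAGES_PER_CLASS = 300
TRAIN_FRACTION = 0.7


def stratified_split(items_by_class, train_fraction=TRAIN_FRACTION, seed=0):
    """Shuffle each class separately and split it, so class balance is preserved.

    ASSUMPTION: the split is done per class; the seed is arbitrary.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_train = round(len(shuffled) * train_fraction)
        train += [(item, label) for item in shuffled[:n_train]]
        test += [(item, label) for item in shuffled[n_train:]]
    return train, test


# Hypothetical file names standing in for the 1200 CT slices.
dataset = {label: [f"{label}_{i:03d}.png" for i in range(IMAGES_PER_CLASS)]
           for label in CLASSES}
train_set, test_set = stratified_split(dataset)
```

With 300 images per class, this yields 210 training and 90 testing images per class, that is, 840 and 360 images in total.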

Figure 1(a) shows typical normal CT images. Figures 1(b)–1(d) show abnormal COVID-19 images with ground glass opacities less than 25%, between 25–50%, and between 50–75%, respectively.

3.2. Preprocessing of the Considered Images

In this work, pretrained transfer learning-based convolutional neural network models are used for the classification and staging of ground glass opacification into different stages of the infection. The input CT images were preprocessed to make them compatible with the considered CNN models. The images were resized to 224 × 224 × 3 for the residual networks and 227 × 227 × 3 for the SqueezeNet model. Further, data augmentation techniques such as shear, random translation, and random rotation were applied to prevent overfitting. Figure 2 shows sample preprocessed and augmented images considered for classification.
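The three augmentation operations can be expressed as a single affine transform, as in the sketch below. It only builds the coordinate transform (shear, then rotation, then translation) and does not resample an image. The parameter ranges and function names are hypothetical, since the ranges used in this work are not stated.

```python
import math
import random


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def affine_matrix(rotation_deg, shear_deg, tx, ty):
    """Homogeneous 3x3 matrix applying shear, then rotation, then translation."""
    theta = math.radians(rotation_deg)
    c, s = math.cos(theta), math.sin(theta)
    rotation = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    shear = [[1.0, math.tan(math.radians(shear_deg)), 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
    translation = [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]
    return matmul3(translation, matmul3(rotation, shear))


def transform_point(matrix, x, y):
    """Map pixel coordinates (x, y) through the affine matrix."""
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


def sample_augmentation(rng, max_rotation_deg=10.0, max_shear_deg=5.0,
                        max_shift_px=10.0):
    """Draw one random augmentation; the default ranges are ASSUMED values."""
    return affine_matrix(rng.uniform(-max_rotation_deg, max_rotation_deg),
                         rng.uniform(-max_shear_deg, max_shear_deg),
                         rng.uniform(-max_shift_px, max_shift_px),
                         rng.uniform(-max_shift_px, max_shift_px))


random_transform = sample_augmentation(random.Random(0))
```

Drawing a new transform for every training image in every epoch means the network rarely sees exactly the same input twice, which is how augmentation counteracts overfitting.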

3.3. Convolutional Neural Network (CNN)

In recent years, CNN techniques have been utilized for the classification of images due to their higher efficiency. The temporal and spatial features present in an image can be easily extracted using the convolution layers and their filters. A convolutional neural network reduces the computational time by sharing weights [23, 24]. In essence, a convolutional neural network is a feed-forward artificial neural network in which the neurons of a given filter share the same weights and are connected to local regions of the image, so that the spatial structure of the features is preserved.

It consists of three major layers, namely, (i) the convolution layer, which is used to learn the features from the image, (ii) the max pooling layer, where the input is downsampled, which reduces the dimensionality of the features and in turn the computational effort, and (iii) the fully connected layer, which performs the classification [24, 25]. Figure 3 shows the overall block diagram of a convolutional neural network.
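The role of the three layer types can be illustrated with a toy forward pass in plain Python. This is a didactic sketch, not the networks used in this work: the image, the kernel, and the fully connected weights are hypothetical, and real models stack many such layers with learned weights.

```python
def conv2d(image, kernel):
    """Slide one shared kernel over the image (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(out_w)]
            for r in range(out_h)]


def fully_connected(features, weights, biases):
    """One dense layer: a weighted sum of all input features per output."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright transitions

feature_map = conv2d(image, edge_kernel)       # 4x4 map, same weights everywhere
pooled = max_pool2d(feature_map)               # 2x2 map, dominant responses only
flat = [value for row in pooled for value in row]
scores = fully_connected(flat, [[1, 0, 0, 0], [0, 1, 0, 1]], [0, 1])
```

The same four kernel weights are reused at every image position, which is the weight sharing that keeps the parameter count and computation low.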

3.4. Deep Transfer Learning Convolutional Neural Network

Since the availability of datasets is limited in the case of medical images, the transfer learning technique plays a major role in classification as well as in feature extraction. The main advantage of using this technique is that the learning process is fast when compared to other conventional techniques.

Further, reducing the number of parameters that need to be trained lowers the time complexity, which is an added advantage of using a transfer learning-based convolutional neural network. Four pretrained networks, namely, ResNet101, ResNet50, ResNet18, and SqueezeNet, were considered. These CNN models were originally trained to classify the ImageNet database, which consists of 1000 categories [7]. Hence, these pretrained networks have to be retrained for the staging of SARS-CoV-2 lesions using lung CT slices. SqueezeNet, the smallest of the networks considered, consists of 68 layers and requires an input image size of 227 × 227 × 3 [26]. The residual networks require an input image size of 224 × 224 × 3. The ResNet18 model considered in this work has 71 layers, and ResNet50 and ResNet101 have 177 and 347 layers, respectively. Further, the initial learning rate and the optimizer are varied to perform classification for the considered dataset.

3.4.1. Classification of COVID-19 Lung CT Images Using Residual Networks and SqueezeNet

The residual network, also known as ResNet, was initially developed to address the vanishing gradient and degradation problems. Three variants that differ in the number of layers are considered here, namely, ResNet101, ResNet50, and ResNet18. Also, pretrained residual networks have been used for medical images. In this work, a comparison based on the performance of the three residual networks and SqueezeNet has been performed for the classification and staging of ground glass opacities in COVID-19 lung CT images. The ResNet50 model performs better than the other networks. Hence, the architecture of ResNet50 is explained. Figure 4 shows the architectural diagram of ResNet50.
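For reference, the general residual formulation can be written as follows. This is the standard form from the original ResNet design rather than anything specific to this work; the symbols x, y, F, and W_i are introduced here for illustration.

```latex
% x: input to the block, y: output of the block
% F: the stacked convolution layers of the block, with weights W_i
\begin{equation}
  y = F(x, \{W_i\}) + x
\end{equation}
```

Because the input x is added back through an identity shortcut, each block only has to learn the residual F, and gradients can flow directly through the shortcut, which is what mitigates the vanishing gradient and degradation problems in very deep networks.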

The input images are preprocessed and resized to 224 × 224 × 3 as required by the ResNet50 model. The preprocessed input CT images are then given to the convolution layer, which creates a feature map of low-level features such as edges, gradients, and color. The high-level features, such as abnormalities and lesions, are obtained from the deeper convolution layers. The dominant features are retained by the pooling layer. The fully connected layer acts as a feed-forward network and receives the output of the pooling layer. The softmax layer produces outputs in the range [0, 1], one per class, which are used to predict the class to which the input belongs. The number of outputs is equal to the number of classes considered. In this work, four classes, namely, non-COVID-19 and COVID-19 CT images with ground glass opacities of <25%, 25–50%, and 50–75%, are considered for classification.
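A minimal sketch of the softmax step is given below. The input scores are hypothetical, and the class labels ct0 to ct3 follow the naming used for the confusion matrix in Section 4.

```python
import math

CLASS_NAMES = ("ct0", "ct1", "ct2", "ct3")  # normal, GGO <25%, 25-50%, 50-75%


def softmax(logits):
    """Convert raw scores to probabilities in [0, 1] that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the most probable class label and the full probability vector."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs


# Hypothetical scores from the fully connected layer for one CT slice.
label, probs = predict([0.5, 2.0, 0.1, -1.0])
```

The predicted class is simply the one with the largest probability; in this hypothetical case that is ct1.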

Figure 5 shows the overall work reported in this paper. A detailed analysis has been made for the selection of hyperparameters for each network considered in this work. Parameters such as the number of epochs, initial learning rate, bias learning factor, and weight learning factors were chosen and varied accordingly to train the networks.

The parameters for which the highest order of accuracy was obtained in each network were chosen as the best parameter set. Further, the performance parameters like accuracy, specificity, sensitivity, positive predictive value, negative predictive value, and F1-score were calculated for the four networks individually, and a comparison has been made among these four networks.

4. Results and Discussion

The work reported in this manuscript is divided into three divisions, (i) preprocessing of the input images, (ii) selection of hyperparameters for classification purposes, and (iii) performance evaluation of the classifiers. Initially, preprocessing of input images like resizing, rotation, and translation is performed based on the input size required by the model. Hence, the considered dataset is resized accordingly. In this work, training and testing of four-class classifications of input CT images were performed using a pretrained model. The training models considered for four-class classification problems are ResNet101, ResNet50, ResNet18, and SqueezeNet. 70% of the data is taken for training the models, and 30% of the data is considered for testing purposes.

The next step was to select the best hyperparameters for training the models. Two optimizers, namely, ADAM and SGDM, are considered for classification. Hyperparameters such as the number of epochs, initial learning rate, weight learning factor, bias learning factor, and minibatch size were taken into account. Various combinations of hyperparameter values were considered, and the accuracy of each model was calculated. Table 2 presents the best-selected hyperparameters for training each model.
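Such a search over hyperparameter combinations can be sketched as an exhaustive grid search. The grid values below are hypothetical apart from the two optimizers, and `toy_evaluate` is a stand-in scoring function, not a real training run and not the results of this work.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Try every combination in the grid and keep the best-scoring one."""
    names = list(grid)
    best_config, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical search grid; the actual values tried are summarized in Table 2.
GRID = {
    "optimizer": ["adam", "sgdm"],
    "epochs": [5, 10, 15],
    "initial_learning_rate": [1e-2, 1e-3, 1e-4],
    "minibatch_size": [16, 32],
}


def toy_evaluate(config):
    """Stand-in for 'train the network and return its accuracy'.

    ASSUMPTION: a made-up score that merely rewards more epochs and a lower
    initial learning rate, so the example runs without any training.
    """
    return config["epochs"] - 1000 * config["initial_learning_rate"]


best_config, best_score = grid_search(toy_evaluate, GRID)
```

In practice, `evaluate` would train one of the four networks with the given configuration and return its accuracy.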

Figure 6(a) shows the variation of accuracy as a function of both the initial learning rate and the number of epochs, as a surface plot, for the case of SqueezeNet. It is observed that the increase in the number of epochs, with a decrease in the initial learning rate, results in higher accuracy.

A training accuracy of 91.2% was obtained for an initial learning rate of 0.0001 and with the number of epochs equal to 15.

Similarly, Figures 6(b)–6(d) show the variation of training accuracy as a function of both the initial learning rate and the number of epochs for the other models, namely, ResNet18, ResNet101, and ResNet50. In all the considered cases, it can be observed that the training accuracy increases with an increase in the number of epochs and a decrease in the initial learning rate. Results demonstrate that the ResNet50 model yields the highest training accuracy of 98.8% and testing accuracy of 96.9% when compared to all the other models considered.

Figure 7 shows the variation of training accuracy and the loss function with respect to the increase in the iterations for the ResNet50 model. It is observed that the training accuracy increases with an increase in the iterations and reaches a relatively steady state after 400 iterations. It is also observed that the value of the loss function decreases with an increase in the iterations and reaches a relative minimum after 400 iterations.

Figure 8(a) presents the confusion matrix for the training data after training the ResNet50 model. It is seen that this model is able to classify the majority of the input images into their correct classes with a training accuracy of 98.8%, sensitivity of 98.8%, specificity of 99.6%, PPV of 99.6%, NPV of 99.2%, and F1-score of 98.7%. Similarly, Figure 8(b) presents the confusion matrix for the images in the testing dataset classified using the ResNet50 model. A testing accuracy of 96.9% is obtained, which indicates that the model is able to distinguish normal images from the different classes of abnormal images.

The efficiency of any machine learning model can be determined from its true positives, true negatives, false positives, and false negatives. A confusion matrix provides an overview of these four quantities for the classes considered [24]. A true positive (TP) is the case where both the actual class and the predicted class are positive. A true negative (TN) is the case where both the actual class and the predicted class are negative. A false positive (FP) is the case where the actual class is negative but the predicted class is positive. A false negative (FN) is the case where the actual class is positive but the predicted class is negative.
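For a four-class problem, these quantities are obtained per class by treating that class as positive and all other classes as negative. The sketch below shows this one-vs-rest reading of a confusion matrix; the matrix values are made up for illustration and are not the results reported in Figure 8.

```python
def one_vs_rest_counts(confusion, class_index):
    """Derive TP, FN, FP, and TN for one class from a multiclass confusion matrix.

    confusion[i][j] is the number of images whose actual class is i and whose
    predicted class is j.
    """
    n = len(confusion)
    tp = confusion[class_index][class_index]
    fn = sum(confusion[class_index]) - tp                       # rest of the row
    fp = sum(confusion[r][class_index] for r in range(n)) - tp  # rest of the column
    total = sum(sum(row) for row in confusion)
    tn = total - tp - fn - fp
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Hypothetical 4x4 confusion matrix (rows: actual ct0..ct3, columns: predicted).
confusion = [
    [10, 0, 0, 0],
    [1, 8, 1, 0],
    [0, 1, 9, 0],
    [0, 0, 1, 9],
]
ct1_counts = one_vs_rest_counts(confusion, 1)
```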

In this work, the categories ct0, ct1, ct2, and ct3 correspond to normal lung CT images, CT with GGO <25%, CT with GGO 25–50%, and CT with GGO 50–75%, respectively. A total of 840 images were utilized for training the model, with 210 images in each category. For the normal images labeled ct0, the true positive rate is 100%, the false negative rate is 0%, the false positive rate is 0%, and the true negative rate is 100%. Similarly, for the images labeled ct1 (CT with GGO <25%), the true positive rate is 97.1%, the false negative rate is 1.4%, the false positive rate is 2.9%, and the true negative rate is 98.6%. The TP, TN, FN, and FP values are thus obtained from the confusion matrix for each category.

Based on these parameters, performance metrics such as accuracy, specificity, sensitivity, positive predictive value, negative predictive value, and F1-score are commonly calculated, which determines how well the model is performing [27, 28]. The following equations are utilized to calculate the performance of the model.
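In terms of TP, TN, FP, and FN, the standard definitions of these metrics are as follows. They are stated here in their usual textbook form, on the assumption that this is the form used in this work.

```latex
\begin{align}
  \text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \\
  \text{Sensitivity} &= \frac{TP}{TP + FN} \\
  \text{Specificity} &= \frac{TN}{TN + FP} \\
  \text{PPV}         &= \frac{TP}{TP + FP} \\
  \text{NPV}         &= \frac{TN}{TN + FN} \\
  \text{F1-score}    &= \frac{2 \cdot \text{PPV} \cdot \text{Sensitivity}}{\text{PPV} + \text{Sensitivity}}
                      = \frac{2\,TP}{2\,TP + FP + FN}
\end{align}
```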

The receiver operator characteristics (ROC) curve is a graphical representation of the true positive rate (sensitivity) on the y-axis and False Positive Rate (1-Specificity) on the x-axis for different cut-off values [29]. When the area under the curve value becomes maximum, i.e., AUC = 1, it means that the test which is performed by the model to differentiate between the normal and the abnormal input images is efficient [30, 31].
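The construction of the ROC curve and its AUC can be sketched for a single binary (one class versus the rest) problem as follows. The labels and scores are hypothetical, and applying this per class to the four-class problem is an assumption about how the curves were obtained.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the cut-off step by step.

    labels: 1 for the positive class, 0 for the negative class.
    scores: predicted probability of the positive class.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        cutoff = ranked[i][0]
        # Samples with tied scores cross the cut-off together.
        while i < len(ranked) and ranked[i][0] == cutoff:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for six images; one negative outranks one positive.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = area_under_curve(roc_points(labels, scores))
```

Here eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9; a classifier that ranks every positive above every negative gives an AUC of 1.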

Figures 9(a) and 9(b) depict the receiver operating characteristic (ROC) curves. The area under the curve (AUC) is an efficient combined measure of specificity and sensitivity for the evaluation of the classification model. The AUC value for the ResNet50 model is 0.998 for the training dataset and 0.966 for the testing dataset, which implies that this model is well trained and can differentiate the normal images and the different classes of abnormal input images in the training set as well as the testing set.

Performance metrics such as accuracy, sensitivity, specificity, F1-score, negative predictive value, and positive predictive value were calculated for each model. Tables 3 and 4 present the performance metrics of the models considered for classification using the training dataset and testing dataset, respectively.

Figures 10(a) and 10(b) show the performance metric plots for all the pretrained models considered for classification. It is observed from the plots that the ResNet50 model gives the highest performance metrics when compared to the other networks for the training data.

Also, Figures 11(a) and 11(b) present the performance metric plots for the testing dataset for the pretrained models considered for classification. It is inferred from the plots that, for the testing dataset considered, the ResNet50 model gives higher performance metrics compared to the other networks.

Table 5 presents a summary of various existing COVID-19 detection and classification techniques using deep learning algorithms. It is observed from the table that machine learning techniques combined with medical images such as radiographs act as a better diagnostic tool for the early detection of human disorders. From this brief analysis of COVID-19 diagnosis research, it can be inferred that there is insufficient chest radiographic data available for COVID-19 detection and that deep learning techniques are useful for COVID-19 detection. The standard approach to COVID-19 detection, which uses reverse transcription polymerase chain reaction (RT-PCR) kits, has some limitations, including a long cycle time and a limited sensitivity of 89%. Most researchers have considered chest X-ray imaging and chest CT scans as diagnostic tools for the early detection of COVID-19 in patients.

In contrast to chest X-ray images, CT scans are more effective for detecting COVID-19 lesions because they can give a full 3D perspective of the organ, enabling a more accurate diagnosis of the nature of the abnormality. Since data availability is limited, transfer learning in conjunction with data augmentation makes it possible to diagnose abnormalities from a limited dataset of chest radiographs of COVID-19 patients. Also, it is observed from the literature that most researchers have considered two categories, namely, COVID-19 and normal CT scans and chest X-rays, for classification using deep learning techniques. Table 5 shows a brief analysis of the work of researchers using different deep learning techniques for chest X-rays and CT images. Results demonstrate that ResNet50 shows a better classification accuracy when compared to the other models considered for classification. In addition, it is observed that deep learning techniques provide an efficient staging of SARS-CoV-2 lesions for the considered dataset.

5. Conclusion

SARS-type respiratory infections have been among the most prevalent health conditions affecting the upper and lower respiratory organs in recent years [38]. Among these respiratory disorders, COVID-19 has affected most of the world's population. A critical step in the fight against COVID-19 is to provide effective screening and to isolate infected people to reduce its spread. Though the RT-PCR test is considered a better method of screening, it has the disadvantage that one has to wait long hours to obtain the result. Therefore, medical imaging such as X-ray and CT scans plays a major role in the early detection of COVID-19 infections. CT scans are preferred among these imaging modalities as they show a 3-dimensional view of the organs. However, due to the increase in the caseload, it is difficult for radiologists to perform mass screening and staging of infections. Hence, computer-assisted staging of SARS-CoV-2 infections is required. In recent years, deep learning techniques have played a major role in medical imaging technology, and various deep learning techniques have been utilized for the classification of medical images.

This work compares different deep learning techniques to classify normal and different stages of abnormal SARS-CoV-2 lung CT images. The overall work is divided into three phases. In the first phase, the images are resized based on the size required by the pretrained model, and data augmentation techniques such as rotation, shear, and translation are applied to increase the size of the dataset. In the second phase, four-class classification is performed using pretrained transfer learning models, namely, ResNet101, ResNet50, ResNet18, and SqueezeNet. Here, the hyperparameters are varied, and the values for which the maximum training accuracy is obtained are chosen. The third phase is to calculate the performance metrics for each pretrained model. Among these four pretrained models, ResNet50 shows better results than the others. This model yields a testing accuracy of 96.9%, a sensitivity of 96.6%, a specificity of 98.9%, and an AUC of 0.998 on the training data (0.966 on the testing data). It is observed that the ResNet50 model performs well for the classification of normal images and abnormal images with various levels of ground glass opacity in COVID-19 lung CT images. In the future, this model can be considered for automated clinical examination for COVID-19 detection and staging using lung CT images.

Data Availability

The standard open-source COVID-19 database (https://mosmed.ai/datasets/covid191110/) was utilized for this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.