Abstract

The ongoing coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has had severe ramifications for the global healthcare system, principally because the virus is easily transmitted and survives for extended periods on contaminated surfaces. With the advances in computer-aided diagnosis and artificial intelligence, this paper presents the application of deep learning and adversarial networks for the automatic identification of COVID-19 pneumonia in computed tomography (CT) scans of the lungs. The complexity and turnaround time of the reverse transcription-polymerase chain reaction (RT-PCR) swab test make it disadvantageous to depend on it alone as COVID-19's central diagnostic mechanism. Since CT imaging systems are low cost and widely available, we demonstrate that this drawback of RT-PCR can be alleviated with a faster, automated, and reduced-contact diagnostic process via the use of a neural network model for the classification of infected and noninfected CT scans. In our proposed model, we explore the benefit of transfer learning as a means of resolving the problem of inadequate data and the importance of a semisupervised generative adversarial network for the extraction of well-mapped features and the generation of image data. Our experimental evaluation indicates that the proposed semisupervised model achieves reliable classification, taking advantage of the reflective loss distance between the real data sample space and the generated data.

1. Introduction

Computer-aided diagnosis (CAD) has become an integral part of radiology and clinical diagnosis over the past decades, following the emergence of sophisticated imaging techniques such as X-ray, ultrasound, and MRI. These modalities make valuable information available to experts, but it requires professional evaluation and analysis for the detection of abnormalities and the classification of pathological traits [1].

Traditionally, the process of medical diagnosis requires manual observation based on domain knowledge; however, owing to technological advancement, newer computer-aided approaches have been incorporated, with artificial intelligence and computer vision techniques significantly gaining ground [3]. Today, the need to integrate computer algorithms and models with medical diagnosis is made all the more pressing by the global pandemic caused by the novel coronavirus SARS-CoV-2, the agent of coronavirus disease 2019 (COVID-19), as named by the World Health Organization (WHO) [4]. Since December 2019, when the virus was first identified, there have been a total of 75,110,651 confirmed cases, including around 1,680,395 confirmed deaths, as reported by the WHO as of 21 December 2020 [5].

The disease is easily transmitted from patients to healthcare workers, and supplies such as personal protective equipment (PPE) have been scarce because mandatory lockdowns forced the closure of most manufacturing factories; it is therefore imperative to find an effective way of handling the virus with minimal human interaction and minimal spread of the pathogen. COVID-19 is primarily a disease of the lungs caused by the severe acute respiratory syndrome coronavirus, and its symptoms include excessive cough, fever, and shortness of breath, leading to pneumonia in patients [6]. As such, computed tomography (CT) imaging has been used extensively in many countries to detect the progression, infectious lesions, and severity of pulmonary pneumonia in patients' lungs. As shown in Figure 1, at the preliminary stages of COVID-19 pneumonia, chest CT images present small, subpleural, and peripheral ground-glass opacities (GGO), which are rather daunting to detect [7]. Because of the complex appearance of the lesions and the laborious manual work involved in disease detection, more focus has shifted to automatic alternatives for analyzing lung CT scans. In addition, visual fatigue increases the risk that experts miss necessary diagnostic details of tiny lesions.

In the past, artificial neural networks (ANNs) have achieved significant success in medical imaging, such as tumor detection and cancer screening [8]. Also, with the application of computer vision techniques such as feature classification, localization, and segmentation, radiologists are increasingly assisted by computer-aided diagnosis. Notwithstanding the ability of deep learning algorithms, most of these implementations require massive amounts of data, which are often unavailable. Moreover, in the more successful supervised deep learning framework, where the model is trained to map features to a label, it is challenging to obtain sufficiently many annotated or labeled training samples [9], especially for novel diseases such as COVID-19. For this reason, in this research, we investigate the possibility of automatic detection of COVID-19 by applying multiple deep learning techniques to available volumes of lung CT scans.

First, we design a custom VGG16 convolutional neural network model for the identification of COVID-19-infected CT scans. We also exploit transfer learning for virus classification, taking advantage of pretrained models as feature extractors and then fine-tuning the model on the new data. This initiative helps to reduce training time and computational cost and to mitigate the data shortage [10]. Using the logic of the generative adversarial network [11], we pit the discriminator model against the generator model, such that the generator learns to create features similar enough to the real data distribution to deceive the discriminator. Over time, the generator learns the underlying patterns in the input data so that its outputs are as similar as possible to the original CT scans. By updating both models dynamically, we frame the generative model in a supervised learning structure that is capable of learning internal representations of data, albeit in a semisupervised manner [12]. Since GANs are efficient at learning the density distributions of their inputs, they provide a creative way to learn the features of complex image structures such as CT scans. Extending the proficiency of the GAN to a semisupervised framework resolves the problem of limited data that is particularly acute with COVID-19. This is accomplished by training on a subset of labeled COVID-19 and normal images together with unlabeled images, which allows the model to generalize accurately to unseen data. To achieve this goal, unlike the traditional GAN, we train a generator, a supervised discriminator, and an unsupervised discriminator simultaneously.

2.1. CT Imagery and COVID-19 Latent Representation

SARS-CoV-2, the virus that causes COVID-19, belongs to the coronavirus family, whose genetic material is encoded in ribonucleic acid (RNA). It is a positive-sense single-stranded RNA virus that can be detected in a couple of ways, including detection of the viral RNA or of antibodies produced by the patient's immune system [13]. As a result, diagnosis of the disease can be accomplished by reverse transcription-polymerase chain reaction (RT-PCR) [14]. However, the process of sample collection is quite complicated, and RT-PCR detection suffers from low sensitivity at the initial stage. On the other hand, pneumonia inflames the lungs' air sacs, making it observable in a chest CT scan. As such, common COVID-19 manifestations such as ground-glass opacity (GGO) become observable at the early stages. Pulmonary consolidation also becomes detectable, especially at the later stages, and CT scans can produce a 3D view of the lungs [2]. This makes CT imaging a dominant modality for COVID-19 diagnosis. The longitudinal changes and the relationships observed between multiple types of CT slices can then be carefully evaluated to provide essential information for COVID-19 diagnosis.

Although chest X-radiation (X-ray) and CT scans are the most commonly used modalities for pneumonia detection, CT scans are preferred because they provide cross-sectional images useful for 3D image reconstruction, whereas X-rays produce flattened 2D images. Furthermore, the denotative traits of COVID-19 are more effectively visible in a 3D view; therefore, CT scans have been widely accepted as the screening tool for the disease [15]. Despite these advantages, detection of COVID-19 infection from CT volumes remains a considerable challenge, as there is large variation in the size, positioning, and texture of the CT data. The nature of the data, such as small interclass variance, blurry edges, and low-contrast boundaries, means that an intrinsic understanding of the fundamental latent representation must be ascertained to avoid false-negative detections [16].

2.2. COVID-19 AI-Based CAD Systems

Because of their high efficiency, deep learning models have been utilized for radiological image analysis and defect detection in patients. Over time, such systems have been incorporated as assistive tools for medical exploratory and decision-making analysis. For COVID-19 as well, several systems have been proposed for virus detection based on chest radiography images. A COVID-Net model consisting of a deep convolutional neural network was designed to learn the characteristic abnormalities of COVID-19 from large samples of patients' radiography images [17]. Emphasizing the low detection rate of RT-PCR during the early stages of COVID-19, an early screening model was designed to distinguish COVID-19 pneumonia from other viral pneumonia, using a location-attention classifier on 3-dimensional image sets [18]. Also, a 4-stage infection extraction scheme was proposed by Rajinikanth et al. [19]; the scheme comprises an artefact elimination filter, Otsu-threshold image enhancement, infected-region segmentation, and region-of-interest (ROI) binary extraction.

Using the Inception convolutional neural network architecture, CT images containing manually labeled pneumonia characteristic signs as regions of interest (ROIs) were trained for COVID-19 classification [20]; the model's experimental analysis indicates results comparable with expert radiologists' analysis. To solve the problem of low-intensity contrast and high variation in CT slices, an infection segmentation deep network consisting of a parallel partial decoder was designed [21]; it generates a global map using an aggregate of the model's high-level features. For data generation, a COVID-19 CT scan simulator was trained on an automatic segmentation model, producing 2D images obtained from the decomposition of 3D images [22]. Multiple features from different views were extracted from CT slices of infected patients and then encoded using a unified latent representation, which helps to learn every feature of each class via a backward neural network model [23].

2.3. Weakly Supervised Lesion Classification

As a step towards subduing the ongoing pandemic, rapid and accurate diagnosis of COVID-19 cases is crucial in medical procedures [24]. Remarkably, artificial intelligence has been employed as a means of automatically analyzing medical data, including identification, classification, localization, and segmentation, mostly in a supervised context. Unfortunately, the supervised technique requires a massive amount of data and annotation, which is presently unavailable. To mitigate the problem of insufficient data, transfer learning has been adopted to leverage other, preexisting datasets [25]; using previously trained weights, COVID-19 models can benefit from increased accuracy. Also, semisupervised models have been explored to make the most of the small number of labeled positive COVID-19 samples relative to the larger number of normal samples [26]. To achieve this, the generative adversarial network (GAN) architecture has been most effective: the unlabeled datasets are used to train the image generator, while the discriminator distinguishes the sample distributions [27].

The semisupervised GAN (SGAN) simultaneously trains a generator together with a supervised and an unsupervised discriminator, resulting in a supervised classifier that is capable of generalizing well to data samples that are yet unseen. As a way of dealing with the limited-data problem, a weakly supervised framework pretrained on a UNet model was applied for COVID-19 infection classification and lesion localization [28]; the method uses a combination of activation regions to connect infectious components with high probability. A novel self-supervised model extracts features from COVID-19-positive and -negative samples, while the feature distances are learned and trained by a neural network [29]. Similarly, two 3D-ResNets with a prior-attention mechanism, trained as a dual discriminative classifier, efficiently predict the pneumonia probability of a CT scan [30].

3. Methods

3.1. COVID-19-Net

To achieve accurate classification of COVID-19 pneumonia lesions in CT scans, we experiment with a 2D convolutional neural network, taking advantage of its invariance property [31]. With the CNN's hierarchical connectivity pattern of feature extraction [32], each layer of the model is able to assemble increasingly complex patterns based on its receptive field [33]. Inspired by the VGG CNN framework, we implemented our COVID-19-Net using the VGG16 architecture [34]. Following the VGG framework, the model takes an input image of size 224 ∗ 224 with 3 channels. Each of the convolution layers has a kernel size of 3 ∗ 3, a stride of 1, and same padding, such that the output dimension matches the input dimension. Each convolutional block is then followed by a max-pooling layer with a 2 ∗ 2-pixel window and a stride of 2. For proper classification efficiency, the rectified linear unit (ReLU) [35] is also introduced to induce nonlinearity:

$$y^{(l)} = f\left(W^{(l)} \ast x^{(l-1)} + b^{(l)}\right),$$

where $W^{(l)}$ and $b^{(l)}$ are learnable parameters, $x$ represents the input image, $l$ is the network layer, and $f$ is the activation function. The configuration of the model follows two blocks of two convolutional layers, each block followed by a pooling layer, and then three blocks of three convolutional layers, each likewise followed by a pooling layer. The model is concluded with three dense layers, the last of which is the output layer. The output layer has two neurons representing the two classes of the dataset. Since this is a binary classification, the binary cross-entropy function is used to compute the model's probabilistic loss.
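
For concreteness, the following is a minimal Keras sketch of a VGG16-style network matching the description above. The per-block filter widths (64 to 512) and the sizes of the first two dense layers follow the standard VGG16 configuration and are assumptions; the text above only fixes the kernel size, padding, pooling, and the two-neuron output layer.

```python
# Minimal sketch of a VGG16-style COVID-19-Net (filter/dense widths are assumptions).
from tensorflow.keras import layers, models

def build_covid19_net(input_shape=(224, 224, 3)):
    model = models.Sequential()
    model.add(layers.InputLayer(input_shape=input_shape))
    # Five convolutional blocks (2, 2, 3, 3, 3 conv layers), each ending in max pooling.
    for n_convs, filters in [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]:
        for _ in range(n_convs):
            model.add(layers.Conv2D(filters, (3, 3), strides=1,
                                    padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(pool_size=(2, 2), strides=2))
    # Three dense layers; the final two-neuron layer gives the binary class scores.
    model.add(layers.Flatten())
    model.add(layers.Dense(4096, activation="relu"))
    model.add(layers.Dense(4096, activation="relu"))
    model.add(layers.Dense(2, activation="softmax"))
    return model
```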

3.2. Transfer-Net

Using the transfer learning technique, we construct several pretrained convolutional models as feature extractors for COVID-19 classification. We apply five different models pretrained on the ImageNet dataset, namely, the DenseNet121 [36], InceptionNetV3 [37], MobileNet [38], ResNet50 [39], and VGG16 [34] models. This allows us to save training time and to achieve better performance with less available data. The comparison of these different models is made to investigate the effect of model size and depth on data generalization while avoiding overfitting and underfitting. As shown in Figure 2, the DenseNet121 model consists of 121 trainable layers, excluding the batch normalization layers. To overcome the problem of vanishing gradients in large convolutional networks, DenseNet121, in particular, combines three architectural schemes, namely, highway networks, residual networks, and fractal networks, in its design. This helps the model to simplify pattern connectivity through maximum information flow and representational feature reuse.

The model takes advantage of CNN transfer learning using pretrained models. In our transfer learning models, the last layers of the pretrained networks are removed and new layers are appended. This way, the initial network serves as a feature extractor, which is then further fine-tuned to the new data. Specifically, the pretrained models' classifier layers are removed and replaced with three fully connected layers having 512, 256, and 2 neurons, respectively. In this way, we achieve feature extraction by discovering the best representation for our dataset through the representational learning embedded in the models' pretrained weights. The newly added layers then fine-tune the modified model to our dataset, identifying the combination of features that are essential for the new training samples while reducing computational complexity and time.
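
As an illustration of this head replacement, the hedged Keras sketch below uses DenseNet121 as the backbone. The 512/256/2 dense head follows the text; freezing the backbone as a pure feature extractor and the use of global average pooling are assumptions about the implementation.

```python
# Hedged sketch of a Transfer-Net head replacement with a DenseNet121 backbone.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import DenseNet121

base = DenseNet121(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # use the pretrained network purely as a feature extractor

transfer_net = models.Sequential([
    base,
    layers.Dense(512, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(2, activation="softmax"),  # COVID-19 infected vs. normal
])
```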

Let the source and target domains be $\mathcal{D}_S$ and $\mathcal{D}_T$, respectively, with the source domain $\mathcal{D}_S$ having label space $\mathcal{Y}_S$ and feature space $\mathcal{X}_S$, where $X_S = \{x_1, \ldots, x_n\} \in \mathcal{X}_S$ and the marginal probability distribution is represented as $P(X_S)$, so that $\mathcal{D}_S = \{\mathcal{X}_S, P(X_S)\}$. The conditional probability distribution of the source domain is then given as $P(Y_S \mid X_S)$. Therefore, the objective of transfer learning is to learn the target conditional probability distribution $P(Y_T \mid X_T)$ in $\mathcal{D}_T$ from the information gained from the source domain $\mathcal{D}_S$.

3.3. Generative Adversarial Network (GAN)

The generative adversarial model is framed in a supervised learning setting with a deep convolutional generator and discriminator [40]. As displayed in Figure 3, the generator learns to generate probability maps comparable to the distribution of the real data samples from a random sample space. In the same manner, the discriminator learns to distinguish whether its input comes from the generated samples or from the real data distribution. Over time, the generated sample distribution becomes so close to the real data that the discriminator cannot easily differentiate the two. Typically, the generator samples noise from a uniform distribution as input and then upsamples it through learned 2-dimensional transposed convolutional layers. The target of the generator is to upscale its input to the same dimension as the discriminator's input, in our case 224 ∗ 224 ∗ 3, since the input images of the discriminator are in RGB format. Each of the transposed convolutional layers is appended with both a batch normalization and a ReLU activation layer. The generator loss function is given as

$$\mathcal{L}_G = \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr],$$

where $D$ is the discriminator model, $G$ is the generator model, and $p_z(z)$ is the generator input noise distribution. Additionally, the discriminator, which first trains on the real data distribution, learns the mapping of the underlying data features to their corresponding classes, enabling it to identify the generated probability maps from the generator, especially at the initial stages. The discriminator's output is thus a scalar probability of the input being a fake or real image. Generally, the discriminator loss and the combined adversarial loss are given as

$$\mathcal{L}_D = -\mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] - \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr],$$

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr],$$

where $D$ is the discriminator model, $G$ is the generator model, $p_z(z)$ is the generator input noise distribution, and $p_{\text{data}}(x)$ is the real data distribution.
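
The PyTorch fragment below sketches how these losses translate into a single training step with the binary cross-entropy criterion. The networks `netG` and `netD` are placeholders (with `netD` assumed to end in a sigmoid and return a probability per sample), and the generator update uses the commonly applied non-saturating variant, maximizing log D(G(z)) rather than minimizing log(1 − D(G(z))); this is a sketch of the objective, not necessarily our exact implementation.

```python
# Hedged sketch of one adversarial step; netG and netD are placeholder networks.
import torch
import torch.nn as nn

bce = nn.BCELoss()

def gan_step(netG, netD, real_images, noise_dim=100, device="cpu"):
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1, device=device)
    fake_labels = torch.zeros(batch, 1, device=device)

    # Discriminator loss: real samples should score 1, generated samples 0.
    noise = torch.randn(batch, noise_dim, 1, 1, device=device)
    fake_images = netG(noise)
    d_loss = bce(netD(real_images), real_labels) + \
             bce(netD(fake_images.detach()), fake_labels)

    # Generator loss (non-saturating form): push the discriminator to score fakes as real.
    g_loss = bce(netD(fake_images), real_labels)
    return d_loss, g_loss
```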

3.4. Semisupervised COVID-Net

The semisupervised model is mainly advantageous in cases where labeled data are scarce but unlabeled data are abundant. Both the labeled and unlabeled data are exploited by training them with supervised and unsupervised discriminators, respectively. Extending the GAN model, we propose a semisupervised GAN (SSGAN) to take advantage of the GAN's ability to learn data features. This includes training the model on both a labeled dataset and a larger unlabeled dataset. The SSGAN model simultaneously trains a generator model, an unsupervised discriminator model, and a supervised discriminator that predicts classes, which allows it to generalize properly to unseen data. As displayed in Figure 4, the supervised discriminator is trained to predict the classes of the data plus an additional placeholder class using the softmax activation function and is optimized with the categorical cross-entropy loss function. Correspondingly, the unsupervised discriminator model is trained to predict whether an input image is fake or real with a sigmoid activation function and is optimized with a binary cross-entropy loss function.

For our model, the generator has five transposed convolution layers, which upsample the input noise to a 224 ∗ 224 ∗ 3 dimension for the discriminator. The discriminator is built on the ResNet18 architecture [39]. For the supervised discriminator, the last layer is modified to a binary classifier. The unsupervised discriminator is defined over the layer preceding the supervised discriminator's classifier, and since that layer represents neuron activations, it is normalized to a value between 0 and 1 over the output classes using the customized lambda function defined by Salimans et al. [41]. This allows an efficient implementation that reuses the output nodes of the supervised discriminator in the unsupervised discriminator. The unsupervised output is computed from the normalized sum of the exponentiated class logits as

$$D(x) = \frac{Z(x)}{Z(x) + 1}, \quad Z(x) = \sum_{k=1}^{K} \exp\bigl[l_k(x)\bigr],$$

so that the unsupervised loss takes the form

$$\mathcal{L}_{\text{unsup}} = -\mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] - \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr],$$

where $z$ is the generator's vector of input noise, $p_{\text{data}}(x)$ is the real data distribution, and $l(x)$ is the logit vector over the $K$ classes.
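
A compact PyTorch rendering of this normalization, assuming the supervised discriminator exposes its class logits, is shown below; it exploits the identity sigmoid(log Z(x)) = Z(x) / (Z(x) + 1), which is numerically more stable than exponentiating the logits directly.

```python
# Hedged sketch of the Salimans et al. activation used for the unsupervised output.
import torch

def unsupervised_output(logits: torch.Tensor) -> torch.Tensor:
    # logsumexp gives log Z(x); sigmoid(log Z(x)) equals Z(x) / (Z(x) + 1).
    return torch.sigmoid(torch.logsumexp(logits, dim=1, keepdim=True))
```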

4. Experimental Analysis

4.1. Datasets

A detailed investigation of the clinical and paraclinical features of COVID-19 was reported by Huang [42], who described abnormalities with bilateral involvement in patients' CT images. As a result, CT scans have often been used for the classification of infected patients. For this work, we gathered 1400 images comprising 700 COVID-19-infected and 700 normal CT scans. The data were collected from multiple sources, including the COVID-19 Open Research Dataset Challenge (CORD-19) by the Allen Institute for AI [43], the COVID-19_Dataset from the University of Montreal [44], and the COVID-Chestxray-Dataset developed from various websites and publications [45].

The images are preprocessed by first transforming them into tensors and then normalizing the pixel values to the range 0 to 1. The image dimension is fixed at 224 ∗ 224 ∗ 3, representing the height, width, and color channels of the images. For training purposes, the data are labeled binarily, with one representing COVID-19-infected images and zero representing normal images.
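
A minimal sketch of this preprocessing with torchvision transforms is given below; the exact library used for each pipeline is not specified above, so this is illustrative rather than the precise code.

```python
# Illustrative preprocessing: resize, convert to tensor, scale pixels to [0, 1].
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),   # fix height and width at 224 x 224
    transforms.ToTensor(),           # convert to a tensor with pixel values in [0, 1]
])

# Binary labels: 1 = COVID-19 infected, 0 = normal.
label_map = {"covid": 1, "normal": 0}
```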

4.2. Evaluation Metrics

To analyze the efficiency of our models, we evaluate and compare our predictions using the accuracy, sensitivity, and specificity metrics [46]. These metrics are built on the correctness of the model's predictions compared with the true labels. A true-positive (TP) prediction represents a COVID-19 lung image correctly predicted as infected, while a false-negative (FN) prediction wrongly classifies a COVID-19-infected image as not infected or normal. Correspondingly, a true-negative (TN) prediction refers to a healthy person's CT scan rightly classified as non-COVID, while a false-positive (FP) prediction represents the misprediction of a healthy scan as COVID-19 infected. With this, the ability of the model to differentiate classes correctly (accuracy), the proportion of true positives in the infected class (sensitivity), and the proportion of true negatives among the healthy class (specificity) are represented as [47]

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \text{Sensitivity} = \frac{TP}{TP + FN}, \quad \text{Specificity} = \frac{TN}{TN + FP}.$$
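
As a worked example, the small helper below computes the three metrics from prediction counts; the printed values reproduce the SSGAN test counts reported in Section 4.4.3 (48 of 50 infected and 46 of 50 healthy scans predicted correctly).

```python
# Helper functions (not from the paper) for the three evaluation metrics.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(tp, fn):
    return tp / (tp + fn)

def specificity(tn, fp):
    return tn / (tn + fp)

# SSGAN test counts from Section 4.4.3: TP=48, FN=2, TN=46, FP=4.
print(sensitivity(48, 2), specificity(46, 4), accuracy(48, 46, 4, 2))  # 0.96 0.92 0.94
```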

4.3. Training Details

The COVID-19-Net and Transfer-Net were both trained using the Keras 2.3.0 API on a Windows machine with an NVIDIA 1080Ti GPU, while the semisupervised generative adversarial model was trained with the PyTorch 1.7.0 API on an Ubuntu operating system with an NVIDIA 1080Ti GPU.

The COVID-19-Net follows the VGG16 CNN architecture with 13 CNN layers and three dense layers. The CNN layers all have filter sizes of 3 ∗ 3 with same padding, so that the spatial dimension of the input is preserved after convolution. The convolution layers are compartmentalized into five blocks, each followed by a max-pooling layer. The pooling layers have a filter size of 2 ∗ 2 and a stride of 2 ∗ 2, which halves the dimension of the input shape. The final dense layer has two neurons representing the binary classes of the data. To help overcome the vanishing gradient problem, each layer is followed by a rectified linear unit (ReLU) activation function, which returns only nonnegative values. The loss of the model is computed using the binary cross-entropy equation and minimized using the Adam optimizer with a learning rate of 0.0001. The model was trained for ten epochs with a batch size of 32. Figure 5 displays the accuracy and loss curves of the model as it converges.
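
The corresponding compile-and-fit step might look like the hedged Keras sketch below, reusing the architecture sketch from Section 3.1; `train_images`, `train_labels`, the validation split, and the one-hot label encoding are placeholders and assumptions rather than the exact training script.

```python
# Hedged sketch of the COVID-19-Net training configuration described above.
# Assumes build_covid19_net from the Section 3.1 sketch is in scope and that
# labels are one-hot encoded over the two classes.
from tensorflow.keras.optimizers import Adam

model = build_covid19_net()
model.compile(optimizer=Adam(learning_rate=1e-4),
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(train_images, train_labels, epochs=10, batch_size=32,
          validation_data=(test_images, test_labels))
```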

The Transfer-Net models consist of five pretrained models, namely, InceptionNetV3, DenseNet121, MobileNet, VGG16, and ResNet50. For each of the models, the last classifier layers are removed and replaced with new dense ANN layers: the original outputs are first flattened, followed by a dropout layer, then a dense layer of 256 neurons, and a final classifier layer with 2 outputs. The pretrained models were used as feature extractors during training, such that only the weights of the appended layers were updated to fine-tune the model to the new data. The loss of the models is also computed using the binary cross-entropy equation and minimized using the RMSProp optimizer with a learning rate of 2e − 5. The models are all trained for three epochs with a batch size of 32.

The semisupervised model follows the adversarial learning framework of the generative adversarial network (GAN). It consists of a generator and supervised and unsupervised discriminators. During training, a small portion of the dataset is set aside to be trained in a supervised manner with the supervised discriminator, and these images are labeled appropriately. In our work, 100 images of each of the two classes are subsampled, totaling 200. Another 100 images of each class are also set aside as a test set, totaling another 200. The remaining 1000 images form the unlabeled set. The generator acts as in a traditional GAN by creating fake image samples from a random noise vector. The discriminator, on the other hand, is split into two: a supervised one that trains on the labeled dataset and an unsupervised one that trains on the unlabeled data. The discriminator is trained on three different data types, including the generator's fake images.

The generator receives a noise vector of length 100 as input. The input is then upscaled using five transpose convolution layers to the 224 ∗ 224 ∗ 3 shape of the discriminator input. The first transpose convolution layer has a filter size of 7, a stride of 1, and zero padding, resulting in a dimension of 7 ∗ 7. The second and third layers both have a filter, stride, and padding of 4, 2, and 1, resulting in output shapes of 14 ∗ 14 and 28 ∗ 28, respectively. The fourth layer results in an output shape of 112 ∗ 112 given a filter size of 4, a stride of 4, and zero padding. The last layer then results in an output shape of 224 ∗ 224 ∗ 3 with the filter set to 4, the stride to 2, and the padding to 1, where 3 is the number of filters in the final layer.
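
The layer arithmetic above maps onto the hedged PyTorch sketch below. The kernel, stride, and padding values and the resulting spatial sizes follow the text; the intermediate channel widths (512 down to 64), the batch-norm/ReLU placement after each upsampling layer (as in Section 3.3), and the final sigmoid chosen to match the [0, 1] pixel normalization of Section 4.1 are assumptions.

```python
# Hedged sketch of the SSGAN generator; channel widths and final activation are assumptions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, noise_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            # (noise_dim, 1, 1) -> (512, 7, 7): kernel 7, stride 1, padding 0
            nn.ConvTranspose2d(noise_dim, 512, 7, 1, 0), nn.BatchNorm2d(512), nn.ReLU(True),
            # -> (256, 14, 14): kernel 4, stride 2, padding 1
            nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.ReLU(True),
            # -> (128, 28, 28): kernel 4, stride 2, padding 1
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
            # -> (64, 112, 112): kernel 4, stride 4, padding 0
            nn.ConvTranspose2d(128, 64, 4, 4, 0), nn.BatchNorm2d(64), nn.ReLU(True),
            # -> (3, 224, 224): kernel 4, stride 2, padding 1
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Sigmoid(),
        )

    def forward(self, z):
        # z has shape (batch, noise_dim); reshape to (batch, noise_dim, 1, 1) for the conv stack.
        return self.net(z.view(z.size(0), -1, 1, 1))
```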

The discriminator uses the ResNet18 architecture with the last dense output layer modified to two neurons. The supervised discriminator is trained to predict the labeled classes of the data, that is, COVID-19 infected or healthy, whereas the unsupervised discriminator is trained to determine whether an input comes from the fake or the real distribution. Both the generator and discriminator losses are computed with the binary cross-entropy function and optimized using the Adam optimizer with a learning rate of 0.002. This way, the generator is optimized via the unsupervised discriminator training, as the model can extract and learn useful features from the large unlabeled dataset, allowing the supervised model to apply the extracted feature knowledge to class label prediction. As such, the adversarial training minimizes the divergence between the correct label distribution and the predicted labels.

4.4. Statistical Analysis

4.4.1. COVID-19-Net Classification Analysis

The VGG16-based COVID-19-Net is used to classify CT volumes as COVID-19 positive or healthy after being trained for 12 epochs. During training, of the 700 images available per class, 550 are used as the training set and the remaining 150 images as the test set. As highlighted in Table 1, our VGG16 with block max-pooling recorded an accuracy of 98.45%. The training loss decreased from 62.1614 in the first epoch to 0.0260 in the final epoch; correspondingly, the accuracy increased from 55.09% in the first epoch to 98.45% in the last epoch. Evidently, the model is able to learn meaningful features from the data for adequate classification and prediction of the input image class. In comparison with other classifier models, our model predicts the binary classes more accurately. Relative to the C3D algorithm, which achieved an accuracy of 96.8%, our COVID-19-Net scores 1.64% higher. Similarly, COVID-19-Net achieves a better result than the DeCoVNet and AD3D-MIL algorithms, which achieved 96.8% and 97.7% accuracy, respectively, outperforming them by 1.64% and 0.54%. The COVID-19-Net also achieves a 98.12% accuracy on the test set, showing that the model generalizes well to the dataset. It was, however, observed that including dropout in the model adversely affected the classification result.

4.4.2. Transfer-Net Classification Analysis

As displayed in Figure 6, our Transfer-Net comprises a comparison of five pretrained CNN architectures to observe the advantages of transfer learning for COVID-19 classification. Since the models were pretrained on a different form of data (ImageNet), we use them as feature extractors and then add new layers to fine-tune the pretrained weights to our data. The architectures include InceptionNetV3, DenseNet121, MobileNet, VGG16, and the ResNet50 model. All five models are trained for three epochs with a batch size of 32 and the same training set distribution as the COVID-19-Net.

On the training set, as displayed in Table 2, ResNet50 achieved the highest accuracy of 99.64%, followed by MobileNet, while InceptionNet achieved the lowest accuracy of 96.27%. On the test set, ResNet50 also maintained the highest score with 99%, while the DenseNet model achieved the lowest at 92%. Although the high values obtained by these models prove the advantage of transfer learning, especially in image classification and computer vision in general, the pattern of the models' prediction scores suggests that the deeper models may be less suitable for this task because the available dataset is simply not large enough.

In comparison, the VGG16 architecture designed from scratch achieved a better result than its pretrained counterpart, but it took longer to train and required more computational resources. The transfer learning VGG16 was trained for only 3 epochs, while the VGG16 designed from scratch was trained for 10 epochs.

4.4.3. Semisupervised Generative Classifier Analysis

After training, to evaluate the performance of our SSGAN, we compute the sensitivity and specificity of the model as detailed in Table 3, that is, the model's ability to correctly identify COVID-19-infected (true positive) CT scans and the proportion of healthy scans correctly identified (true negative). Our model achieved a prediction accuracy of 94%, with 48 correct COVID-19 positives out of 50 images, resulting in a sensitivity of 96%. Also, of the 50 healthy images, 46 were predicted correctly, yielding a specificity of 92%. In comparison with the EBD reduced-data method [51], our model achieved 13% higher sensitivity and 24% higher specificity. Also, compared with the EBD-98 cases [51], our model achieved 2% higher sensitivity and 12% higher specificity. This is indeed a good result, as the SSGAN model was trained on only 200 labeled images (100 per class), taking advantage of the semisupervised learning framework to learn useful features from the larger unlabeled set.

5. Conclusion

This study presents a computer-aided analysis of COVID-19 CT scan images. The study analyzes the applicability of deep learning and convolutional neural networks for the extraction of features from images and for the classification of images into predetermined classes in a supervised manner. Similarly, we leverage the concept of transfer learning to reduce both training time and computational complexity while still accomplishing excellent results. We investigate the practicability of this by implementing five pretrained convolutional architectures, which yield superior classification accuracies. Additionally, owing to the limited availability of COVID-19 data, we explore the possibility of semisupervised learning by utilizing large unlabeled data to enrich a model's ability to identify features and patterns in data. Using the highly effective generative adversarial learning technique, we develop a semisupervised classifier with two forms of discriminators for labeled and unlabeled data, achieving state-of-the-art classification with little labeled training data. Our semisupervised model proves to be a very efficient diagnostic tool for COVID-19 classification with a prediction accuracy of 94%, sensitivity of 96%, and specificity of 92%.

Data Availability

For this work, the authors gathered 1400 images: 700 infected COVID-19 images and 700 normal CT scans. The data were collected from multiple sources, including the COVID-19 Open Research Dataset Challenge (CORD-19) by the Allen Institute for AI, the COVID-19 Dataset from the University of Montreal, and the COVID-Chestxray-Dataset.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the NSFC-Guangdong Joint Fund under Grant U1401257; in part by the National Natural Science Foundation of China under Grants 61300090, 61133016, and 61272527; in part by the Science and Technology Plan Projects in Sichuan Province under Grant 2014JY0172; and by the Opening Project of Guangdong Provincial Key Laboratory of Electronic Information Products Reliability Technology under Grant 2013A061401003.