Novel coronavirus pneumonia detection and segmentation based on the deep-learning method

Zhiliang Zhang; Xinye Ni; Guanying Huo; Qingwu Li; Fei Qi

doi:10.21037/atm-21-1156

Original Article


Zhiliang Zhang¹, Xinye Ni², Guanying Huo¹, Qingwu Li¹, Fei Qi²

¹College of Internet of Things Engineering, Hohai University, Changzhou, China; ²Changzhou Second People’s Hospital Affiliated to Nanjing Medical University, Changzhou, China

Contributions: (I) Conception and design: F Qi, Z Zhang; (II) Administrative support: Q Li; (III) Provision of study materials or patients: G Huo, X Ni; (IV) Collection and assembly of data: F Qi, Z Zhang; (V) Data analysis and interpretation: F Qi, Z Zhang; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Fei Qi. Changzhou Second People’s Hospital Affiliated to Nanjing Medical University, Changzhou 213000, China. Email: 814410792@qq.com; Qingwu Li. College of Internet of Things Engineering, Hohai University, Changzhou 213000, China. Email: 1834186657@qq.com.

Background: Segmentation of coronavirus disease 2019 (COVID-19) lesions is a difficult task due to high uncertainty in the shape, size, and location of the lesions. Computed tomography (CT) imaging is an important means of diagnosing COVID-19, but it requires doctors to repeatedly examine a large number of scan images to determine the patient’s condition. Moreover, the low contrast of CT scans and the presence of tissues such as blood vessels in the background increase the difficulty of diagnosis. To address these problems, we proposed an improved segmentation model called the residual attention U-shaped network (ResAU-Net).

Methods: A novel method to detect and segment coronavirus pneumonia was established based on a deep-learning algorithm. First, the CT scan image was input and lung segmentation was performed by U-net. Next, the region of interest was selected by clipping to the minimum circumscribed rectangle. Finally, the proposed ResAU-Net, which includes an attention module (AMB), a residual module (RBM), and a sub-pixel convolution module (SPCBM), was used to segment the infected area and generate the segmentation results.

Results: We evaluated our model using cross-validation on 100 chest CT test images. The experimental results showed that our method achieved state-of-the-art performance on the pneumonia dataset. The mIoU and Dice coefficient of lesion segmentation were 73.40%±2.24% and 84.5%±2.46%, respectively, and the model achieved fast, real-time processing.

Conclusions: Our model can effectively address the poor accuracy of COVID-19 lesion segmentation. The resulting segmentation images can assist medical staff in diagnosing and quantitatively analyzing the degree of infection, and can improve the efficiency of pneumonia screening and diagnosis.

Keywords: Novel coronavirus pneumonia diagnosis; lesion segmentation; attention mechanism; sub-pixel convolution; deep learning

Submitted Dec 30, 2020. Accepted for publication May 13, 2021.


Introduction

In early 2020, coronavirus disease 2019 (COVID-19) broke out suddenly and spread rapidly all over the world, causing a global health crisis. According to statistics released by the World Health Organization (WHO), as of July 15, 2020, 13,203,571 patients had been diagnosed with the novel coronavirus and 575,201 patients had died. The epidemic prevention situation is still very serious.

Studies have shown that the novel virus primarily attacks the human lungs, leading to lung infection and respiratory disease (1). Therefore, chest computed tomography (CT) images play an important role in the diagnosis of novel coronavirus infection (2). According to research, the main CT manifestation in patients with novel coronavirus infection is the presence of bilateral and peripheral ground-glass opacity (GGO) (3). Medical staff make a preliminary determination of whether a patient is infected mainly through manual observation and analysis of lung CT images. However, hundreds or even thousands of CT images must be examined during each patient’s diagnosis, and doctors need to identify novel coronavirus pneumonia from complicated image features. This demands not only experience in the diagnosis and differentiation of pneumonia but also considerable observation time, which seriously affects screening efficiency. Moreover, the pulmonary arteries and veins are numerous and the background is complex, which increases the diagnostic difficulty for medical staff, as well as the possibility of misdiagnosis and misjudgment. Therefore, an automatic segmentation system that can accurately delineate the boundary of the region of interest in the lung and help doctors quickly grade the severity of novel coronavirus pneumonia infection is urgently needed.

With improvements in computing capacity, deep learning has progressed rapidly in recent years and has a wide range of applications in the medical field (4), including the use of convolutional neural networks to judge whether a radiograph contains a malignant tumor and to determine cardiovascular risk. In semantic segmentation, from the original Fully Convolutional Networks (FCN) (5) to the U-net (6) series, segmentation accuracy has improved steadily. The U-net series of networks is mainly used in medical image segmentation and can effectively segment images of tissues and organs. To improve the screening efficiency of novel coronavirus pneumonia and reduce the workload of doctors, deep learning has been applied to its automatic detection, which makes the qualitative and quantitative study of novel coronavirus pneumonia promising. Wang et al. from the University of Waterloo in Canada first proposed the COVID-Net model (7), a tailored neural network that extracts features through convolution and other operations and detects COVID-19 cases from chest X-ray images, yielding a prediction value of 90.9% for novel coronavirus pneumonia. Xu et al. from Zhejiang University Affiliated Hospital proposed the use of a three-dimensional convolutional network to classify novel coronavirus pneumonia (8). Although these methods are helpful in the preliminary screening of novel coronavirus pneumonia, they only distinguish novel coronavirus pneumonia from common pneumonia and do not provide a quantitative analysis of infection severity. Because they do not detect and segment the lesions, they cannot provide references for doctors to propose further treatment measures.

Therefore, the current study realized the detection and segmentation of the lung and lung lesions in CT images, which can help doctors judge accurately whether a patient is infected and ascertain the degree of infection. This makes it easier for doctors to analyze and gain a more comprehensive grasp of the patient’s condition, which can improve screening efficiency and reduce the risk of misdiagnosis. To identify and segment the infected area, the network first detected and segmented the whole lungs in the CT images, which reduced the target search space and improved segmentation accuracy. Next, the CT image was cropped to the minimum circumscribed rectangle of the segmented lung mask, in order to remove tissues around the lung, such as blood vessels, that interfere with detection and segmentation. Subsequently, the network processed the cropped lung image and automatically segmented the area infected with novel coronavirus pneumonia. Lastly, the final prediction result was generated by combining the CT image, the lung segmentation mask, and the infected area segmentation mask. As shown in Figure 1, the uninfected area of the lung is marked in green and the infected area is marked in red.

Figure 1 Realization procedure.

In this study, we carried out a series of experiments with different segmentation models and found that lung segmentation accuracy did not differ appreciably between models under the same conditions, with an average Intersection-over-Union of approximately 96%. Because the lungs have obvious characteristics, a uniform shape distribution, and a large target area, and taking into account the number of network parameters, the computational burden, and the accuracy, we used a relatively simple U-net to achieve lung segmentation in the CT images. The segmentation accuracy was better than that of most cutting-edge models and can meet the needs of auxiliary diagnosis. According to the lung segmentation prediction results, the minimum circumscribed rectangle method was then used to extract the lung region from the CT images. In addition, since the shape of the infected area was complex, its location was not fixed, the infection characteristics were highly varied, the contrast between infected and normal tissue was low, and CT image datasets of novel coronavirus pneumonia were scarce, we optimized the U-net network structure and loss function, and proposed ResAU-Net for infected area segmentation. This network uses ResNet34 (9) as the feature extraction network of the encoding path on the basis of U-net, and a visual attention mechanism (10) is added to the expanding decoding path, which makes the network pay more attention to the region of interest and improves the accuracy of the model. Moreover, sub-pixel convolution was used in decoding to upsample the feature images and make more effective use of the low-level detail features, so as to reduce the interference of artificial factors (11). Meanwhile, a weighted cross-entropy loss function was used during training of the network model. This mitigates the imbalance between samples and improves the segmentation accuracy of the infected area; the resulting average Intersection-over-Union of 73.40%±2.24% for infected area segmentation can provide an accurate reference for medical staff evaluating the patient’s condition.
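The cropping step described above can be illustrated with a minimal sketch. It assumes that the minimum circumscribed rectangle is axis-aligned (the text does not say whether a rotated rectangle is allowed) and that the mask and image are plain nested lists; the function names `bounding_box` and `crop_to_box` are hypothetical and not taken from the authors' code.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the smallest
    axis-aligned rectangle enclosing all nonzero mask pixels.
    Returns None when the mask is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def crop_to_box(image, box):
    """Crop a 2D image (nested lists) to an inclusive bounding box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy 4x5 lung mask and a CT slice whose pixel value encodes its position.
lung_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
ct_slice = [[10 * r + c for c in range(5)] for r in range(4)]

box = bounding_box(lung_mask)
cropped = crop_to_box(ct_slice, box)
```

Keeping the box alongside the crop makes it possible to paste the predicted infection mask back into the coordinates of the full CT image when the final visualization is assembled.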

We present the following article in accordance with the MDAR checklist (available at http://dx.doi.org/10.21037/atm-21-1156).

Methods

Lung parenchyma segmentation

Originally proposed by Ronneberger et al. (6), U-net is a deep neural network for biomedical image segmentation that comprises two paths: a contracting encoding path and an expanding decoding path. The encoding path is a conventional convolutional network structure, including convolution, pooling-based subsampling, and activation. Max pooling is used with a stride of 2 and a 2×2 kernel. The number of channels after each pooling step is twice that before pooling, and pooling is applied four times in total, which means that the final feature image obtained by the encoding path is 1/16 of the original image size along each spatial dimension. In the decoding path, the size of the feature image was enlarged by transposed convolution, the feature image of the same resolution on the encoding path was concatenated at the channel level, and a new feature image was then obtained by convolution. After four rounds of transposed convolution, a feature image with the same size as the input image is obtained. Finally, pixel-level prediction was realized by using a 1×1 convolution to reduce the number of channels of the feature image.
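The size arithmetic of the encoding path can be checked with a short sketch. The starting width of 64 channels is an assumption borrowed from the original U-net design and is not stated in this section; `encoder_shapes` is a hypothetical helper name.

```python
def encoder_shapes(height, width, base_channels, num_pools=4):
    """List (channels, height, width) before pooling and after each of
    `num_pools` stride-2 poolings, doubling the channels each time."""
    shapes = [(base_channels, height, width)]
    c, h, w = base_channels, height, width
    for _ in range(num_pools):
        c, h, w = c * 2, h // 2, w // 2
        shapes.append((c, h, w))
    return shapes


# 512x512 input as in the dataset; base_channels=64 is an assumed value.
shapes = encoder_shapes(512, 512, base_channels=64)
reduction = shapes[0][1] // shapes[-1][1]  # spatial reduction per dimension
```

With four poolings the spatial size shrinks by a factor of 2^4 = 16 per dimension, so a 512×512 input reaches the bottleneck at 32×32.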

U-net is a fully convolutional semantic segmentation network model that uses a symmetric encoding and decoding structure, and can effectively capture multi-scale targets and optimize the edge segmentation results. By concatenating feature images from the encoding path during decoding, U-net improves the utilization of low-level features, recovers some low-level semantic and spatial information about the target, and effectively improves the accuracy of image segmentation.

U-net++, Attention U-net, and other networks (12,13) are different improvements on the U-net network. U-net++ adds connections between the feature images on the encoding and decoding paths, so as to improve the utilization of low-level feature information, better recover spatial information, and improve prediction accuracy. Attention U-net adds a visual attention block to increase the weight of important information in the feature images, which can improve the segmentation accuracy of the target. Our experiments indicated that lung segmentation accuracy did not differ appreciably between models under the same conditions, with an average Intersection-over-Union of approximately 96%. Therefore, taking into consideration the number of network parameters, the computational burden, and the segmentation accuracy, the simple and effective U-net was selected to perform lung segmentation in the CT images. This completed lung segmentation reliably and provided a good basis for the subsequent segmentation of lesions.

The U-net, U-net++, and Attention U-net networks were trained on a total of 301 lung CT images with their lung segmentation mask labels, and no data augmentation was used [the number of training epochs was 20, the batch size was 4, and the learning rate was 0.001; the Adam optimization method was employed (14)]. Even though the dataset was relatively small, the networks converged within 20 epochs, and the segmentation accuracy reached a high level. Table 1 displays the performance indicators of each network model on the testing set, including the mIoU, the Dice coefficient, and the number of model parameters, as well as the time required to predict a single image.

Table 1 Comparison of lung segmentation performance in each model

Novel coronavirus pneumonia segmentation

The segmentation of the infected area was completed by ResAU-Net. The network was divided into three main blocks (15): an encoder block, a decoder block, and an Attention Module Block (AMB). The encoder block was composed of four ResNet Blocks (RB) and the decoder block consisted of four sub-pixel convolution blocks (SPCB). The network structure is shown in Figure 2.

Figure 2 Structure of ResAU-Net.

ResAU-Net took the clipped CT images as input. Through four ResNet34 RB, it compressed and encoded the feature images, extracting target features while reducing the size of the feature images and increasing their depth. The corresponding four SPCB were subsequently used to decode and expand the features. At the same time, the AMB combined the current feature image with the corresponding low-level feature image to compute a new, attention-weighted low-level feature image, which was fused with the current feature image at the channel level to realize effective fusion of context information and improve the accuracy of the network. Finally, after four rounds of sub-pixel convolution and double convolution, the feature image was restored to the same size as the input image, and a 1×1 convolution was used to reduce the number of channels of the feature image, complete the channel interaction and information integration, and obtain the predicted segmentation results.

Encoder

The encoding path of ResAU-Net was composed of a pre-trained ResNet34 network. The ResNet series of neural networks, proposed by He et al. (9), effectively solves the problem of network performance worsening as network depth increases and avoids the gradient degradation that may occur during training, so that the network can converge to better parameter values and the performance of the model improves. The ResNet34 used in this study included four RB, each of which was composed of three or more ResNet Basic Blocks (RBB). Specifically, RB1 had three RBBs, RB2 had four RBBs, RB3 had six RBBs, and RB4 had three RBBs. The details of the RBB network structure are shown in Figure 3A. The batch normalization and activation layers were responsible for encoding the input feature image into the low-dimensional feature space.

Figure 3 Detailed structure of ResAU-Net block. (A) Basic residual module. (B) Contextual attention module. (C) Subpixel convolution module.

Attention block

The attention mechanism is important in human visual information processing, and can enable the brain to obtain key target details, suppress useless information, and improve the accuracy and computational efficiency of visual processing information. The attention mechanism is widely used in natural image analysis, knowledge mapping, and natural language processing, especially in image description, machine translation, and classification tasks (16).

The attention block of ResAU-Net is composed of convolution layers, batch normalization and activation layers, and a sigmoid function. The detailed structure of the attention block is shown in Figure 3B. This is a gating attention mechanism based on saliency. The block takes two inputs: a low-level feature image and a gating signal, where the gating signal is the current feature image corresponding to that low-level feature image. The two inputs each pass through convolution and batch normalization, are fused at the pixel level, and are activated; a sigmoid function is then applied to the result. This yields the attention weight feature image, which is multiplied with the input low-level feature image to obtain the feature image with attention focus.
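As a rough formalization of this description, the gate can be written in the additive form used by Attention U-net. The symbols below are notation introduced here for illustration and do not come from the original text: $x$ is the low-level feature image, $g$ is the gating signal, $W_x$, $W_g$, and $\psi$ are convolutions, $\mathrm{BN}$ is batch normalization, $\sigma$ is the sigmoid function, and $\odot$ is element-wise multiplication. The exact layer order in ResAU-Net is defined by Figure 3B, so this is only a sketch under those assumptions:

$q = \mathrm{ReLU}\big(\mathrm{BN}(W_x * x) + \mathrm{BN}(W_g * g)\big)$

$\alpha = \sigma(\psi * q)$

$\hat{x} = \alpha \odot x$

Because every entry of $\alpha$ lies in $(0, 1)$, the gate can only attenuate low-level features; regions that the gating signal marks as irrelevant are scaled toward zero before the features are passed on.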

Sub-pixel convolution block

The decoder consists of four SPCB. Sub-pixel convolution is used to rearrange the pixels, restore the resolution of the feature image, and complete the segmentation of the lesion area. The detailed structure of the sub-pixel convolution block is shown in Figure 3C. Each sub-pixel convolution block consists of a sub-pixel convolution layer and a double convolution layer. The sub-pixel convolution layer includes pixel shuffling and a ReLU activation layer, and trades channels of the feature image for spatial resolution. The double convolution layer is composed of convolution, batch normalization, and activation layers, applied twice. Compared with deconvolution, sub-pixel convolution has the same amount of computation but a better model capacity and test error, which can reduce the interference of human factors and improve the segmentation accuracy. In addition, to avoid the checkerboard artifacts caused by random initialization of the sub-pixel convolution parameters, the parameters were initialized with ICNR (15), which effectively alleviated the checkerboard problem. Moreover, to improve the segmentation accuracy, the feature image output by upsampling was combined with the encoder feature image weighted by the attention block, so as to strengthen the connection of context features and make full use of the features.
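The pixel rearrangement at the heart of sub-pixel convolution can be sketched in plain Python. The example assumes the usual pixel-shuffle channel ordering, out[c][h*r+i][w*r+j] = in[c*r*r + i*r + j][h][w], with upscaling factor r; in PyTorch this step would normally be done with the `torch.nn.PixelShuffle` layer rather than by hand. The function name `pixel_shuffle` and the toy tensor are illustrative only.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Each group of r*r input channels supplies the r x r sub-pixel
    positions of one output channel."""
    in_channels, height, width = len(x), len(x[0]), len(x[0][0])
    if in_channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    out_channels = in_channels // (r * r)
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(out_channels)]
    for c in range(out_channels):
        for i in range(r):          # row offset inside the r x r cell
            for j in range(r):      # column offset inside the r x r cell
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Four 1x1 channels become one 2x2 channel (r = 2).
x = [[[k]] for k in range(4)]
y = pixel_shuffle(x, 2)
```

The channel count drops by a factor of r² while each spatial dimension grows by r, which is the sense in which the layer gives up channels to recover resolution.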

Experimental dataset

The dataset used in this study was an open-source dataset of CT scans from the University of California. The original dataset used for training and testing the networks contained 903 images, which were divided into three parts. The first part contained 301 CT scan images, the second part contained the 301 lung segmentation label mask images corresponding to the CT images, and the third part contained the corresponding segmentation label mask images of the infected lung area. The images of all three parts were 8-bit grayscale images with a size of 512×512; the data were expanded to three channels by channel copying. Example images from the dataset are shown in Figure 4A,B,C.

Figure 4 Example images from the dataset. (A) CT image; (B) mask image of lung; (C) mask image of infected area.

In the training of the lung segmentation network LU-net, the first and second parts of the original dataset were used. Since the shape and size of the lungs were relatively fixed, the detail texture was relatively consistent, and the characteristics were obvious, lung segmentation performed well without augmented data. Therefore, no data augmentation strategy was used in the training of LU-net: of the 301 images in the lung parenchyma segmentation dataset, 270 were used as training images and the remaining 31 as test images. Data from the first and third parts of the original dataset were used to train the segmentation network for the infected area. Because the shape of the infected area was changeable and complex, its location was not fixed, the infection characteristics were highly variable, and the contrast between infected and normal tissue was low, a flip-based data augmentation strategy was applied to the samples containing an infected area, in order to improve the generalizability of the model while preserving the CT image features of the lung scan. After processing, the infected area segmentation dataset contained 441 images, of which 397 were training images and 44 were test images.
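A flip augmentation of this kind has to transform the CT image and its infection mask identically, otherwise the label no longer lines up with the image. The sketch below assumes a horizontal (left-right) flip, since the text does not say which axis was used; `horizontal_flip` and `flip_pair` are hypothetical names.

```python
def horizontal_flip(image):
    """Mirror a 2D image (nested lists) left to right."""
    return [row[::-1] for row in image]


def flip_pair(ct_image, infection_mask):
    """Apply the same flip to an image and its mask so they stay aligned."""
    return horizontal_flip(ct_image), horizontal_flip(infection_mask)


ct = [[1, 2, 3],
      [4, 5, 6]]
mask = [[0, 0, 1],
        [0, 1, 1]]
flipped_ct, flipped_mask = flip_pair(ct, mask)
```

Flipping changes only the spatial layout, not the intensity values, so the appearance of lung tissue and lesions in the CT image is preserved.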

Quantitative evaluation metrics

A commonly used performance evaluation index for image semantic segmentation is Intersection-over-Union (IoU), which represents the degree of overlap between the predicted region and the target region for a given category. For a given label, it is the ratio of the intersection to the union of the pixel sets of the Ground Truth (GT) and the Prediction (P), as shown in Eq. [1]. Furthermore, in order to compare the computational cost of the models, the processing time per image was measured for each model as one of the reference standards.

$\mathrm{IoU} = \frac{|GT \cap P|}{|GT \cup P|}$ [1]

To assess the generalization performance of the model, the metric is usually computed over multiple test images. The mean Intersection-over-Union (mIoU) is obtained by averaging, which reflects the prediction robustness of the model more accurately. The calculation method is shown in Eq. [2]:

$\mathrm{mIoU} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{IoU}_{i}$ [2]

where n represents the number of test images, and $\mathrm{IoU}_{i}$ represents the Intersection-over-Union of GT and P for the i-th test image.
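Eq. [1] and Eq. [2] translate directly into code. The following is a minimal sketch on binary masks stored as nested lists; returning 1.0 when both masks are empty is a convention assumed here, not something the text specifies, and the function names are illustrative.

```python
def iou(gt, pred):
    """Intersection-over-Union of two binary masks of equal shape (Eq. [1])."""
    intersection = union = 0
    for gt_row, pred_row in zip(gt, pred):
        for g, p in zip(gt_row, pred_row):
            if g and p:
                intersection += 1
            if g or p:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over (gt, pred) pairs, one pair per test image (Eq. [2])."""
    scores = [iou(gt, pred) for gt, pred in pairs]
    return sum(scores) / len(scores)


gt_a = [[1, 1, 0], [0, 1, 0]]
pred_a = [[1, 0, 0], [0, 1, 1]]   # 2 pixels overlap, 4 in the union
gt_b = [[1, 0], [0, 0]]
pred_b = [[1, 0], [0, 0]]         # perfect prediction

score_a = iou(gt_a, pred_a)
miou = mean_iou([(gt_a, pred_a), (gt_b, pred_b)])
```

In this toy case the first image scores 2/4 = 0.5 and the second scores 1.0, so the mean over the two images is 0.75.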

The Dice coefficient (17), also known as the F1 score, is another common index for evaluating the quality of medical image segmentation algorithms, and is used to measure the similarity of target objects. It is calculated as twice the area of the overlap of two objects divided by the sum of their areas, according to Eq. [3]:

$\mathrm{Dice} = \frac{2 \times TP}{2 \times TP + FP + FN}$ [3]

where TP is the number of true positives, FP is the number of false positives, and FN is the number of false negatives. For image semantic segmentation, Dice is usually calculated using Eq. [4]:

$\mathrm{Dice}(G, P) = \frac{2 \times |G \cap P|}{|G| + |P|}$ [4]

where G represents the pixel set of the Ground Truth (GT), P represents the pixel set of the Prediction, and $|G|$ and $|P|$ denote the number of pixels in each set.
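The two forms of the Dice coefficient are equivalent because $|G \cap P| = TP$, $|G| = TP + FN$, and $|P| = TP + FP$. The sketch below computes both on a toy pair of binary masks and also checks the per-image identity Dice = 2·IoU/(1 + IoU). The function names are illustrative, and treating two empty masks as a perfect match is an assumed convention.

```python
def confusion_counts(gt, pred):
    """Count true positives, false positives, and false negatives."""
    tp = fp = fn = 0
    for gt_row, pred_row in zip(gt, pred):
        for g, p in zip(gt_row, pred_row):
            if g and p:
                tp += 1
            elif p:
                fp += 1
            elif g:
                fn += 1
    return tp, fp, fn


def dice_from_counts(tp, fp, fn):
    """Dice in the TP/FP/FN form (Eq. [3])."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def dice_from_sets(gt, pred):
    """Dice in the set form (Eq. [4])."""
    tp, fp, fn = confusion_counts(gt, pred)
    g_size, p_size = tp + fn, tp + fp   # |G| and |P|
    total = g_size + p_size
    return 2 * tp / total if total else 1.0


gt = [[1, 1, 0], [0, 1, 0]]
pred = [[1, 0, 0], [0, 1, 1]]
tp, fp, fn = confusion_counts(gt, pred)          # 2, 1, 1
d_counts = dice_from_counts(tp, fp, fn)
d_sets = dice_from_sets(gt, pred)
iou_value = tp / (tp + fp + fn)                  # IoU of the same pair
d_from_iou = 2 * iou_value / (1 + iou_value)
```

The identity holds for a single image; averages of Dice and IoU over many images need not satisfy it exactly.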

Implementation details

To achieve optimal network performance, numerous exploratory experiments were carried out to identify the best combination of network parameters. The novel coronavirus pneumonia training dataset was composed of 441 lung CT images and the 441 corresponding ground-truth masks of the infected area, and the image size was 512×512. Of these, 90% formed the training set and 10% the test set. The model was built in Python using the PyTorch deep-learning framework [Facebook AI Research (FAIR), USA]. The experiments were run on a machine with an Intel 6-core processor (Core i7-9750H @ 2.60 GHz) (Intel Corporation, USA), 16 GB of memory, and an NVIDIA GeForce GTX 1660 Ti graphics card with 6 GB of memory (NVIDIA Corporation, USA) for GPU acceleration. The training parameters were set as follows: the learning rate was 0.001, the training batch size was 8, and the loss function was binary cross-entropy loss (BCELoss). After 60 epochs, the network loss was essentially unchanged, and the model had converged to a good parameter set.
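For reference, binary cross-entropy with an optional weight on the positive (infected) class can be sketched as follows. The text names BCELoss and, in the Introduction, a weighted cross-entropy loss, but it does not give the weights, so `pos_weight` and its value below are purely illustrative. In PyTorch, a comparable option is the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss`.

```python
import math


def weighted_bce(probs, targets, pos_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy over pixels.

    probs: predicted foreground probabilities in (0, 1)
    targets: ground-truth labels, 0 or 1
    pos_weight: hypothetical extra weight on foreground pixels;
                1.0 gives the unweighted loss."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total += -(pos_weight * t * math.log(p)
                   + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


probs = [0.5, 0.5]
targets = [1, 0]
plain = weighted_bce(probs, targets)                    # ln 2
weighted = weighted_bce(probs, targets, pos_weight=3.0)  # (3 ln 2 + ln 2) / 2
```

Raising the weight on foreground pixels makes missed lesion pixels cost more than false alarms, which is one common way to counter the imbalance between small infected regions and the large background.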

Data analysis

We used ten-fold cross-validation to evaluate the segmentation performance of the model on the pneumonia dataset, which contained 441 CT scans. Values are expressed as means ± standard error of the mean (S.E.) for all experiments. We divided the dataset into 10 parts; one part was selected as the test set and the other nine parts were used as the training set for training the model. When training was completed, the mean IoU and Dice coefficient of the model on the test set were calculated. This experiment was repeated until every part had been used as the test set, and the mean IoU and Dice coefficient obtained in each test were then averaged to obtain the final quantitative performance values of the model.
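The cross-validation protocol can be sketched with the standard library alone. The shuffling seed, the way the 441 samples are dealt into folds, and the helper names are assumptions made for illustration; the text does not describe how the folds were drawn.

```python
import math
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # deal samples round-robin
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i
                     for idx in fold]
        yield train_idx, test_idx


def mean_and_se(values):
    """Mean and standard error of the mean across folds."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


splits = list(k_fold_indices(441, k=10))
fold_sizes = sorted(len(test) for _, test in splits)
```

Each sample appears in exactly one test fold; per-fold mIoU and Dice values would then be passed to `mean_and_se` to obtain figures reported in the mean ± S.E. form.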

Results

The segmentation results of ResAU-Net for novel coronavirus pneumonia lesions are shown in Figure 5 and Table 2, where they are compared with those of U-net, U-net++, Attention U-net, and DeepLabV3Plus (18). The average Intersection-over-Union of ResAU-Net was approximately 10% higher, and the Dice coefficient approximately 4% higher, than those of the other segmentation models. The segmentation accuracy reached a very high level, and the inference time of the model was basically unchanged. The visual results of each model are shown in Figure 6; the LU-net network was used for lung cropping and the different network models were used for segmentation of the infected area.

Figure 5 Segmentation accuracy of different networks. The two marker series represent the Dice coefficient and the mIoU, respectively.

Table 2 Comparison of segmentation performance of pneumonia lesions in different networks

Figure 6 Comparison of segmentation results of different models. The first column shows the input CT scans, the second column shows the GT label visualization, and the subsequent columns show the segmentation results of U-Net, U-Net++, Attention U-Net, DeepLabV3Plus, and our model, in that order. The green labeled part is the lung area, and the red labeled part is the infected area. GT, Ground Truth.

The method in this paper integrates a symmetrical encoder-decoder structure, an attention mechanism, and skip connections, and uses sub-pixel convolution upsampling to restore the feature map size. As the results of the novel coronavirus pneumonia infected area segmentation experiment in Table 2 show, the method in this paper is superior to the comparison algorithms in terms of both segmentation completeness and segmentation accuracy. Evaluation indicators such as the mean Intersection-over-Union and the F1 score reached a high level, which shows that the method is effective for segmentation analysis of novel coronavirus pneumonia. Although the network processing time increased slightly, the processing speed is still very fast and meets real-time processing requirements.

Discussion

This paper presents a new model for medical image segmentation of novel coronavirus pneumonia, offering a new approach to the lesion segmentation problem. The method eliminates the need for an external object localization model, improves the accuracy and completeness of segmentation of the area infected by novel coronavirus pneumonia, and improves the segmentation of edges. The method is general and modular, and can be easily applied to image segmentation problems for other diseases, such as lung cancer. The experimental results show that the proposed combination of the attention mechanism with sub-pixel convolution is highly effective for identifying and locating infections of tissues or organs. This is especially true for other small organs.

Compared with U-Net, the basic symmetric encoder-decoder structure, the method proposed in this paper retains a symmetric structure for mining deep features while introducing skip connections that strengthen the connection between high and low layers and improve the model’s ability to understand features. Therefore, in the segmentation experiments, the method in this paper far exceeds the U-Net and U-Net++ networks in terms of segmentation accuracy and completeness. In addition, the introduction of visual attention improves the model’s attention to segmentation targets and reduces background interference. Sub-pixel convolution, borrowed from the super-resolution reconstruction task, improves the representational ability of the model’s feature decoder to a certain extent. In the experimental results, the Intersection-over-Union of this method is higher than that of Attention U-Net, which contains only the attention module; the segmentation accuracy has thus been effectively improved by the introduction of sub-pixel convolution.

To prevent the spread of COVID-19, we should travel less, wash our hands frequently, wear masks when going out, and avoid gatherings and public places as much as possible. As of April 9, 2021, about 134.6 million cases had been confirmed worldwide. Thanks to the efforts of medical workers in various countries, 103.24 million patients have recovered. Combined with the compulsory intervention of governments in most countries, the spread of the epidemic has been greatly curbed. Several countries, including China, have developed inactivated novel coronavirus vaccines, which have been approved by the State Administration of Medicine and are now in mass production and use. It is believed that as the number of vaccinated people increases, the virus will eventually die out.

Conclusions

The diagnosis of novel coronavirus pneumonia relies on CT images. In this study, a deep-learning algorithm was used to detect and segment novel coronavirus pneumonia with a neural network, which can quickly and accurately segment the lung and the infected area and help medical staff diagnose novel coronavirus pneumonia. It can also improve the screening efficiency and accuracy for pneumonia.

For the segmentation of the infected area, ResAU-Net, based on the U-net network structure, was proposed. A pre-trained ResNet was used as the encoder to enhance the feature extraction ability of the model. The attention mechanism was added to improve the network’s attention to the region of interest, reduce the computational burden, and improve the prediction accuracy. Finally, sub-pixel convolution was used to upsample the feature images. The average Intersection-over-Union for the infected area in novel coronavirus pneumonia was 73.40%, indicating good segmentation performance.

Acknowledgments

Funding: None.

Footnote

Reporting Checklist: The authors have completed the MDAR checklist. Available at http://dx.doi.org/10.21037/atm-21-1156

Data Sharing Statement: Available at http://dx.doi.org/10.21037/atm-21-1156

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at http://dx.doi.org/10.21037/atm-21-1156). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

(English Language Editor: A. Kassem)

Cite this article as: Zhang Z, Ni X, Huo G, Li Q, Qi F. Novel coronavirus pneumonia detection and segmentation based on the deep-learning method. Ann Transl Med 2021;9(11):934. doi: 10.21037/atm-21-1156
