1 Introduction

The novel severe acute respiratory syndrome-related coronavirus (SARS-CoV-2) emerged in Wuhan, China, in December 2019 and spread to countries worldwide. This virus caused pneumonia of unknown etiology, and the disease was named COVID-19. On January 30, 2020, it was declared a public health emergency of international concern because of its high infectivity and mortality. By March 28, 2020, 177 countries had been affected, with more than 600,000 cases worldwide. Italy ranked second after China with 92,472 cases. Detailed statistics about the spread of the virus in Italy can be found in Abenavoli et al. (2020).

In 2021, many variants of COVID-19 were detected in different countries, including the UK, Brazil and India. The spread and mortality rates of these variants are greater than those of the earlier ones. Figure 1 presents the daily cases of the Indian variant reported by the COVID-19 study group in India. The lack of successful diagnosis or preventive measures has led to a rise in the number of cases and an increase in the cost of hospitalizations and palliative treatments. Therefore, scientists and the medical industry around the world have been urged to find prompt and accurate methods for detecting COVID-19 to support early prevention, screening, forecasting, drug development and contact tracing. Such methods save time for the scientific community and healthcare experts, who can then move on to the next diagnostic stage and reduce the death rate. Reverse transcription polymerase chain reaction (RT-PCR) is recommended to diagnose COVID-19. Additionally, studies in the literature use various imaging methods, namely computed tomography (CT) and X-ray (Xu et al. 2020; Gozes et al. 2020; Ucar and Korkmaz 2020; Sethy and Behera 2020; Zhang et al. 2020a). However, situations may arise that negatively affect these methods. The appearance of new mutations changes the virus and makes classification a more challenging task (Grubaugh et al. 2020). Moreover, one of the biggest problems with COVID-19 patients is viral pneumonia (VP). Differentiating between viral and non-viral pneumonia (nVP) is not easy, and the coexistence of COVID-19 and viral pneumonia can have dire consequences.

The Oxford COVID-19 Evidence Service Team provides some tips for telling these conditions apart. Muscle pain, loss of the sense of smell and shortness of breath without pleuritic pain are the most common symptoms, especially in the case of COVID-19 infection. On the other hand, findings such as bilateral positive lung findings, tachycardia or tachypnea disproportionate to temperature, and low temperature point to other VP (Heneghan 2020). nVP, however, is most likely if the patient becomes rapidly unwell shortly after the onset of symptoms, does not have the typical symptoms of COVID-19, and has pleuritic pain or purulent sputum.

Fig. 1 Cumulative daily cases in India (Group 2021)

Many studies have addressed this problem. For instance, Zhang et al. (2020b) proposed reducing anomaly detection to a one-class classification problem using a confidence-aware module. Deep learning (DL) is then used for the classification task, as shown in Fig. 2. Recent reviews show that novel technologies based on artificial intelligence (AI) and machine learning (ML) considerably improve detection, screening, contact tracing, forecasting and vaccine development with high reliability.

Since the COVID-19 pandemic started, DL algorithms have been used extensively to detect COVID-19, VP, bacterial pneumonia (BP) and other similar cases. The advantages and disadvantages of these studies should be evaluated. This study aims to present a detailed review of the studies in the literature that apply DL approaches to various types of images to detect COVID-19. In addition, studies that detect COVID-19 from acoustic data are included.

In this review paper, well-known databases, including IEEE Xplore, NCBI, ScienceDirect, SpringerLink, ACM Digital Library and ArXiv, were searched for coronavirus-related research. Most papers were chosen from journals, conference proceedings and preprints using Google Scholar. Relevant X-ray and CT scan databases for COVID-19 detection were mainly found on the Kaggle website. The articles were selected using the keywords COVID-19, Cough, Radiography Database, Coronavirus, Deep Learning, Transfer learning, pre-trained models, Detection, Diagnosis and Segmentation. The literature search with these keywords was last updated on July 30, 2021.

The rest of this paper is organized as follows: Sect. 2 presents the medical imaging technologies. In Sect. 3 we review the most important DL methods proposed to diagnose COVID-19, as well as recent advanced applications. Section 4 presents some additional DL applications to fight COVID-19, such as acoustic analysis and human mobility estimation. In Sect. 5 an overall discussion and proposed solutions are presented for accurately diagnosing COVID-19 and reducing its spread. Conclusions, future trends and challenges end the paper.

Fig. 2 Distinguishing between VP cases (anomalies) and nVP cases and normal controls (Zhang et al. 2020b)

2 Medical imaging technologies versus RT-PCR

The medical imaging field has advanced considerably in recent years, offering reliable automated methods for clinical decision making, and it has been widely accepted by scientists and the medical community. In the case of COVID-19, CT scans and X-ray images can play a vital role in the early diagnosis of the disease. Infected patients have clinical symptoms including cough and fever; however, a considerable proportion of infected patients can be asymptomatic. In Germany, the study of Rothe et al. (2020) confirmed that an asymptomatic patient was able to transmit the virus to another patient. According to the study of Al-Tawfiq (2020), which covered 9 countries, 18 of 144 cases (12.5%) were asymptomatic. That study was based on the RT-PCR test.

Due to the high risk of transmission of COVID-19, accurate diagnostic methods are urgently needed to prevent the spread of the virus. Although the RT-PCR test is the gold standard, its results take a long time (5–6 h) to obtain. In addition, its high rate of false detections has raised doubts about whether it is a good diagnostic method (Xie et al. 2020; Long et al. 2020). It is therefore recommended that patients with typical imaging findings be isolated and that more than one RT-PCR test be performed to avoid misdiagnosis.

The X-ray, however, is an efficient screening method: it is fast to acquire, cheaper than the RT-PCR test and widely available worldwide. CT scan results, on the other hand, can be obtained much faster and more accurately when an efficient algorithm (notably a DL algorithm) is available to identify the infected patients. Liu et al. (2019) showed that DL offers highly promising results for medical diagnostics compared to healthcare professionals. Figure 3 presents the changes that occur in COVID-19 pneumonia cases over several days. In the following section, detailed information about CT and X-ray images is presented.

2.1 Chest computed tomography

CT is an imaging method that uses a special X-ray beam to create detailed scans of areas inside the body (e.g., lungs, heart, blood vessels, airways and lymph nodes). The images are taken from different angles to generate tomographic images, which allow radiographers to see inside the body directly without surgery. CT images are an effective basis for clinical decisions. They have shown high efficiency in diagnosing COVID-19, especially in patients with false-negative RT-PCR results, which supports the role of CT as a reliable tool for COVID-19 diagnosis during this pandemic (Li and Xia 2020; Ai et al. 2020; Xie et al. 2020; Huang et al. 2020). Therefore, the National Health Commission of the People’s Republic of China suggested CT examination for monitoring disease progression and controlling treatments of COVID-19 in the sixth version of its diagnosis and treatment program (Zhao et al. 2020).

Fig. 3 CT scans in the early, rapidly progressive stage of COVID-19 pneumonia cases. a GGO plus reticular pattern on the fourth day. b GGO plus consolidation on the third day. c GGO on the second day (Zhou et al. 2020)

2.2 X-ray image

Wilhelm Conrad Röntgen discovered X-rays in 1895 while experimenting with Lenard tubes and Crookes tubes. X-ray imaging has a very important role in the medical field: it can help in the prevention, diagnosis and control of infection. X-ray scans are used worldwide to diagnose injuries and to detect other diseases in order to treat patients (Ghosh and Saha 2018). X-ray facilities are available even in the remotest areas, and thus X-ray images can be easily acquired for patients even at home or in their quarantine location. These images have been extensively used for COVID-19 diagnosis (Narin et al. 2020). The most commonly reported abnormal chest X-ray (CXR) findings are ground-glass opacities (GGOs) (Yoon et al. 2020). Figure 4 presents an example of X-ray scans for COVID-19 and normal cases. CXR is the imaging technology most widely used by researchers because it is easily available and inexpensive.

Fig. 4 An example of X-ray images for a COVID-19 patient (left) and normal case (right)

Fig. 5 Taxonomy of deep learning-based approaches for COVID-19 diagnosis

3 Deep learning approaches in the COVID-19 pandemic

DL is a subset of ML that offers considerable power for improving the accuracy and speed of diagnosis by automating screening through medical imaging in collaboration with radiologists and/or physicians. As a result, it has received wide acceptance and interest from the medical community, which has encouraged the development of such diagnostic technologies (Liu et al. 2019). For example, Cinaglia et al. (2018) presented initial results on a framework for the acquisition and decomposition of DICOM images. The tests were conducted on a dataset from the University Hospital of Catanzaro for segmentation and extraction of anatomical features from the lungs. Information entropy, an image analysis technique used in cryptographic applications, has also been applied in the context of COVID-19, as presented in Javan et al. (2021).

In the following, we review the most important DL approaches adopted to diagnose COVID-19 pneumonia from its spread in December 2019 until today. Figure 5 presents the taxonomy of these approaches, which use different images and acoustic features. We detail each of the nine groups below:

3.1 Generic deep learning

Generic DL methods without any specific modification have been proposed to detect COVID-19. For example, Wang et al. (2020d) have used CT images of 5372 patients from 7 different cities in China to train a deep neural network (DNN).

The RSNA Pneumonia Detection Challenge dataset is used in Luz et al. (2020) to train a DL model that locates lung opacities on chest radiographs. The RSNA dataset contains two classes: normal and pneumonia (non-normal). A total of 16,680 images from this dataset were used, of which 8066 belong to the healthy (normal) class and 8614 are classified as pneumonia.

The authors in Song et al. (2020) collected CT images from two hospitals in China, covering 88 patients infected with COVID-19, 101 patients diagnosed with bacterial pneumonia and 86 healthy persons. Using this dataset, they applied a DL-based CT diagnosis system, named DeepPneumonia, to localize the principal lesion features, especially GGO, and thus to identify the infected patients. The first step is the segmentation of the lung region. Next, they introduced DRE-Net (detailed relation extraction neural network) to extract the top-K features in the CT images and to obtain image-level predictions. Finally, the image-level predictions are used to diagnose the patient.

Another generic DL framework is proposed in Zhang et al. (2020) to automatically extract and analyze the regions that are most likely to be infected with COVID-19. To do so, the authors applied a segmentation stage using a DL-based technique. Then, the infected regions in the CT scan were processed and quantified using specific metrics.

Generic convolutional neural networks (CNNs) have also been used, in which the authors train a generic CNN on their own datasets without combining it with other ML algorithms or pretrained models. For example, in Fu et al. (2020), the authors trained a CNN model on data collected from Wuhan Jin Yin-Tan hospital in order to classify CT images into one of the following five classes: healthy lung, COVID-19 pneumonia, non-COVID-19 VP, BP and pulmonary tuberculosis.

3.2 Transfer learning

Transfer learning (TL) is an ML technique in which a model trained for one task is reused for a related second task (see Fig. 6). This approach is especially useful when datasets are insufficient, as in the case of COVID-19, either to reduce the amount of data needed for fine-tuning or to improve performance. TL can be used in two scenarios: supervised (with labeled data from the target domain) or unsupervised (without any labeled data from the target domain: the pretraining process is supervised, but fine-tuning is unsupervised). A DNN is proposed in Jaiswal et al. (2020) to detect COVID-19 using X-ray images. To do so, the authors applied a TL approach to the deep pruned EfficientNet model. Post-hoc analysis was then applied to explain the obtained predictions. A TL-based framework for the detection of pneumonia is proposed in Chouhan et al. (2020). Features were extracted from X-ray images using five different pretrained models: DenseNet121, ResNet18, GoogLeNet, AlexNet and InceptionV3. Next, an ensemble model was added to combine the outputs of all pretrained models. The framework obtained an accuracy of 96.4% and a recall of 99.62% on unseen data from the Guangzhou Women and Children’s Medical Center database.

A fine-tuned deep TL approach with a generative adversarial network (GAN) is presented in Khalifa et al. (2020) to learn from a limited dataset and to avoid over-fitting. To do so, the authors applied the pretrained models SqueezeNet, AlexNet, GoogLeNet and ResNet18 as deep TL models to detect pneumonia from chest X-rays. Combining the GAN with the deep transfer models enhanced the accuracy of the proposed system, which reached 99%. In Heidari et al. (2020), image preprocessing algorithms were first applied to the chest X-ray images to identify and remove the diaphragm regions, and the pretrained VGG-16 model (Simonyan and Zisserman 2014) was then fine-tuned on the resulting images to predict COVID-19-infected pneumonia. Another work, proposed in Apostolopoulos and Mpesiana (2020), aims to detect COVID-19 in small medical image datasets. The authors worked with two datasets from public databases. The first dataset contains 224 COVID-19, 700 BP and 504 normal X-ray images. The second includes 224 COVID-19, 714 BP and VP, and 504 normal X-ray images. They obtained an accuracy of 96.78%, a sensitivity of 98.66% and a specificity of 96.46%.
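
The accuracy, sensitivity and specificity values quoted in this section all derive from the confusion matrix of a classifier. The following minimal sketch shows how they are computed for a binary problem; the function name and the counts are hypothetical and chosen only for illustration, not taken from any of the cited studies.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common diagnostic metrics from confusion-matrix counts.

    tp/fn: infected cases predicted positive/negative.
    tn/fp: non-infected cases predicted negative/positive.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
print(metrics)
```

A high sensitivity with a lower precision, as reported for some models in this review, means that few infected cases are missed at the cost of more false alarms.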

A multi-channel TL-based method using X-ray images was proposed in Misra et al. (2020). A multi-channel pretrained ResNet model is used to perform the diagnosis of COVID-19. To classify the X-ray images with a one-against-all strategy, three ResNet models were retrained. The three resulting classifications are: (1) normal or diseased, (2) pneumonia or non-pneumonia and (3) COVID-19 or non-COVID-19 individuals. The method achieved a precision of 94% and a recall of 100%. Other TL-based methods can be found in (Minaee et al. 2020; Maghdid et al. 2020; Benbrahim et al. 2020; Haghanifar et al. 2020; Abbas et al. 2020; Rahaman et al. 2020; Perumal et al. 2020; Loey et al. 2021).

Fig. 6 An example of transfer learning process for COVID-19 detection

3.3 Data augmentation and generation techniques

Generative adversarial networks (GANs) have recently come to be considered the most powerful and successful method for data augmentation. Since the outbreak of COVID-19 is recent, it is difficult to gather a significant amount of radiographic images and datasets in such a short time. DL networks, especially CNNs, therefore need additional training data to overcome this problem and to detect COVID-19 more effectively (see Fig. 7). Various methods have applied GANs for this purpose. For instance, in Waheed et al. (2020), the authors generate more X-ray images using CovidGAN, a model based on the Auxiliary Classifier Generative Adversarial Network (ACGAN). As a result, the classification accuracy was significantly enhanced, from 85% using the CNN alone to 95% with the images generated by CovidGAN.

To handle the lack of datasets for COVID-19, Loey et al. (2020) also proposed classical data augmentation techniques along with a conditional GAN (CGAN), on the basis of a deep transfer learning model, for COVID-19 detection using CT images. A similar representation was used in Loey et al. (2020) to classify the CT images into the following four classes: COVID-19, normal, bacterial pneumonia and viral pneumonia. To do so, the authors used a dataset of 307 images. Three deep transfer models were investigated in this work: GoogLeNet, AlexNet and ResNet18. Three strategies were conducted; in each strategy, the authors applied a different deep TL using the three pretrained models mentioned above. The testing accuracies achieved by GoogLeNet, AlexNet and ResNet18 are 80.6%, 85.2% and 100%, respectively.

Another method, proposed in Karakanis and Leontidis (2020), aims to generate synthetic medical images using a DL-based CGAN to overcome the dataset limitation that leads to over-fitting. The proposed model uses a lightweight architecture that needs no transfer learning and shows no performance degradation. It can deal with non-uniformity in the data distribution and with the limited availability of training images in the classes. It consists of a single convolutional layer with 32 filters and a 4 \(\times \) 4 kernel, followed by a ReLU activation function and a max-pooling layer that down-samples the image (input representation) and enables feature extraction. After a flatten layer comes a dense layer of size 128, followed by dropout and a final dense layer with a softmax activation function for a binary output.
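
The layer sequence described above can be traced with simple shape arithmetic. The sketch below is not the authors' implementation: it assumes a 64 × 64 grayscale input, no padding, a stride of 1, 2 × 2 max pooling and two softmax output units, none of which are stated in the source, and it only counts tensor sizes and trainable parameters.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial size after a convolution with no padding."""
    return (size - kernel) // stride + 1


def summarize_lightweight_cnn(height=64, width=64, channels=1, filters=32,
                              kernel=4, pool=2, dense_units=128, classes=2):
    """Trace shapes and trainable-parameter counts layer by layer.

    Input size, padding, stride, pooling window and class count are
    assumptions made for this illustration.
    """
    conv_h = conv_output_size(height, kernel)
    conv_w = conv_output_size(width, kernel)
    conv_params = (kernel * kernel * channels + 1) * filters  # weights + biases

    pool_h, pool_w = conv_h // pool, conv_w // pool            # max pooling
    flattened = pool_h * pool_w * filters                      # flatten layer

    dense_params = (flattened + 1) * dense_units               # dense layer of size 128
    output_params = (dense_units + 1) * classes                # softmax output layer
    # ReLU, max pooling, flatten and dropout add no trainable parameters.
    return {
        "conv_shape": (conv_h, conv_w, filters),
        "pool_shape": (pool_h, pool_w, filters),
        "flattened": flattened,
        "total_params": conv_params + dense_params + output_params,
    }


summary = summarize_lightweight_cnn()
print(summary)
```

Under these assumed sizes, almost all trainable parameters sit in the first dense layer, which is why the pooling step before flattening matters for keeping such a model lightweight.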

GANs can also be used during the training step along with pretrained models. For example, the authors in Ghassemi et al. (2021) proposed using CycleGAN for unpaired image-to-image translation. It consists of training two generator-discriminator pairs concurrently. The following pretrained models were employed: ResNet, EfficientNet, DenseNet, ViT and ResNeSt. To generate data, the first generator takes the first part of the images as input and generates images for the second part. Next, the second generator applies the inverse. The discriminator models then separate the generated images from the original ones and feed the gradients back to the generators. This study showed that CycleGAN can be used successfully for COVID-19 diagnosis, with data augmentation enhancing the accuracy of the pretrained CNNs.

Because large amounts of labeled data are scarce, supervised learning is very difficult, and GANs can be used as a semi-supervised learning method to solve this problem for COVID-19 detection. For example, Alizadehsani et al. (2021) used the GAN discriminator output for classification. Learning is carried out on a small amount of labeled data merged with unlabeled data. This semi-supervised method gives better performance than supervised learning of CNNs. Other methods based on data augmentation to detect COVID-19 can be found in Ahmed et al. (2021).

Fig. 7 Generative adversarial network representation for COVID-19 detection

3.4 Autoencoder-based models

Another ML technique for handling the lack of data on COVID-19 cases is the autoencoder (AE) (Baldi 2012). It is a deep learning method that contains an encoder and a decoder and is employed for unsupervised feature learning. AE models are composed of two main steps: encoding and decoding. In the first step, the encoder maps the inputs to a space of reduced dimension while preserving an accurate feature representation. In the second step, the decoder maps the samples back to their initial space by generating data from the reduced representation. Figure 8 shows the structure of an autoencoder and its two main steps. The advantage of adopting such unsupervised classification for COVID-19 detection, compared to its supervised counterpart, is that it avoids the long time spent assembling large amounts of data, which could increase the risk of mortality and postpone medical care. For example, in Khoshbakhtian et al. (2020), the authors introduced COVIDomaly, which aims to diagnose new COVID-19 cases using a convolutional autoencoder framework. They tested two strategies on the COVIDX dataset of chest radiographs by training the model on chest X-rays: the first strategy used only healthy adults, and the second used healthy adults, adults with BP and adults infected with COVID-19. Using 3-fold cross-validation, they obtained a pooled area under the receiver operating characteristic curve (ROC-AUC) of 76.52% and 69.02% with the two strategies, respectively.
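
When an autoencoder is used for anomaly detection, a common choice (assumed here, not confirmed by the source for the cited study) is to score each image by its reconstruction error and to summarize the separation between classes with the ROC-AUC. The sketch below illustrates both steps; the function names and the error values are hypothetical.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between an input and its decoded reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def roc_auc(anomaly_scores, normal_scores):
    """ROC-AUC as the probability that a random anomalous sample receives
    a higher score than a random normal sample (ties count as one half)."""
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomaly_scores) * len(normal_scores))


# Hypothetical reconstruction errors: a model trained on healthy images
# is expected to reconstruct unseen pathological images less accurately.
covid_errors = [0.9, 0.6, 0.4]
healthy_errors = [0.1, 0.3, 0.5, 0.2]
auc = roc_auc(covid_errors, healthy_errors)
print(round(auc, 4))
```

An AUC of 0.5 corresponds to chance-level separation, which helps put the reported 76.52% and 69.02% into context.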

In Goel et al. (2020), the authors extracted discriminative features from the autoencoder and gray-level co-occurrence matrix using CT images. The obtained features are then combined with random forest classifier for COVID-19 detection. They achieved the following results: accuracy of 97.78%, specificity of 98.77% and recall of 96.78%.
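
To illustrate the texture descriptor mentioned above, the following sketch builds a gray-level co-occurrence matrix (GLCM) for a tiny image and derives the contrast feature from it. The horizontal offset, the four gray levels and the choice of contrast as the feature are assumptions for illustration; the source does not specify which GLCM settings or features were used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast feature: squared gray-level difference weighted by frequency."""
    total = sum(sum(row) for row in matrix)
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total


# Toy 4 x 4 image with gray levels 0..3 (illustrative values only).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
cooccurrence = glcm(toy_image, levels=4)
contrast = glcm_contrast(cooccurrence)
print(cooccurrence, contrast)
```

Features of this kind can be concatenated with learned autoencoder features before being passed to a classical classifier such as a random forest.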

Autoencoders can also be applied to predict patients' chances of survival. As an example, Khozeimeh et al. (2021) combined a CNN with an autoencoder to predict the survival chance of patients with a positive COVID-19 diagnosis. Their experiments use clinical data such as blood pressure, liver disease, etc. The main problem facing this work is data imbalance, since the majority of infected patients recover (low mortality rate). To overcome this problem, an AE-based data augmentation method is applied. Other autoencoder-based methods can be found in Berenguer et al. (2020), Shoeibi et al. (2020), Khobahi et al. (2020).

Fig. 8 An example of autoencoder model for COVID-19 detection

3.5 Pretrained deep neural networks

Pretrained models are originally trained on existing large-scale labeled datasets (e.g., ImageNet) and later fine-tuned on chest CT and X-ray images to accomplish the diagnosis process. The last layer of these models is removed, and a new fully connected (FC) layer is added with an output size of two, representing two separate classes (COVID-19 or normal). In the resulting models, only the final FC layer is trained, while the other layers are initialized with the pretrained weights (Nayak et al. 2020). These models can be a very useful solution to the lack of large datasets for COVID-19. However, some challenges exist. One problem is that transferring from one domain to another can degrade performance because of the gap between the domains. This is often the case with medical images taken from different centers. Moreover, over-fitting occurs with small COVID-19 datasets. Therefore, pretrained models are generally used with some particular modifications in order to avoid over-fitting.
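
The idea of training only a new output layer on top of frozen pretrained layers can be sketched without any DL framework. In the toy example below, a fixed weight matrix stands in for the pretrained layers and a logistic output unit stands in for the new FC layer; all names, weights and data are hypothetical, and a real system would use a pretrained CNN instead.

```python
import math

# Stand-in for the pretrained layers: these weights are never updated.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def frozen_features(x):
    """Apply the frozen layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=300, lr=0.1):
    """Fit only the new output layer (weights and bias) by gradient descent."""
    feats = [frozen_features(x) for x in samples]
    weights, bias = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)
            err = p - y  # gradient of the log-loss with respect to the logit
            weights = [w - lr * err * v for w, v in zip(weights, f)]
            bias -= lr * err
    return weights, bias


def predict(x, weights, bias):
    f = frozen_features(x)
    return 1 if sigmoid(sum(w * v for w, v in zip(weights, f)) + bias) >= 0.5 else 0


# Toy data: label 1 plays the role of COVID-19, label 0 the role of normal.
samples = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labels = [1, 1, 0, 0]
head_w, head_b = train_head(samples, labels)
print([predict(x, head_w, head_b) for x in samples])
```

Because the frozen part is never touched, only a handful of parameters are fitted, which is what makes this strategy attractive when few labeled COVID-19 images are available.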

In Nayak et al. (2020), eight pretrained CNN models were compared: GoogLeNet, AlexNet, MobileNet-V2, VGG-16, SqueezeNet, ResNet-50, ResNet-34 and Inception-V3. The results for distinguishing COVID-19 from normal cases show that ResNet-34 outperformed the other pretrained models and achieved an accuracy of 98.33%. This evaluation was conducted on a total of 286 scans of the COVID-19 and normal classes as a training set, and 120 scans for the test (60 scans for each class). With the augmented dataset, the total number of training scans is 1002, with 428 scans used for validation and 120 scans for the test. ResNet-18 was applied in Oh et al. (2020) using limited training datasets and achieved a sensitivity of 100% and a precision of 76.90% for the COVID-19 class. Figures 9 and 10 show the AlexNet and VGG-16 architectures, respectively. The DenseNet201 pretrained model is used in Jaiswal et al. (2020) on chest CT images. To classify the patients as COVID-19 positive or negative, deep transfer learning was carried out and obtained a training accuracy of 99.82% and a validation accuracy of 97.48%. The extreme version of the Inception model (Xception) is applied in Das et al. (2020) and achieved an accuracy of 97.40%, an F-measure of 96.96%, a sensitivity of 97.09% and a specificity of 97.29% for three classes: COVID-19, pneumonia and other diseases.

Hemdan et al. (2020) proposed the COVIDX-Net framework to diagnose COVID-19 cases using X-ray images. It includes three main steps to accomplish the diagnostic process: preprocessing, model training and validation, and classification. Owing to the absence of public COVID-19 datasets, the validation experiments were carried out on 50 chest X-ray images, of which only 25 had been diagnosed with COVID-19. COVIDX-Net combined the following seven deep CNN architectures: VGG-19, DenseNet121, InceptionV3, ResNetV2, Inception-ResNet-V2, Xception and the second version of Google MobileNet. Figure 11 presents an overview of the COVIDX-Net framework. Another model, called InstaCovNet-19, which makes use of five pretrained models (Xception, ResNet101, InceptionV3, MobileNet and NASNet), is proposed in Gupta et al. (2020). Two classification strategies were conducted: (COVID-19, pneumonia, normal) and (COVID, Non-COVID). Very high precision and classification accuracy were achieved with both strategies (see Table 2).

A similar method was proposed by Narin et al. (2020) using five pretrained CNNs and three different binary datasets including COVID-19, normal (healthy), bacterial and viral pneumonia patients. Gour and Jain (2020) proposed a new CNN model based on VGG-19. They used a 30-layer CNN model for training with X-ray images and obtained sub-models using logistic regression. Other methods using pretrained CNNs can be found in Makris et al. (2020), Afshar et al. (2020), Kumar et al. (2020).

Fig. 9 AlexNet architecture proposed in Krizhevsky et al. (2017)

Fig. 10 VGG-16 architecture proposed in Simonyan and Zisserman (2014)

Fig. 11 COVIDX-Net framework (Hemdan et al. 2020)

3.6 3D convolutional neural networks

3D CNN models have also been used in the literature. They mainly extract 3D features from the segmented 3D lung region in CT images. For example, Wang et al. (2020e) segmented the lung region using a pretrained UNet model. The resulting volumes were fed into the proposed DeCoVNet (3D deep convolutional neural Network to Detect COVID-19). A weakly supervised classification was then applied and achieved high COVID-19 classification performance and good lesion localization results. Müller et al. (2020) also used a UNet model instead of computationally complex CNNs to reduce over-fitting during the segmentation of the infected lung region. In Wang et al. (2020a), 3D-ResNet is applied with end-to-end training to classify the acquired lung images as pneumonia or healthy.

In order to predict the risk of COVID-19, Yang et al. (2020a) applied end-to-end training on CT images using the 3D Inception V1 model pretrained on the ImageNet dataset. The obtained accuracy was 95.78% overall and 99.4% on a part of the dataset containing 1,684 COVID-19 patients. Li et al. (2020) introduced a 3D DL system for the early detection of COVID-19, called COVNet. The COVNet model is built essentially on ResNet50, which takes a series of CT slices as input and produces features for each of them. The features from all slices are then combined by a max-pooling operation. The final feature map is fed to a fully connected layer and a softmax activation function to produce a probability for each of the three classes: COVID-19, non-pneumonia and community-acquired pneumonia (CAP). Han et al. (2020) introduced deep 3D multiple instance learning to detect COVID-19 using CT images, achieving a high accuracy (97.9%) and an AUC of 99.0%. Other 3D CNN-based methods can be found in de Vente et al. (2020), Liu et al. (2020).
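
The slice-level aggregation described for COVNet can be illustrated as follows: per-slice feature vectors are merged by an element-wise maximum and passed through a fully connected layer with softmax. This is only a sketch of the aggregation idea; the feature values, the weights and the three-dimensional features are hypothetical, and the real model obtains its features from ResNet50.

```python
import math


def max_pool_over_slices(slice_features):
    """Element-wise maximum across the feature vectors of all CT slices."""
    return [max(values) for values in zip(*slice_features)]


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_scan(slice_features, weights, biases):
    """Pool slice features, apply a fully connected layer, return class probabilities."""
    pooled = max_pool_over_slices(slice_features)
    logits = [sum(w * v for w, v in zip(row, pooled)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Hypothetical features for three slices of one scan (three features each).
slices = [
    [0.1, 0.9, 0.0],
    [0.7, 0.2, 0.3],
    [0.4, 0.5, 0.8],
]
# Hypothetical FC weights: one row per class (COVID-19, CAP, non-pneumonia).
fc_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
fc_biases = [0.0, 0.0, 0.0]

pooled = max_pool_over_slices(slices)
probabilities = classify_scan(slices, fc_weights, fc_biases)
print(pooled, probabilities)
```

Max pooling makes the scan-level prediction independent of the number and order of slices, since only the strongest response per feature is kept.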

3.7 Combination of generic CNNs with traditional ML algorithms

Another strategy is to use CNN models differently by combining them with traditional ML algorithms.

In Stephen et al. (2019), the authors presented a CNN model trained on X-ray images to recognize pneumonia. The proposed architecture consists of a combination of convolution, max-pooling and classification layers. The feature extractor comprises four convolutional layers, a max-pooling layer and a ReLU activation between them. The traditional ML algorithm ANN (artificial neural network) is finally applied for classification. ANN and the AlexNet architecture were combined in Aslan et al. (2020) to automatically identify COVID-19 pneumonia subjects from CT scans. First, an ANN-based segmentation is performed to localize the lungs. Next, the COVID-19 class is augmented to produce more images. Finally, the pretrained AlexNet architecture is used in two configurations: with transfer learning only, which obtained an accuracy of 98.14%, and with an additional bidirectional long short-term memory (BiLSTM) layer, which obtained an accuracy of 98.70%. Nour et al. (2020) proposed a CNN model trained from scratch, built as a serial network with five convolution layers. Three ML algorithms were trained on the resulting deep features: k-nearest neighbors (k-NN), support vector machine (SVM) and decision tree (DT). The highest accuracy, 98.97%, was obtained by the SVM.
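
The pattern of feeding deep features to a classical classifier can be sketched with k-NN, one of the three algorithms mentioned above. In this minimal example the feature vectors are hypothetical two-dimensional stand-ins for CNN outputs, and the function name and the value of k are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Classify a feature vector by majority vote among its k nearest neighbors."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical deep features; in practice they come from a CNN layer.
train_features = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],   # normal
    [1.0, 0.9], [0.9, 1.1], [1.1, 1.0],   # COVID-19
]
train_labels = ["normal"] * 3 + ["covid"] * 3

print(knn_predict(train_features, train_labels, [0.1, 0.1]))
print(knn_predict(train_features, train_labels, [1.0, 1.0]))
```

The CNN acts purely as a feature extractor here, so the classical classifier can be swapped (for example, for an SVM or a decision tree) without retraining the network.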

Instead of using pretrained deep CNNs only as feature extractors, Ismael and Şengür (2020) also evaluated two other strategies, fine-tuning and end-to-end training, to accurately classify chest X-ray images as COVID-19 positive or negative. The following models were used as feature extractors: ResNet101, VGG19, ResNet50, ResNet18 and VGG16, with an SVM used for ML-based classification, whereas a new CNN model is used for the fine-tuning strategy. Finally, end-to-end training with a dataset of 180 COVID-19 and 200 normal images is carried out as a third strategy. An accuracy of 94.7% was achieved using the ResNet50 model with the SVM classifier, whereas the fine-tuning strategy with the ResNet50 model achieved 92.6%. The end-to-end training strategy of the developed CNN model reached 91.6%. A deep CNN and long short-term memory (LSTM) were combined in Islam et al. (2020) to diagnose COVID-19 automatically from X-ray images. The accuracy obtained for the classification of three classes (COVID-19, normal and other pneumonia) is 99.4%. Similar methods that combine deep features and classical ML techniques can be found in Sethy and Behera (2020).

Sharifrazi et al. (2021) proposed to improve the performance of a 2D CNN by applying a Sobel filter to detect COVID-19 on a newly collected dataset of X-ray images. An SVM algorithm is then employed to classify the input images. The proposed method achieves high performance with limited data.
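
To make the Sobel preprocessing step concrete, the following is a minimal pure-Python sketch of a Sobel gradient-magnitude filter. It is an illustration only, not the implementation of Sharifrazi et al. (2021): the function name `sobel_magnitude`, the toy image and the zeroed border handling are assumptions made for this example.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2D grayscale image (list of lists).

    The kernels are applied as a sliding-window correlation; flipping them
    (true convolution) would only change the sign of the gradients, not the
    magnitude. Border pixels are left at 0.0 (an assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = gy = 0.0
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * pixel
                    gy += SOBEL_Y[i][j] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


# Hypothetical 5x5 image: dark on the left, bright on the right (vertical edge).
toy = [[0, 0, 0, 10, 10] for _ in range(5)]
edges = sobel_magnitude(toy)
```

In this toy case the response is zero in the flat region and large along the vertical edge. An edge map of this kind could then be passed to a CNN in place of, or together with, the raw image.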

3.8 Ensemble models

Addressing COVID-19 detection with a single DL model and no specific additions might not achieve high classification accuracy on CXR images or CT scans. For this reason, combining several DL models can be a good solution; the combined model is called an ensemble model, and the learning approach is called ensemble learning. For example, Sitaula and Hossain (2020) proposed a DL model called attention-based VGG-16. This model uses VGG-16 to capture the spatial relationship between the ROIs in CXR images. By using an appropriate convolution layer (4th pooling layer) of the VGG-16 model in addition to the attention module, they obtained a novel DL model that is fine-tuned for the classification task. In Hall et al. (2020), an ensemble of three models, comprising pretrained ResNet50 and VGG16 and the authors' own small CNN, is applied to a test set of 33 new COVID-19 and 218 pneumonia cases. The overall accuracy achieved is 91.24%. Shalbaf et al. (2020) proposed an ensemble deep transfer learning system with 15 pretrained CNN architectures on CT images. They obtained the following results: accuracy (85%), precision (85.7%) and recall (85.2%).
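
One common way to combine several classifiers is soft voting, in which the class probabilities predicted by each model are averaged. The sketch below illustrates this idea under stated assumptions: the cited works do not necessarily combine their models this way, and the function `soft_vote`, the class names and the probability values are hypothetical.

```python
def soft_vote(prob_lists, weights=None):
    """Average per-model class probabilities.

    prob_lists: one probability vector per model, all of the same length.
    weights: optional per-model weights (defaults to equal weighting).
    Returns (averaged_probabilities, index_of_predicted_class).
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)
    avg = [
        sum(w * probs[k] for w, probs in zip(weights, prob_lists)) / total
        for k in range(n_classes)
    ]
    pred = max(range(n_classes), key=lambda k: avg[k])
    return avg, pred


CLASSES = ["COVID-19", "pneumonia"]
# Hypothetical softmax outputs of three models for a single image.
model_outputs = [
    [0.60, 0.40],
    [0.30, 0.70],
    [0.80, 0.20],
]
avg, pred = soft_vote(model_outputs)
label = CLASSES[pred]
```

Here two of the three models favor the first class, and the averaged probabilities do as well. Passing `weights` lets a stronger model count for more, which is one design choice among several (majority voting and stacking are alternatives).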

3.9 Smartphone applications

To further automate and speed up COVID-19 screening, mobile phones are an attractive platform because of their convenience, their numerous sensors and their considerable computing capabilities. Specifically, a smartphone is able to scan CT images of COVID-19 patients for use in screening analysis. Moreover, multiple CT images of the same COVID-19 patient can be gathered on one smartphone to compare how the disfigurement has developed (Purswani et al. 2019). However, the capability of a mobile device to process a large amount of data is lower than that of a larger machine or a computer. Therefore, a lightweight representation is needed to accomplish this task, and various recent methods have been proposed to detect COVID-19 on mobile devices using such a representation. In Zulkifley et al. (2020), a lightweight DL model named LightCovidNet was proposed to detect COVID-19 on a mobile platform. To enhance the performance of the model, supplementary data were generated with a conditional deep convolutional GAN and added to the training dataset. To reduce memory usage, five units of feed-forward CNN are built using separable convolution operators. Multi-scale features are then learned to suit X-ray images that have been acquired separately all over the world. Beyond COVID-19 diagnosis and detection, various lightweight applications have been introduced to slow the spread of the virus. These applications can be designed to match the capabilities of a smartphone to further speed up their operation. They include masked face recognition (Hariri 2020), facial mask detection (Chen et al. 2020; Chua et al. 2020), social distance monitoring (Ahmed et al. 2020; Rezaei and Azarmi 2020) and human mobility estimation (Xiong et al. 2020). Another mobile-based technique to fight COVID-19 was proposed in Maghdid et al. (2020). Using DL algorithms, the authors were able to efficiently evaluate the level of pneumonia and thus determine whether or not it is a COVID-19 case.
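
The memory saving offered by separable convolutions can be illustrated with a simple parameter count. The sketch below assumes depthwise separable convolutions (a depthwise k x k filter per input channel followed by a 1 x 1 pointwise convolution) and ignores bias terms; the layer sizes are hypothetical and are not taken from LightCovidNet.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (bias ignored):
    one k x k depthwise filter per input channel, then a 1 x 1 pointwise
    convolution that mixes the channels."""
    return k * k * c_in + c_in * c_out


# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
ratio = std / sep
```

For this hypothetical layer the separable variant needs roughly eight times fewer weights, which is the kind of reduction that makes such layers attractive on smartphones.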

4 Deep learning for other applications

Other applications have been proposed to fight against COVID-19-induced pneumonia; the most important are cough detection and human mobility estimation.

4.1 Cough detection

In addition to the DL approaches using X-ray and chest CT scans for COVID-19 detection, scientists affirm that sounds generated by the respiratory system can be analyzed to decide whether the patient is infected or not. Therefore, cough analysis has been used to screen and diagnose COVID-19. ML techniques can supply useful cues enabling the development of a diagnostic instrument. To do so, cough data from COVID and non-COVID subjects are required. Accordingly, Sharma et al. (2020) proposed a database of respiratory sounds, namely cough, breath and voice, called Coswara. Some experiments have recently been carried out to screen COVID-19 from acoustic features. For example, in Hassan et al. (2020), a recurrent neural network (RNN), specifically the Long Short-Term Memory (LSTM) architecture, was used with six speech features extracted from a collected dataset (including spectral centroid, spectral roll-off, zero-crossing rate and MFCC). In this work, 70% of the data were used for training and 30% for testing. The best accuracy is achieved for breathing sounds, reaching up to 98.2%, followed by cough sounds with 97%, whereas voice analysis reaches 88.2%.
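
Two of the acoustic features named above, zero-crossing rate and spectral centroid, can be computed as in the following minimal sketch. It uses a naive DFT in pure Python for clarity rather than speed; the function names, the sampling rate and the synthetic test tone are assumptions for illustration and do not reproduce the pipeline of Hassan et al. (2020).

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz.

    Uses a naive O(N^2) DFT over the non-negative frequency bins 0..N/2.
    """
    n = len(frame)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted / total if total > 0 else 0.0


# Hypothetical test signal: a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
zcr = zero_crossing_rate(tone)
centroid = spectral_centroid(tone, sample_rate)
```

For a pure tone the centroid sits at the tone frequency and the zero-crossing rate is about twice the frequency divided by the sampling rate. Real cough or breath recordings would be split into short frames, and such per-frame features would form the input sequence of a recurrent model.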

A smartphone application using cough-based diagnosis for COVID-19 detection is proposed in Imran et al. (2020). This application is based on an AI-powered screening solution called AI4COVID-19. Its principle is to send three 3-second cough sounds to an AI engine running in the cloud and to return a result within two minutes. To overcome the lack of COVID-19 cough training data, the authors applied transfer learning using the ESC-50 dataset (Piczak 2015), which contains 50 classes of cough and non-cough sounds acquired using a smartphone. Figure 12 presents the proposed system architecture and a drawing of AI4COVID-19. The results show a high overall accuracy of 95.60%, a sensitivity of 96.01%, a specificity of 95.19% and a precision of 95.22%. Schuller et al. (2020) studied what computer audition could contribute to the ongoing battle against COVID-19. Other recent acoustic analyses for the detection of COVID-19 can be found in Deshpande and Schuller (2020), Alsabek et al. (2020), Pal and Sankarasubbu (2020), Quatieri et al. (2020), Laguarta et al. (2020) and Deshmukh et al. (2020).
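
The four metrics reported above are all derived from the binary confusion matrix. The short sketch below shows the standard definitions; the counts used in the example are hypothetical and are not the confusion matrix of Imran et al. (2020).

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard binary screening metrics from confusion-matrix counts.

    tp/fn: infected subjects classified as positive/negative.
    tn/fp: healthy subjects classified as negative/positive.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


# Hypothetical counts for 100 infected and 100 healthy subjects.
metrics = screening_metrics(tp=90, fp=5, tn=95, fn=10)
```

Sensitivity is governed by the false negatives and specificity by the false positives, which is why both are usually reported alongside accuracy in screening studies.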

Fig. 12 AI4COVID-19 system architecture (Imran et al. 2020)

4.2 Estimating human mobility

Human mobility (movement) is one of the main factors that promote the transmission of the virus. Policy-makers face great difficulty in finding an optimal protocol to ensure social distancing and barrier measures. To solve this problem, Bao et al. (2020) proposed a system that estimates maps of human mobility responses by learning from existing ground truth data. The proposed system is based on a DL-based data generation model called COVID-GAN. It merges a variety of features, including contextual features, COVID-19 details and historical data, as well as policies, drawn from various sources such as news, reports and SafeGraph. Experimental results showed that COVID-GAN can closely imitate real-world human mobility responses and that the area-constraint-based correction can considerably improve the solution quality. To further explain the relation between human mobility and COVID-19 contamination, Xiong et al. (2020) presented a study using mobile device data to give decision makers more insight into national mobility trends before and during the pandemic.

4.3 Forecasting of new cases and new deaths

Along with the detection approaches presented above, forecasting of new cases and new deaths has been studied recently to help governments take the right decisions during the pandemic. COVID-19 time series prediction has therefore been studied using DL techniques. One of the most used models is the LSTM, introduced by Hochreiter and Schmidhuber (1997) and applied to COVID-19 time series prediction by Wang et al. (2020c). The LSTM is designed to combine short-term and long-term temporal information and provides accurate time series forecasting. For example, Ayoobi et al. (2021) used LSTM, convolutional LSTM and GRU to predict new cases and new deaths during the COVID-19 crisis. Bidirectional models, which are considered an improvement on the LSTM, have also been employed for prediction. A bidirectional model uses the traditional LSTM to process the input in forward and reverse order, which yields two separate outputs, and obtains the final output using a fully connected layer.
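
The bidirectional scheme described above can be sketched in a few lines. To keep the example short and dependency-free, a plain scalar tanh recurrent cell stands in for the LSTM cell, so this is a simplified illustration of the data flow and not an LSTM implementation; the weights are untrained and, like the input series, purely hypothetical.

```python
import math


def run_rnn(sequence, w_in, w_rec, bias):
    """Run a scalar tanh recurrent cell over a sequence.

    Returns the final hidden state. This simple cell is a stand-in for an
    LSTM cell, which would add gates and a separate cell state.
    """
    h = 0.0
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h + bias)
    return h


def bidirectional_forecast(sequence, fwd, bwd, dense):
    """Combine a forward and a backward pass with a dense (linear) layer."""
    h_fwd = run_rnn(sequence, *fwd)             # input in original order
    h_bwd = run_rnn(list(reversed(sequence)), *bwd)  # input in reverse order
    w_f, w_b, b = dense
    return w_f * h_fwd + w_b * h_bwd + b


# Hypothetical normalized daily case counts and untrained example weights.
daily_cases = [0.10, 0.12, 0.15, 0.19, 0.24]
fwd = (0.8, 0.5, 0.0)    # (w_in, w_rec, bias) of the forward cell
bwd = (0.8, 0.5, 0.0)    # (w_in, w_rec, bias) of the backward cell
dense = (0.6, 0.4, 0.0)  # weights for h_fwd and h_bwd, plus bias
prediction = bidirectional_forecast(daily_cases, fwd, bwd, dense)
```

The two passes summarize the same window from opposite ends, so their final states differ even when the cells share weights. In practice all weights would be learned from historical case counts.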

5 Available COVID-19 datasets description

Since the spread of the virus from Wuhan to many countries in the world, many COVID-19 datasets have been introduced in the literature in order to apply deep learning approaches for automatic detection of the virus. In the following, we describe recently released datasets of X-ray images and CT scans.

5.1 X-ray datasets

Many X-ray datasets have been presented to boost the DL techniques for COVID-19 detection. Here, we present two examples.

COVID-19 Radiography Database: the initial dataset is divided into three folders (i.e., training, testing and validation) and three sub-folders containing COVID-19, viral pneumonia and normal chest X-ray images, respectively (Rahman 2020). We have used version 4 of this database, which contains 3616 COVID-19-positive cases along with 10,192 normal, 6012 lung opacity (non-COVID lung infection) and 1345 viral pneumonia images. In this work, we carry out the classification of only three classes (COVID-19, normal, VP). Some scans from this database are shown in Fig. 13.

Covid chestxray dataset master (Cohen et al. 2020) is a new and challenging dataset presented to assess COVID-19 detection systems. It is one of the first datasets introduced during the COVID-19 pandemic. This dataset contains 5856 X-ray images divided into three parts (train, test and validation). The train part contains 1341 normal images and 3875 pneumonia images, whereas the test part contains 234 normal cases and 390 pneumonia cases. This dataset is unbalanced, which makes it more challenging since more adjustments are required to avoid class bias. The chest X-ray images are acquired from hospitals with a frontal view. Considering the Covid chestxray dataset master as a benchmark is very tricky because of its composition. Most of the proposed methods use this dataset for a two-class classification problem. Some images of COVID-19 cases from this database are shown in Fig. 14.
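
One common adjustment for such an unbalanced training set is to weight each class inversely to its frequency in the loss function. The sketch below applies the usual "balanced" heuristic, weight = total / (number of classes x class count), to the training counts quoted above. It is one possible remedy, assumed here for illustration, and is not necessarily the one used by the methods reviewed.

```python
def balanced_class_weights(counts):
    """Inverse-frequency class weights: total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Training-set counts quoted in the text above.
train_counts = {"normal": 1341, "pneumonia": 3875}
weights = balanced_class_weights(train_counts)
# The minority class ("normal") receives the larger weight.
```

With these weights each class contributes equally to the total weighted loss, so the classifier is no longer rewarded for simply predicting the majority class. Oversampling or data augmentation of the minority class are alternatives.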

Fig. 13 X-ray images from the COVID-19 Radiography Database (Rahman 2020)

Fig. 14 X-ray images diagnosed as COVID-19 from the Covid chestxray dataset master (Cohen et al. 2020)

5.2 CT scan datasets

CT scan datasets have mostly been introduced to segment specific regions in the thorax in order to diagnose COVID-19 patients. Here, we present two of these datasets.

The COVID-CT-Dataset contains 349 CT scan images diagnosed as COVID-19 from 216 patients (Yang et al. 2020b; Zhao et al. 2020). The authors relied on preprint repositories (i.e., medRxiv and bioRxiv) and extracted these images from 760 preprints. To associate each image with the right class (e.g., COVID, VP, normal), the captions of the corresponding figures are used along with PyMuPDF to get low-level structure data. The quality of the images extracted from the PDF files has been maintained. When a caption is unclear, the metadata of the paper is used to assign the image its label. Some images of COVID-CT-Dataset are presented in Fig. 15.

Fig. 15 CT scans from COVID-CT-Dataset

CT Scan COVID Prediction contains 746 CT scan images divided into a train set (500 images) and a test set (246 images) (He et al. 2020). These images were collected from many papers on COVID-19 prediction and segmentation from arXiv, medRxiv, NEJM, etc. This database is suitable for deep learning methods and facilitates the diagnosis of COVID-19 using CT scans.

Other X-ray and CT scan databases can be found in Table 1.

Table 1 Databases description
Table 2 Summary of DL-based methods for COVID-19 pneumonia classification. C.V refers to cross-validation, Tr refers to training, Val refers to the validation, AE refers to autoencoder. Sens refers to the sensitivity. Spec refers to the specificity. Acc refers to the accuracy

6 Discussion

Table 2 includes some studies conducted with DL models. It can be seen in Table 1 that studies have focused on two popular image modalities, namely CT and X-ray. X-ray images are the most used because they are readily available everywhere. In addition, their low memory footprint and the high performance obtained with them have encouraged researchers to use them, and the greater availability of X-ray data from COVID-19 patients in public databases has led to the large number of these studies. Articles in the field of medicine often state that CT images give higher performance. However, these high accuracies are not seen in DL-based CAD systems. This may be because radiologists can easily distinguish patients from CT images, whereas CT provides many cross-sections of the same person and is much more complex than X-ray imaging, which is a disadvantage for DL methods. The few studies that combine X-ray and CT images also report good results. Studies tend to address multi-class problems rather than binary classification. Studies carried out with 3D data show lower performance; most work uses 2D data, for which the reported results are very high. Regarding the amount of data, it is known that DL models work stably with a lot of data, and by this measure the amount of data used is not sufficient. This is one of the biggest problems in the detection of COVID-19. It is very important to compare studies using data obtained from many different centers; otherwise, the reported accuracy can be misleading. We therefore recommend that data collected from many different centers be offered to researchers through a single center. In studies conducted with a small amount of data, pretrained models are used to ensure effective model training. In particular, Keras, TensorFlow, PyTorch and MATLAB include these pretrained models. The biggest problem here is that these models are trained on the ImageNet dataset. Because ImageNet contains a large amount of data belonging to very different classes, trust in these models can be reduced. Nevertheless, the performances obtained are very high.

Traditional ML algorithms are also applied to features extracted by DL models. This approach appears to improve performance, and the SVM algorithm is the one generally used. Most of the studies do not use any cross-validation (CV) method. We think that this decreases the reliability of the results because it is not known how the test data were selected: a test set on which performance is very high can be created, which overestimates the results. A CV method in which the whole dataset is tested should therefore be used. Although there are many studies using DL-based methods, it is very difficult to produce sufficiently transparent, stable and reliable models. As stated above, many parameters affect the results.
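
The recommendation above, a CV scheme in which every sample is tested exactly once, corresponds to k-fold cross-validation. The following is a minimal sketch of the index splitting only; the function name `k_fold_indices`, the seed and the sample count are hypothetical, and model training and evaluation are left out.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds.

    Indices are shuffled once with a fixed seed, then dealt into k folds so
    that each sample appears in exactly one test fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        ]
        yield train_idx, test_idx


# Hypothetical dataset of 10 images evaluated with 5-fold cross-validation.
splits = list(k_fold_indices(n_samples=10, k=5))
```

A model would be trained on `train_idx` and evaluated on `test_idx` for each fold, and the reported score would be the mean over the folds. For unbalanced datasets a stratified variant, which keeps the class proportions in each fold, is usually preferable.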

As an alternative to the studies performed on X-ray and CT images, research on the detection of COVID-19 is also carried out with voice- and cough-based acoustic analysis. Alternative approaches are very important in the detection of COVID-19. It may be possible to save more lives with real-time detection and diagnosis systems and with online screening systems installed on mobile devices or computers. By combining all these approaches, namely the RT-PCR test, DL-based analysis of X-ray and CT images, and acoustic examination, it is expected that patients with COVID-19 can be detected more reliably and with higher accuracy. Considering that the epidemic started with one person, it is very important to correctly diagnose even a single person.

7 Future challenges on the detection of COVID-19

Despite the good performance achieved by deep learning techniques with X-ray images and CT scans to detect the virus, there are still many challenges to be addressed. Figure 16 presents the number of modalities used by the reviewed papers. X-ray images are clearly the most used and the most available. Nevertheless, very few researchers have used combined modalities of X-ray and CT images due to the absence of a comprehensive database. Moreover, these images do not provide any additional information about the patients, such as gender and age. Incorporating this information along with manual annotations or ground truths could improve the efficiency of DL-based methods.

Forecasting research is also very limited and will become more challenging, especially because the virus mutations are not known. Hence, predicting its spread, mortality and symptoms, as well as its relation with the weather, remains a very difficult task.

Moreover, since COVID-19 detection is becoming an active area of multidisciplinary research, a large amount of heterogeneous data, including results and analyses, is increasingly available. These data are generally acquired from different sources (e.g., clinical, behavioral, physiological and pharmacological data). To make these data useful, they should be analyzed to extract knowledge. This is one of the most challenging tasks that could be solved using data mining and machine learning techniques. Cinaglia et al. (2019) present some solutions to this challenge in life science, extracting knowledge with machine learning and data mining algorithms for data analysis.

Also, while policy-makers and citizens are doing their best to comply with the difficult constraints of lockdown and social distancing, AI can be used to create more intelligent robots and autonomous machines that help the health workforce and reduce its workload by disinfecting, working in hospitals, distributing food and helping patients. The challenge of this solution is that people lack confidence in autonomous machines and prefer to be served by a human even if there is a risk of virus transmission. Moreover, entrusting chatbots with diagnosing patients requires a large amount of medical data from experts.

Besides, the difference in languages from one country to another makes an already difficult task still more arduous. In addition, voice analysis still poses many challenges. For example, annotated data of patients’ voices are not yet publicly available for research on COVID-19 detection and diagnosis. These data are mostly collected in unconstrained environments (i.e., in-the-wild) using smartphones or other voice recorders. These environments are generally noisy and reverberant, which leads to poor data quality and makes the diagnosis and detection of COVID-19 more challenging. Also, one of the most important future challenges is to further decrease the false negative rate and, as far as practicable, reduce the false positive rate as well, so as to accurately differentiate viral pneumonia from bacterial pneumonia (BP).

Although environments like Google Colab and Amazon have partly addressed the lack of powerful GPU hardware for implementing deep learning architectures, deploying these environments in real medical laboratories is still a very challenging task.

Fig. 16 Number of modalities used to detect or predict COVID-19 using deep learning methods. The rates are computed from the reviewed papers

Fig. 17 Number of pretrained models used to detect or predict COVID-19. The rates are computed from the reviewed papers

8 Conclusion

Although the RT-PCR test is considered the gold standard for COVID-19 diagnosis, reaching a decision with it is time-consuming because of the high false-negative rate of its results. Therefore, medical imaging modalities such as chest X-ray and chest CT scans are the best alternative according to scientists. Chest X-ray radiography has a low cost and a low radiation dose, and it is available and easy to use in general or community hospitals. This review presents a detailed study of the existing solutions, mainly based on DL techniques, for the early diagnosis of COVID-19. This study gives more insight into the thought processes of scientists and decision-makers, not only during the wave periods but also during the vaccination period, which could require real-time mass testing. The lack of data, however, is the main obstacle to achieving efficient and real-time results. Many solutions have been presented and discussed in this review to inform future trends and also future diseases that might suffer from the same missing-data problem. We believe that with more public databases, better DL-based approaches can be developed to detect and diagnose COVID-19 accurately.

According to the statistics provided in Shoeibi et al. (2020), the majority of research is devoted to the detection of COVID-19, whereas, due to a lack of publicly accessible databases, limited research has been done on forecasting. Also, from this review, we can see that many pretrained CNNs have been employed for COVID-19 detection or prediction, as shown in Fig. 17. It is difficult to make a fair comparison between their performances since the methods apply different protocols and combine many different datasets to enrich the DL model. Accordingly, although transfer learning with these pretrained models has shown very high performance, it is not easy to decide which one is more suitable for COVID-19 detection or forecasting.