Detection of COVID-19 cases’ accuracy is posing a conundrum for scientists, physicians, and policy-makers [1, 2]. As of April 23, 2020, 2.7 million cases have been confirmed, over 190,000 people are dead, and about 750,000 people are reported recovered [3]. Yet, there is no publicly available data on tests that could be missing infections. Complicating matters and furthering anxiety are specific instances of false-negative tests [4].

Meanwhile, global tests per 1 million population by country vary: USA (11,067), UK (6783), Italy (21,598), France (7103), Spain (19,896), and Canada (13,452) [3]. The accuracy of reported COVID-19 cases has never been more urgent especially given consideration of phased reopening of economies [5, 6]. Thus, to improve the accuracy of cases, we leverage a deep learning (DL) approach that helps minimize detection bias of COVID-19 scans. Deep learning (DL) is a special type of artificial neural network (ANN) inspired by the human cognition system. In recent times, DL approaches have gained enormous research attention due to their excellent ability to learn underlying patterns and features from image databases and subsequently make predictions on new and unseen data. Therefore, we predict the accuracy of COVID-19 cases based on a novel artificial intelligence (AI) algorithm that uses a deep learning model to automatically and appropriately classify X-ray chest scans of COVID-19 versus non-COVID-19 images.

Methods

We developed a deep learning model to improve the accuracy of reported cases and to precisely predict the disease from chest X-ray scans. Our model relied on convolutional neural networks (CNNs) to detect structural abnormalities and disease categorization that were keys to uncovering hidden patterns. To do so, a transfer learning approach was deployed to perform detections from the chest anterior-posterior radiographs of patients. We used publicly available datasets to achieve this [7, 8].

Deep learning

One typical application of DL in radiology practice is detecting structural abnormalities and disease categorization. In particular, convolutional neural networks (CNNs) are proven to be very effective techniques in detecting abnormalities and pathologies in chest X-ray scans [8,9,10].

Due to the recent outbreak of COVID-19 worldwide, the demand for efficient and accurate automatic detection networks has risen sharply. Chest X-ray radiography (CXR) is a commonly used imaging modality in the primary COVID-19 screening process. It is faster, simpler, cheaper, and less harmful than X-ray computed tomography (CT). However, detection of the COVID-19 disease requires manual interaction from expert radiologists and at times the structural abnormalities from the patient CXRs are not visible to the human eye [11].

Additionally, due to the growing number of patients and lack of clinical staff, accurate and automatic detection of the COVID-19 disease from the patient CXRs may be of significant value. Therefore, in this study, we developed a CNN-based model to perform COVID-19 detections from chest radiographs (Fig. 1).

Fig. 1
figure 1

Model architecture based on pre-trained weights of VGG-19 classifier model. Trainable fully connected layers (FCN) allow the network to learn the underlying patterns from CXRs with respect to the label of images given in the training dataset

Network architecture

One major limitation of training CNNs for COVID-19 detection is the lack of publicly available and expert labeled images. Therefore, in this study, a transfer learning approach was deployed to design a CNN-based model to perform detections from the chest anterior-posterior radiographs of patients. Previous studies in the literature have reported that transfer learning, despite using a different dataset, is an effective low-level feature detector and it has the potential to reduce the risk of overfitting [12, 13].

The proposed model was a modification of the VGG-19 classifier previously trained on ImageNet, a very large dataset containing more than one million images [14]. A trainable multilayer perceptron was added on top of the pre-trained VGG-19 model to train and perform detections on our datasets (Fig. 1)

Datasets

We use publicly available frontal chest X-ray images from 181 patients [7, 8]. The dataset consisted of patient scans from Italy, Taiwan, China, Australia, Israel, among other locations and was labeled as positive COVID-19 detection from expert radiologists. The images were collected from a collection of recently published papers and articles as a global effort to encourage widespread and collaborative research in COVID-19 detection [7, 8]. Images in this dataset contain additional information with regards to the patients’ age, gender, survival, and location. Moreover, the healthy chest X-ray scans used for training in this study were extracted from the public chest X-ray database provided by the NIH clinical center [8]. We randomly selected 364 images from the pool of patient scans (> 10,000) labeled as “no finding” to achieve an overall healthy-COVID-19 ratio of approximately 1:2.

Network training

To get results, first, we clone COVID-19 and non-COVID-19 X-ray scan data. The parameters of this data include patient’s ID, sex, age, diagnosis (COVID-19 or not), survival (yes or no), scan view (e.g., posteroanterior, supine anteroposterior), timestamp, and location. Next, we initialize the variables for training by setting initial learning to 0.001 for the Adam optimizer, epochs to 100, and batch size to 15. We split train, validation, and test data in the ratio of 80:20:20 (number of healthy, COVID-19 scans: train (233, 115), validation (56, 32), test (75, 34)). Prior to training, the patient scans were normalized to range (0, 1) and resized to 512 × 512 pixel dimensions. We trained the model on a Tesla K80 GPU (Nvidia, Santa Clara, USA) using early stopping and learning rate reduction on plateau.

Results

COVID-19 patients’ age ranged from 10 to 90 years; offset, that is the number of days since the start of symptoms or hospitalization for each image, ranged from zero to 35 days; just over 65 of the patients identified as male, and approximately 45 identified as female. While we acknowledge the limited sample size, readers may want to appreciate the challenges and sensitivity associated with accessing X-ray scans of COVID-19 patients. Despite these limitations, our results offer very high accuracy (96.3%) and loss (0.151 binary cross-entropy) using the public dataset consisting of patients from different countries worldwide (Table 1). As the confusion matrix below (Fig. 2) indicates, our model is able to accurately identify true negatives (74) and true positives (32); this deep learning model identified three cases of false-positive and one false-negative finding from the healthy patient scans.

Table 1 Findings from the test set using predicted by the proposed deep learning model. Precision, true positives/(true positives + false positives); Recall, true positives/(true positives + false negatives); F1-score, combination of precision and recall using the harmonic mean (overall classification accuracy).
Fig. 2
figure 2

Confusion matrix of the model predictions from the test scans (healthy (75): Covid-19 (34) cases). Proposed model achieved 105/109 correct classification (3 false positives, 1 false negative)

Discussion

Our research examines a method to improve accuracy in COVID-19 detection. The authors deploy a transfer learning approach to design a CNN-based model to perform COVID-19 detections from the chest anterior-posterior radiographs of patients. We highlight two key findings: (1) our COVID-19 detection model minimizes manual interaction dependent on radiologists as it automates the identification of structural abnormalities in patient’s CXRs and (2) our deep learning model is likely to detect true positives and true negatives and weed out false positive and false negatives with > 96.3% accuracy. In summary, our automated deep learning model reduces reliance on the human eye and offers a faster, simpler, and more accurate method for COVID-19 detection.

Our analysis has unique strengths. Our AI-based model brings forth the possibility of an automated COVID-19 detection mechanism that might impact public policy and scientific research. As we focus on detection bias at the level of individual cases and create an AI mechanism to weed out false positives and false negatives, we offer an automated alternative that reduces stress on public health infrastructure (X-ray scans). Our AI-based detection model can be rapidly deployed as it leverages small business infrastructure (X-ray scans) that maybe sitting idle due to the lockdown. Our AI mechanism bridges public and private infrastructure ends as it helps idle clinics reposition themselves as tertiary testing centers as they pivot to testing rapidly by retooling X-ray scan facilities offering a lifeline to public hospitals. The process may work as follows: (1) Patient visits these tertiary centres, reducing pressure on existing COVID-19 testing centers and hospitals; (2) scans are undertaken at tertiary centres; (3) scans land in a safe digital repository (e.g., a local university); (4) false positive and false negatives are weeded out by the AI deep learning model; and (4) preliminary analysis is sent to a government COVID-19 centre, who can recommend appropriate action.

Looking into the future, our AI-model extends support to large scale studies that aim to understand the presence of “invisible cases,” for instance, research focused on knowing how many people without obvious symptoms could be infected by the virus [15]. Our automated approach can be a cheaper, faster, and simpler mechanism to support efforts to know the proportion of the population that may have already had the virus. Scientists and health policy-makers can leverage the > 96.3% accuracy of our deep learning model as they undertake large-scale randomized testing to investigate the extent to which the virus might have penetrated not just respective societies but also high-risk clusters within a given society. An automated deep learning COVID-19 detection approach will quicken our understanding of the underlying immunity developing across communities while we make decisions about public life in a pandemic.

Nevertheless, the authors acknowledge the limitations of this research. First, our deep learning model relies on only 181 COVID-19 cases (and 115 used for training) to arrive at a conclusion regarding the accuracy of our method. It will help if we have access to more scans of COVID-19 X-rays and incorporate scans with wider demographics. Moreover, if we are looking to deploy this AI approach to make policy decisions about reopening economies, we may need access to behavioral characteristics (e.g., work, education, and leisure patterns) to understand why some people may be invisible—they do not have symptoms but have the virus. Additionally, this research has leaned upon recent literature that may still not be peer-reviewed. However, the tradeoff under such difficult times is between extensive peer-reviews and rapid publishing of early indicators. Finding a balance between the two may be more of an art than a science.

Despite these limitations, our goal is to add clarity to the scientific discussion around a crisis that most people did not see coming but now find themselves deeply enmeshed in. As we contribute to the ongoing debate on detection bias and present an AI-based mechanism before the scientific community, we hope that policy-makers will take note of alternatives such as our deep learning model to find a path out of the lockdown.