Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes coronavirus disease 2019 (COVID-19), was first reported in Wuhan, China, in December 2019 [1]. Since then, the virus has spread around the world, and on February 27, 2020, the first patient with COVID-19 was identified in the Netherlands. As of May 1, 2020, 39,791 Dutch patients had tested positive for COVID-19, 10,854 patients had been admitted to hospitals, and 4893 patients had died of COVID-19 [2]. Although the Dutch government enforced measures to enter a mitigation phase and thereby prevent a peak in demand that could exceed hospital capacity, increasing numbers of patients visited the emergency department (ED).

Patients suspected of COVID-19 are recognized and triaged mainly on the basis of presenting clinical characteristics and symptoms [1, 3]. The definitive diagnosis is made by real-time reverse transcription polymerase chain reaction (RT-PCR) on samples obtained from oropharyngeal and nasopharyngeal swabs [4]. Recent studies reported that RT-PCR, the gold standard, has a sensitivity of only 59–71% when compared with repeated testing with the same RT-PCR method [5, 6]. Repeated testing is therefore necessary when clinical suspicion is high. RT-PCR is also a time-consuming procedure, and the required resources are, or can become, scarce during an outbreak.

Large-scale RT-PCR testing can be challenging, and during an outbreak peak it can be infeasible to wait hours for RT-PCR results before identifying, isolating, and treating patients at an ED. Recent studies suggest that chest computed tomography (CT) could play a leading role in triage and aid in the early diagnosis of patients suspected of COVID-19 infection [7]. Studies from China reported high sensitivities (97–98%) for chest CT in the early detection of COVID-19 in patients who later had a positive RT-PCR test result [5, 6]. The National Health Commission of the People’s Republic of China even stated that a diagnosis of COVID-19 could be based solely on chest CT findings [8]. Typical abnormalities seen on chest CT in COVID-19 patients [9, 10, 11] are also seen in patients with an initial false negative RT-PCR result [12].

Earlier attempts have been made to create a machine learning prediction model for COVID-19 based on chest X-ray and additional data, in order to improve on the diagnostic accuracy of conventional chest X-ray alone [13]. This has not yet been done for chest CT.

This study was initiated at the start of an outbreak peak in the Netherlands, when RT-PCR testing capacity was considered a potential bottleneck in the flow of patients at the ED. The aim of this study was to investigate the diagnostic performance of chest CT compared with the first RT-PCR test in adult patients suspected of COVID-19 infection in the ED setting. In addition, we constructed and internally validated a predictive machine learning model based on chest CT and additional data sources, such as laboratory data, for early prediction of COVID-19 infection. The ultimate goal was to assess whether chest CT could substitute for RT-PCR testing in triage during COVID-19 outbreaks in which scarcity of RT-PCR tests would hinder efficient and rapid diagnosis, isolation, and treatment of patients.

Methods

Population and study design

This single-center prospective cohort study was performed at both locations of the Franciscus Gasthuis & Vlietland hospital in Rotterdam and Schiedam, the Netherlands, which has a level 2 trauma center and 48,000 ED visits annually. A waiver of informed consent was granted by the medical ethics committee (MEC-U, W20.076).

Consecutive patients who visited the ED between March 27 and April 20, 2020, were included if they met the following inclusion criteria: (a) age ≥ 18 years; (b) suspected infection with COVID-19 in combination with at least one of the following: (1) new respiratory symptoms persisting for ≤ 2 weeks and present during the last 24 h, (2) oxygen saturation ≤ 94% and/or respiratory rate ≥ 20/min and/or abdominal complaints, or (3) a high clinical suspicion even in the absence of symptoms; and (c) RT-PCR and chest CT performed within 24 h of each other.

Exclusion criteria for this study were (a) previously confirmed COVID-19 infection; (b) instability, defined as a peripheral oxygen saturation < 92% despite 5 L of oxygen and/or a systolic blood pressure < 90 mmHg; (c) principal presentation due to high-energy trauma, thrombolysis, or acute coronary syndrome; (d) pregnancy; and (e) non-interpretable first RT-PCR result.

Study procedures

Patients suspected of COVID-19 were triaged by a nurse, as part of routine care, in a triage tent set up specifically for the crisis. The Manchester Triage System [14] was used to triage patients; additional symptoms related to possible COVID-19 were assessed and vital signs were recorded. Thereafter, arterial blood gas samples, blood samples, and nasopharyngeal swabs were obtained from all patients. Each swab was used to sample first the oral cavity and subsequently the nasal cavity. After the swabs were obtained, patients immediately underwent chest CT, after which the physician took the medical history and performed a physical examination.

Chest CT and blood results followed within 60 min. Based on the patient’s clinical condition and the test results, a decision was made whether to admit the patient. RT-PCR results mostly followed after 5–12 h. Because of scarce resources, repeated RT-PCR testing, preferably on sputum samples, was performed only in admitted patients with a persisting high suspicion of COVID-19 despite earlier negative RT-PCR results.

Patient data

Data on additional variables were extracted from the electronic medical records of all patients included in the study. These variables encompassed demographic information, ED triage information, COVID-19 presenting symptoms and vital signs, laboratory, microbiology, and CT results, and treatment-related variables (Online Resource 1).

Chest CT, RT-PCR, and laboratory assays

All chest CT images were obtained with patients in the supine position on one of five CT scanners (Canon Aquilion One Genesis, 320 slices; Canon Aquilion Prime, 80 slices; Philips Brilliance 64, 64 slices; Philips Big Bore, 16 slices; Siemens Symbia T16, 16 slices). Twenty board-certified radiologists were trained to read the CT images and classify them according to the CO-RADS classification, which was recently created by the Dutch Association of Radiologists (NVVR) [15]. Examples of CT images of all five CO-RADS categories are shown in Fig. 1. A CO-RADS score of 1–3 was classified as non-COVID-19, whereas a CO-RADS score of 4 or 5 was classified as COVID-19 positive (Online Resource 2). A standardized reporting format was developed. Two independent radiologists were consulted in case of any doubt about the classification.

Fig. 1

a CO-RADS 1: A few fibrotic bands in the lower lobes. No evidence of infection. RT-PCR−. b CO-RADS 2: Bronchial wall thickening, small centrilobular nodules, and tree in bud abnormalities in the left upper lobe. Consistent with bronchiolitis. RT-PCR−. c CO-RADS 3: Consolidation with surrounding ground glass opacity in right upper lobe. RT-PCR−. d CO-RADS 4: Bilateral areas of patchy ground glass opacity with associated small peribronchovascular consolidations. Predominantly central distribution. RT-PCR+. e CO-RADS 5: Bilateral peripheral ground glass abnormalities with areas of associated consolidation. RT-PCR+

RT-PCR was performed according to the national reference method that was established after international collaboration [16], or with the CE-IVD GeneFinder™ COVID-19 Plus RealAmp Kit on the sample-to-result platform ELITe InGenius®.

Hematology tests were performed on Beckman Coulter DxH-800 and Sysmex XN-10 analyzers. Clinical chemistry tests were performed on Abbott Architect c-8000 and i-2000 platforms. Blood gas analyses were performed using Werfen GEM-4000 and Siemens RAPIDlab 1265 analyzers. Siemens CS-2100i and CS-5100 analyzers were used to determine fibrinogen concentrations.

Groups

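With RT-PCR as reference, four groups were identified by combining chest CT and RT-PCR outcomes: true positives (TP), with a positive RT-PCR and a positive chest CT; false positives (FP), with a negative RT-PCR and a positive chest CT; true negatives (TN), with a negative RT-PCR and a negative chest CT; and false negatives (FN), with a positive RT-PCR and a negative chest CT.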
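
As an illustration, the grouping can be sketched in a few lines of Python. The function names and the example patients are hypothetical and are not taken from the study's own code; the cutoff (CO-RADS 4 or 5 counts as CT-positive) is the one stated for the CO-RADS classification in this section.

```python
def ct_positive(co_rads: int) -> bool:
    """Dichotomize a CO-RADS score: 4 or 5 counts as CT-positive."""
    if co_rads not in (1, 2, 3, 4, 5):
        raise ValueError(f"CO-RADS score must be 1-5, got {co_rads}")
    return co_rads >= 4


def classify(co_rads: int, rt_pcr_positive: bool) -> str:
    """Return TP, FP, TN, or FN with the first RT-PCR as reference."""
    ct = ct_positive(co_rads)
    if ct and rt_pcr_positive:
        return "TP"
    if ct and not rt_pcr_positive:
        return "FP"
    if not ct and rt_pcr_positive:
        return "FN"
    return "TN"


# Hypothetical patients: (CO-RADS score, first RT-PCR positive?)
patients = [(5, True), (4, False), (1, False), (2, True)]
groups = [classify(score, pcr) for score, pcr in patients]
print(groups)
```

Keeping the dichotomization in its own function makes it straightforward to explore a different cutoff without touching the grouping logic.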

Statistical analysis

Statistical analysis was performed using SPSS version 26 and the scikit-learn machine learning library for Python version 3.7.

Characteristics of patients were summarized using the mean (± SD) or median (IQR) for continuous variables and counts and percentages for categorical variables.

Normality of variables was tested using Shapiro–Wilk tests. Normally distributed variables were compared using unpaired t-tests, whereas non-normally distributed variables were compared using Mann–Whitney U tests. Nominal variables were compared using Pearson’s chi-square tests.
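
The test-selection logic can be sketched with SciPy as follows. This is a minimal illustration, not the SPSS procedure used in the study: the significance level of 0.05, the helper names, and all data values are assumptions made for the example.

```python
from scipy import stats

ALPHA = 0.05  # assumed significance level; not stated in the text


def compare_continuous(a, b, alpha=ALPHA):
    """Compare two groups, choosing the test from Shapiro-Wilk normality checks."""
    _, p_norm_a = stats.shapiro(a)
    _, p_norm_b = stats.shapiro(b)
    if p_norm_a > alpha and p_norm_b > alpha:
        # Both samples are compatible with normality: unpaired t-test
        _, p = stats.ttest_ind(a, b)
        return "t-test", p
    # Otherwise fall back to the rank-based Mann-Whitney U test
    _, p = stats.mannwhitneyu(a, b, alternative="two-sided")
    return "Mann-Whitney U", p


def compare_nominal(table):
    """Pearson chi-square test on a contingency table (no Yates correction)."""
    _, p, _, _ = stats.chi2_contingency(table, correction=False)
    return p


# Hypothetical, strongly right-skewed marker values in two groups
group_a = [i ** 3 for i in range(1, 41)]
group_b = [2 * i ** 3 for i in range(1, 41)]
test_name, p_value = compare_continuous(group_a, group_b)

# Hypothetical 2 x 2 table: rows = RT-PCR result, columns = symptom present/absent
p_symptom = compare_nominal([[30, 10], [20, 40]])
print(test_name, p_value, p_symptom)
```

Note that `chi2_contingency` applies a continuity correction to 2 × 2 tables by default; it is switched off here so that the result corresponds to the uncorrected Pearson statistic.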

Diagnostic performance of chest CT was assessed as diagnostic accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+), and negative likelihood ratio (LR−), taking the first RT-PCR result as reference. These measures are summarized as proportions with (exact) 95% confidence intervals (CI). A receiver operating characteristic (ROC) curve was constructed using logistic regression.
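
For readers unfamiliar with exact intervals, the sketch below computes a Clopper–Pearson interval for a proportion using only the Python standard library. It assumes that "exact" refers to the Clopper–Pearson method, which the text does not state explicitly; the function names and the example counts (45 successes in 50 trials) are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); adequate for a few hundred trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k, n, level=0.95, tol=1e-10):
    """Clopper-Pearson interval for k successes in n trials, found by bisection."""
    alpha = 1 - level

    def solve(too_low):
        # Bisection on [0, 1]; too_low(p) is True while p lies below the limit
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if too_low(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: p at which P(X >= k) equals alpha/2 (0 when k == 0)
    lower = 0.0 if k == 0 else solve(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper limit: p at which P(X <= k) equals alpha/2 (1 when k == n)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper


# Hypothetical example: 45 of 50 reference-positive patients detected
low, high = exact_ci(45, 50)
print(f"{45 / 50:.1%} ({low:.1%} to {high:.1%})")
```

The interval is asymmetric around the point estimate, which is expected for proportions close to 0 or 1.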

A prediction model for COVID-19 diagnosis based on radiological, laboratory, and clinical data was constructed using logistic regression in scikit-learn, with the aim of improving on the diagnostic accuracy of CT alone. Again, the first RT-PCR was taken as reference. Regression analysis was performed on complete cases (no missing variables). Because some patients had missing variables, fewer patients could be used to train the model than to evaluate chest CT performance. Variables were checked for multicollinearity by calculating variance inflation factors (VIF). A VIF higher than 4 was considered unsatisfactory because it indicates high collinearity [17]. The model was validated using 10-fold cross-validation, and a ROC curve was obtained from the logistic regression probabilities.
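
The modeling steps can be sketched as below. This is an illustration under stated assumptions rather than the study's code: the data are synthetic (generated with `make_classification`), the estimator settings are scikit-learn defaults apart from `max_iter` (the defaults include L2 regularization, and the text does not say which settings were used), the folds are stratified and shuffled by choice, and the AUC is computed from out-of-fold probabilities, which may differ from how the reported ROC curve was obtained.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict, cross_val_score

# Synthetic stand-in for complete cases (NOT patient data): 300 rows, 6 predictors
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_redundant=0, random_state=0)


def vif(features):
    """VIF per column, taken from the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(features, rowvar=False)
    return np.diag(np.linalg.inv(corr))


vifs = vif(X)
keep = vifs <= 4          # threshold named in the text
X_kept = X[:, keep]       # drop predictors with unsatisfactory collinearity

model = LogisticRegression(max_iter=1000)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Accuracy per fold, reported as mean and standard deviation
acc = cross_val_score(model, X_kept, y, cv=cv, scoring="accuracy")

# Out-of-fold predicted probabilities for the positive class, used for the AUC
proba = cross_val_predict(model, X_kept, y, cv=cv, method="predict_proba")[:, 1]
auc = roc_auc_score(y, proba)

print(f"accuracy {acc.mean():.2f} ± {acc.std():.2f}, AUC {auc:.3f}")
```

Computing the VIF from the inverse correlation matrix is equivalent to regressing each predictor on all others and taking 1 / (1 − R²).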

Results

Demographic and clinical patient characteristics

Approximately 2100 patients presented to the ED during our inclusion period. Both a nasopharyngeal swab and a chest CT were obtained from 404 of these patients. Of these, 85 patients retrospectively did not meet the case definition and were excluded, leaving 319 patients eligible for analysis (Fig. 2). Of these patients, 186 had a negative RT-PCR result and 133 had a positive RT-PCR result. There were no significant differences in the presence of comorbidities (Table 1). Fever, coughing, dyspnea, myalgia, malaise, and diarrhea were the symptoms experienced most often by RT-PCR positive patients, whereas RT-PCR negative patients more often experienced a sore throat, were more often current smokers, and had less contact with confirmed COVID-19 patients. No differences were found for risk factors such as obesity, recent travel to high-risk areas, or employment in the healthcare sector. At admission, average temperature and respiratory rate were higher and median saturation levels were lower in patients with a positive RT-PCR result. Finally, patients with a positive RT-PCR result were more likely to be admitted, before RT-PCR results were known, than patients with a negative RT-PCR result.

Fig. 2

Flowchart of study with included and excluded patients

Table 1 Characteristics of COVID-19 suspected patients at the ED

Performance of CO-RADS score in the diagnosis of COVID-19 compared with first RT-PCR

Table 2 shows the performance characteristics of chest CT using the CO-RADS score, compared with the first RT-PCR. Figure 3 shows the ROC curve, with an area under the curve (AUC) of 0.914 (0.879–0.949), and Fig. 4 shows the percentage of RT-PCR positive and negative patients per CO-RADS category: 95% of patients with a CO-RADS score of 1 had a negative first RT-PCR, whereas 90% of patients with a CO-RADS score of 5 had a positive RT-PCR result. Of all patients, 4.1% were false negative on CT and 6.9% were false positive. In this cohort, with a COVID-19 prevalence of 41.7%, chest CT with CO-RADS scoring yielded a PPV of 84.5% and an NPV of 92.7%.
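
These figures can be reproduced from the 2 × 2 table implied by counts reported elsewhere in the text: 133 RT-PCR positive and 186 RT-PCR negative patients, of whom 13 were FN and 22 were FP, which leaves 120 TP and 164 TN. The short sketch below recomputes the measures from these derived counts; the variable names are illustrative.

```python
# Counts derived from the text: 133 RT-PCR positive (13 FN), 186 RT-PCR negative (22 FP)
tp, fp, fn, tn = 120, 22, 13, 164
total = tp + fp + fn + tn

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
accuracy = (tp + tn) / total
prevalence = (tp + fn) / total
lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio

print(f"sensitivity {sensitivity:.1%}, specificity {specificity:.1%}")  # 90.2%, 88.2%
print(f"PPV {ppv:.1%}, NPV {npv:.1%}, prevalence {prevalence:.1%}")     # 84.5%, 92.7%, 41.7%
print(f"FN share {fn / total:.1%}, FP share {fp / total:.1%}")          # 4.1%, 6.9%
```

Because PPV and NPV depend on prevalence, the values above apply to a cohort in which 41.7% of patients are RT-PCR positive and would shift in a setting with a different prevalence.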

Table 2 Performance of CO-RADS score in classification of COVID-19 patients

Fig. 3

ROC curve of CO-RADS score for COVID-19 diagnosis, taking the first RT-PCR result as a reference. AUC 0.914 (0.879–0.949)

Fig. 4

PCR results for each separate CO-RADS category

Differences between groups as classified using chest CT (CO-RADS) scoring

FN patients experienced fever, coughing, dyspnea, chest pain, and malaise more often than TN patients (Table 1). The FN group contained the highest number of obese patients. Most current smokers were TP. Patients in the FP group had a longer median duration of complaints than patients in the other three groups (11 vs 7 days). Interestingly, FN patients presented more often with CO-RADS 1 than with CO-RADS 2 or CO-RADS 3.

Vital signs and laboratory results on admission also differed between groups. Both TP and FN patients needed oxygen more often, and in larger amounts. Elevated levels of ferritin, procalcitonin, lactate dehydrogenase (LDH), C-reactive protein (CRP), and creatine kinase (CK), as well as leucopenia, lymphopenia, and neutropenia, were seen most often in TP patients and least often in TN patients. Of the TP patients, 80.8% were admitted, followed by FN (76.9%), FP (63.6%), and TN (59.1%).

A predictive machine learning model based on chest CT and additional data sources

Our prediction model is based on the CO-RADS score from chest CT complemented with the following additional variables: ferritin, leucocyte count, CK, the presence of diarrhea, and the number of days since onset of disease. Figure 5 shows the ROC curves of CO-RADS alone and the prediction model. For performance characteristics see Online Resource 3, and for VIF scores of the variables see Online Resource 4. The prediction model has a significantly higher specificity than the CO-RADS score alone in categorizing COVID-19 patients (0.934 vs 0.886), with comparable sensitivity (0.910 vs 0.893). K-fold cross-validation (k = 10) was performed on this prediction model to check for overfitting. This resulted in an accuracy of 0.91 ± 0.10 and an adjusted R² of 0.652.

Fig. 5

ROC curves of CO-RADS alone (green line) and the prediction model (yellow line). Accuracy in 10-fold cross-validation is 0.91 ± 0.10. The AUC is 0.920 for CO-RADS alone and 0.953 for the prediction model

Discussion

In this prospective observational study during the peak of the COVID-19 outbreak in the Netherlands, 319 patients with suspected COVID-19 presenting to the ED were assessed by chest CT and nasopharyngeal swabs. With a sensitivity of 90.2% and a specificity of 88.2%, chest CT performed relatively well as a diagnostic modality compared with the first RT-PCR performed in COVID-19 suspected ED patients. Additionally, we constructed a machine learning prediction model using variables that were readily available during the ED visit. Ferritin, leucocyte count, creatine kinase, presence of diarrhea, and the number of days since the start of complaints were predictors of COVID-19 infection in addition to the CO-RADS classification. With an AUC of 0.953, our model showed excellent performance, implying that it could further aid clinicians in the early recognition and diagnosis of COVID-19.

The sensitivity of chest CT using CO-RADS scores in our study (90.2%) was only slightly lower than that in the first European study, by Caruso et al., who used repeat RT-PCR as reference and reported a sensitivity, specificity, and diagnostic accuracy of chest CT for COVID-19 of 97%, 56%, and 72%, respectively [18]. Other studies performed in China, which classified chest CT as positive based on consensus among radiologists, reported higher sensitivities for chest CT of 97% [6] and 98% [5]. Our slightly lower sensitivity could be an underestimation, as 4/22 FP patients later tested positive for COVID-19 (repeated RT-PCR or SARS-CoV-2 serology) during admission. It is plausible that these 4 patients were initially misclassified because RT-PCR tests are prone to sampling errors related to the quality and method of sampling [16]. Repeated testing is recommended to decrease the number of false negative RT-PCR results, as previous studies showed that patients with initially negative RT-PCR results can turn positive over time [12, 19, 20]. Because of limited RT-PCR testing capacity during the peak of the outbreak, repeated testing was only available and deemed meaningful for admitted patients with a high clinical suspicion of COVID-19 infection, meaning that the remaining 18 FP patients were never retested.

Our FN group consisted of 13 patients. In the setting of this study, we decided not to perform repeat chest CTs unless necessary for other medical reasons. We therefore cannot determine whether, and how many, patients later developed chest CT features matching a CO-RADS 4 or 5 classification. Bernheim et al. [21] studied possible causes of misinterpretation of chest CTs in COVID-19 positive (RT-PCR) patients and found that 56% of patients scanned early after symptom onset (0–2 days) and 9% of patients in the intermediate group (3–5 days of symptoms) had a normal chest CT at that moment. Another study found that chest CT abnormalities are mostly found after symptoms have persisted for about 10 days [22]. The fact that 4/13 FN patients had had complaints for fewer than 6 days could therefore account for part of the FN results.

The added value of our prediction model, compared with chest CT alone, lies mainly in the reduction of false positives, which increases specificity, PPV, and LR+. This is probably because certain features of COVID-19 (e.g., high ferritin levels, leucopenia, and high CK) are not present in other pathologies that result in high CO-RADS scores. In this study, the accuracy of the predictive model was higher than that of chest CT alone (93.1% vs 90.4%). In addition, 30% of RT-PCR negative patients and 21% of RT-PCR positive patients were correctly classified without generating FN or FP results when cutoff thresholds with 100% sensitivity and 100% specificity, respectively, were used.
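
The idea behind these two cutoffs can be sketched as follows. The function name and the predicted probabilities are hypothetical and chosen only to show the mechanics: a rule-out cutoff is placed at the lowest probability among RT-PCR positive patients (100% sensitivity), and a rule-in cutoff at the highest probability among RT-PCR negative patients (100% specificity).

```python
def safe_zones(probs, labels):
    """Find rule-out and rule-in cutoffs and the share of patients each one settles."""
    pos = [p for p, y in zip(probs, labels) if y]
    neg = [p for p, y in zip(probs, labels) if not y]
    rule_out = min(pos)  # no positive lies below this value: 100% sensitivity
    rule_in = max(neg)   # no negative lies above this value: 100% specificity
    neg_ruled_out = sum(p < rule_out for p in neg) / len(neg)
    pos_ruled_in = sum(p > rule_in for p in pos) / len(pos)
    return rule_out, rule_in, neg_ruled_out, pos_ruled_in


# Hypothetical model probabilities; label 1 = first RT-PCR positive
probs = [0.02, 0.05, 0.10, 0.30, 0.60, 0.15, 0.40, 0.70, 0.90, 0.97]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

rule_out, rule_in, neg_share, pos_share = safe_zones(probs, labels)
print(rule_out, rule_in, neg_share, pos_share)  # 0.15 0.6 0.6 0.6
```

Cutoffs selected on the same data on which they are evaluated tend to be optimistic, so the proportions they yield would need confirmation in an independent cohort.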

A strength of our study is that we constructed a machine learning predictive model using readily available ED variables and CT characteristics, something that had previously been done with chest X-rays but not with chest CT [13]. Conventional chest X-ray can be limited, since pulmonary lesions can be ambiguous or absent on X-ray while already visible on chest CT the same day [23]. Our prediction model showed that, in addition to the CO-RADS score, ferritin, leucocyte count, creatine kinase, diarrhea as a symptom, and the number of days since onset of disease enhanced the prediction capacity for COVID-19 infection. An association between the laboratory parameters included in our model and COVID-19 has also been reported in other studies [1, 3], and our findings confirm these observations. Our model’s parameters were found to have acceptable VIFs (Online Resource 4); the model has an adjusted R² of 0.652 and a 10-fold cross-validation accuracy of 0.91 ± 0.10.

Finally, our study has several limitations. First, this was a single-center study. Second, only the first RT-PCR result was used as reference, as repeated RT-PCR was performed only in admitted patients with a persistent high suspicion of COVID-19. Third, we did not perform repeated chest CTs in FN patients unless necessary for other medical reasons.

Conclusion

Chest CT, using the CO-RADS scoring system, is a sensitive and specific method that can aid in the diagnosis of COVID-19 at the ED during the peak of an outbreak, especially if RT-PCR tests are scarce. Combining chest CT with a predictive machine learning model could further improve the accuracy of diagnostic chest CT for COVID-19. Large prospective studies should further analyze additional candidate predictors that could improve the performance of this machine learning prediction model. Still, 8–9% of patients in our cohort with RT-PCR positive results were classified as negative by chest CT and our prediction model. Therefore, RT-PCR remains indispensable in the diagnosis of COVID-19 and should remain the primary standard of testing.