BACKGROUND

Identifying people who are infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of COVID-19 disease, is an important component of limiting its spread in the community.1 The detection and isolation of community-based people who are infected with the virus should significantly decrease spread of COVID-19 disease.2 Appropriate testing returns essential information regarding the pandemic.3

Being able to accurately predict the risk of SARS-CoV-2 infection would be helpful. First, resources to test for SARS-CoV-2 may be limited,4,5 making the selection of people for testing essential to infection control.1 In such situations, selectively testing people with higher infection risks will maximize the number of identified cases for a testing capacity. Second, expanding SARS-CoV-2 testing to measure its transmission and current status would benefit from risk stratification to maximize case identification.

Several models to predict SARS-CoV-2 infection probability have been published. Sun et al.6 studied 788 community-based people in Singapore between January 26 and February 16, 2020 with a SARS-CoV-2 infection rate of 6.8% to generate a risk prediction model including age, symptomatic information, and vital signs. Their model had limited discrimination (c-statistic = 0.65), its calibration was not measured, and no model validation was reported. Meng et al.7 used 620 cases and controls to generate a model having age, sex, and a panel of 35 laboratory tests to predict SARS-CoV-2 infection. They cited excellent discrimination (c-statistic = 0.87) in a validation population of 145 people (having SARS-CoV-2 incidence of 55.2%). However, the model’s requirement for laboratory information limits its applicability to community screening since a venipuncture and laboratory processing time would be required (i.e., a 2-phase screening process). Menni et al.8 derived and validated a model based on the self-reported symptoms and test results of 18,401 people. The model had good discrimination (c-statistic 0.75) and found important associations between infection risk and anosmia, anorexia, and fatigue.

In this study, we derived and assessed a model that returns the probability of SARS-CoV-2 infection in community-based people.

METHODS

Study Setting, Study Cohort, and Data Collection

This study used data from The Ottawa Hospital (TOH) and was approved by The Ottawa Hospital Research Ethics Board. TOH opened a community-based COVID-19 testing center on 13 March 2020 to facilitate community testing for SARS-CoV-2. The center conducted an average (standard deviation) of 229 (73.1) tests per day. Test resulting increased rapidly; in the first week of the clinic, the median time to resulting was 11 days (interquartile range 3–12); by the start of April 2020, 98.7% of tests were resulted within 24 h. Both nasopharyngeal and throat samples were used based on testing availability of the former; testing methodology was not influenced by other factors. Time to test result was not independently associated with the likelihood of SARS-CoV-2 infection.

Our study included all people who were tested between 13 March (when the clinic opened) and 21 April 2020 (final day that data were complete when model development started). People presenting to the testing center were questioned at two separate times by different registered nurses. Testing screening criteria were based on current Ontario Public Health testing guidelines and were used to increase testing coverage in populations where case identification was a policy priority (i.e., healthcare workers) while avoiding unnecessary testing in the general population who were asymptomatic or whose symptoms could be related to non-COVID-19 disease. Clinic screening questions elicited the presence of symptoms including rhinorrhea; fever symptoms including rigors, chills, perceived fever, or documented fever at home or at the screening clinic; cough; and shortness of breath. People with any of these symptoms underwent testing. Testing also occurred if any infection risk factor was present including close contact with a person with known or presumed COVID-19 disease or recent travel outside of Canada (Appendix A). In the absence of these indications, healthcare workers (or people cohabitating with a healthcare worker) were tested if they had symptoms of sore throat, sputum production, or rhinorrhea. Finally, testing could be approved in the event of extenuating circumstances or if the person was referred to the screening clinic by public health officials for testing. Explicit criteria were not used for each factor; we instead relied upon the clinical judgment of screening personnel. Answers to all screening questions were registered in an electronic medical record along with the person’s age and sex.

Testing for SARS-CoV-2 Infection

Testing for severe acute respiratory syndrome coronavirus 2 virus (SARS-CoV-2) was performed using the Allplex 2019 nCoV reverse transcriptase polymerase chain reaction (RT-PCR) that targeted the RNA-dependent RNA polymerase, E and N genes (Seegene, South Korea). Tests were classified as positive for SARS-CoV-2 infection if at least one of any of the three “targets” were detected. Prior to implementation, the RT-PCR was evaluated by testing 113 clinical samples that were negative for other respiratory viruses. As well, a total of 191 clinical specimens (including 172 negative and 19 positive samples negative for SARS-CoV-2 by our RT-PCR) were tested in parallel by the Public Health Ontario Laboratory (PHOL) using an in-house developed RT-PCR. Based on this limited evaluation, the sensitivity and specificity of our testing methodology were 100% and 99% respectively (one sample was indeterminant by the Allplex 2019 nCoV RT-PCR and negative by PHOL).

Model Covariables

We clustered together individual covariables described above that were strongly associated with each other and were thematically related. This resulted in the creation of cluster variables for fever symptoms (deemed present if patient reported feverishness, chills, rigors, or a fever was documented at home or at the screening clinic) and chest symptoms (deemed present if patient reported any cough or shortness of breath). We also included an age-fever interaction because inflammatory exuberance decreases as people age9 and we reasoned that the predictive capacity of these symptoms might change with age.

We wanted our model to account for baseline SARS-CoV-2 prevalence since this would importantly influence the likelihood a particular person tests positive. We summarized this as the recent case detection rate, calculated as the proportion of tests from our testing clinic in the previous 3 days that were SARS-CoV-2 positive:

$$ \frac{\#\mathrm{tests}\ \mathrm{in}\ \mathrm{previous}\ 3\ \mathrm{days}\ \mathrm{in}\mathrm{dicating}\ \mathrm{SARS}-\mathrm{Cov}-2\ \mathrm{infection}}{\mathrm{total}\ \mathrm{number}\ \mathrm{of}\ \mathrm{tests}\ \mathrm{in}\ \mathrm{previous}\ 3\ \mathrm{days}} $$

Model Derivation

Each person’s demographic data and screening question answers were linked to SARS-CoV-2 test results. If individuals underwent testing more than once, data from their first test only were included in the model. There were no missing data for model covariables.

All analyses used SAS 9.4 and R 3.3.3. Multivariable binary logistic regression was used to measure the independent association of each term with detection of SARS-CoV-2. Since we had a limited number of covariables and we had enough cases in the sample to include them all, we did not conduct any variable selection for the model (thus avoiding the possibility of “testimation bias”10). Recent case detection rate was expressed in the model as the natural logarithm of the odds (i.e., ln(p/[1 − p])). To ensure that this represented the probability of SARS-CoV-2 positivity in the ‘average’ person, we centered all other covariables in the model. All continuous variables (age, age-fever interaction, transformed recent case detection rate) were expressed using restricted cubic splines having 5 knots (using knot positions suggested by Harrell11).

When predicting infection likelihood, the logistic model intercept returns infection probability of a subject whose model covariates all have a value of 0. The intercept is necessary to calculate expected probabilities but its value changes with infection prevalence. This makes intercept-based logistic models impractical for predicting infection likelihood when its prevalence varies, as in our study. We therefore did not include an intercept in our logistic model knowing that “recent case detection rate” would capture baseline prevalence. This ensured that each person’s expected probability of being infected with SARS-CoV-2 could be calculated without needing to specify an intercept.

From our final model, we created the SARS-CoV-2 Risk Prediction Score (SCRiPS) by multiplying, for each factor, the value-coefficient product by 10 and rounding to a whole number. An individual’s SCRiPS was calculated by summing up the points associated with the value for each of their factors.

Model Validation

We used 1000 bootstrap samples (with replacement) to evaluate the model. In each bootstrap sample, we measured SCRiPS discrimination (using the c-statistic) and calibration. The latter was assessed in each bootstrap sample using three methods: (i) calibration in the large (observed and expected proportion of SARS-CoV-2 infection was compared in the entire study sample); (ii) calibration slope and intercept (the study sample was separated into deciles based on the expected risk and a linear model regressed observed against expected SARS-CoV-2 infection risk in each group); and (iii) the integrated calibration index12 (which measures the absolute difference between the calibration curve and the diagonal line of perfect calibration). The latter was summarized into a calibration curve.

Finally, we measured the potential impact of using SCRiPS to screen for SARS-CoV-2 testing. We first used the recent case detection rate as a testing threshold for the SCRiPS-determined expected infection risk (since thresholds equal to the case prevalence maximize a predictive model’s impact).13 Within each bootstrap sample, we calculated model sensitivity and negative likelihood ratio. We compared the net benefit of using the SCRiPS testing criteria to that from testing all people and calculated the net benefit increase (the difference between these testing approaches in the proportion of true positives and false positives, with the latter weighted by the testing threshold expressed as an odds term).13 We also calculated the change in the number of identified infections if tests that would have been applied to people not meeting testing criteria were instead used on those meeting testing criteria. These calculations were repeated using a more conservative threshold of one-half of the recent case detection rate. We used the percentile method to generate point estimates and 95% confidence intervals for all fit statistics.14 We reported our methods and results using recommendations from TRIPOD15 (Appendix C).

RESULTS

Of all 9611 tests conducted up to 21 April 2020, 9585 (99.7%) had final results reported at the time we created our analytical dataset (24 April 2020). 413 (4.3%) of these tests were on people with previous testing and were excluded leaving 9172 people for model derivation. A total of 571 tests indicated SARS-CoV-2 infection for an overall infection rate of 6.2%. However, this risk changed notably over time with a peak risk of 15.4% in early April (Fig. 1). Infected people were slightly older and were less likely to be female or a healthcare worker (Table 1). Infected people were also more likely to have contacted a known or suspected COVID-19 case. Recent case detection rates were slightly higher in SARS-CoV-2 positive patients. Rhinorrhea was less common in the infected but fever symptoms were much more prevalent. Chest symptoms and travel history did not differ notably by SARS-CoV-2 status.

Fig. 1
figure 1

Probability of SARS-CoV-2 detection during study period. This graph presents the probability of SARS-CoV-2 infection (vertical axis) over time (horizontal axis) at the study COVID-19 testing centre. The blue line indicates the daily percentage of tests positive for SARS-CoV-2. The red line indicates the proportion of tests positive for SARS-CoV-2 for the three previous days (“recent case detection rate”). The latter value was used in the SCRiPS model.

Table 1 Description of Study Cohort (n = 9172)

The final model included ten covariables and one interaction term (Appendix B). SARS-Cov-2 likelihood increased most notably by contact with known or suspected COVID-19 case (48.3% of total model χ2) and the presence of fever or symptoms thereof (16.6% of total model χ2). Other factors increasing the likelihood of SARS-CoV-2 infection included male gender (adjusted odds ratio [adj-OR] 1.28 [95%CI 1.07–1.54]), non-healthcare worker status (adj-OR 1.83 [1.49–2.25]), having a throat swab rather than nasopharyngeal sample (adj-OR 1.36 [1.09–1.69]), and an absence of rhinnorhea (adj-OR 1.36 [1.13–1.64]). Recent case detection rate also importantly influenced infection likelihood (9.2% of total model χ2). Lung symptoms, age, and the interaction of age and fever had less influence on the likelihood of SARS-CoV-2 infection.

The model was modified to the SARS-CoV-2 Risk Prediction Score (SCRiPS)(Table 2). SCRiPS values for each covariable reflect the extent that term influenced the likelihood of SARS-CoV-2 infection (Table 2A) with higher values indicating an increased influence on infection probability. Of categorical factors, contact with known or suspected COVID-19 case was most influential (12 points). The scoring system illustrates that SARS-CoV-2 likelihood increased with recent case detection rates. When fever symptoms were present, the likelihood of SARS-CoV-2 increased strongly as people aged. If fever symptoms were absent, this likelihood did not change consistently as patients aged.

Table 2 SARS-CoV-2 Risk Prediction Score (SCRiPS). ACalculation of SCRiPS. BExpected Probability of SARS-CoV-2 Infection by SCRiPS

Values for each model term were summed to calculate an individual person’s SCRiPS (mean value 34.8 [standard deviation 8.6]; range 9–70). A 1-unit increase in SCRiPS increased the odds of infection by 11.3% (odds ratio 1.11, 95%CI 1.10–1.12). In 1000 bootstrap samples, SCRiPS was moderately discriminative (c-statistic 0.736, 95%CI 0.715–0.757). SCRiPS-derived expected probabilities of infection were well calibrated to observed probabilities: difference between overall observed and expected infection probability (i.e., “calibration in the large”) did not exclude zero (0.005%, 95%CI − 0.46–0.50%); the intercept of the calibration slope also included zero (− 0.0016, 95%CI − 0.008–0.0052) and the slope did not differ significantly from 1 (1.025, 95%CI 0.895–1.157); and the integrated calibration index also indicated very good calibration (0.0083, 95%CI 0.0048–0.0131). Calibration plots indicated that SCRiPS-based infection risk did not significantly differ from expected risk except in the 10–20% range when SCRiPS slightly underestimated infection risk (Fig. 2).

Fig. 2
figure 2

Observed and expected probability of SARS-CoV-2 infection in bootstrap validation. In 1000 bootstrap samples, the observed proportion of tests that were positive for SARS-CoV-2 (vertical axis) were plotted against the expected proportion based on the SCRiPS (horizontal axis). The heavy black line is median LOESS regression value; it is flanked by 95% confidence interval (gray, dotted lines). The dashed diagonal line represents perfect agreement between observed and expected probabilities.

Using the SCRiPS-based infection risk had a potentially large impact on test utilization and case identification (Table 3). Limiting testing to those whose SCRiPS-based infection risk equaled or exceeded the recent case detection rate returned a sensitivity of 68.8% and negative likelihood ratio of 0.524 (Table 3A). This significantly improved test utilization compared to testing all people (net benefit increase 0.015, 95%CI 0.012–0.024). If tests in people not meeting SCRiPS testing criteria were instead redirected to those who did, the number of identified SARS-CoV-2 cases would increase 63.1% (95%CI 54.5%–72.3%). Limiting testing to those whose SCRiPS-based infection risk equaled or exceeded 50% of recent case detection rates returned a sensitivity of 90.0%, a significant net benefit increase (0.012, 95%CI 0.010–0.018), and an increase in the number of identified SARS-CoV-2 cases of 27.6% (23.3–30.7%).

Table 3 Potential Impact of SCRiPS on SARS-CoV-2 Test Utilization and Case Identification

DISCUSSION

This study derived and assessed a model that accurately estimated the probability of SARS-CoV-2 infection. This expected probability can be used to importantly increase SARS-CoV-2 infection identification when testing capacity is limited.

Accurate case finding is key to resolving the current COVID-19 epidemic. However, significant limitations in testing capability exist in Canada and elsewhere due to a lack of analyzers, swabs, transport media, laboratory reagents, trained staff, or overwhelming demand. Since many healthcare regions could continue to have testing limitation for many months, rationing of some form may be necessary. In such situations, preferentially testing those having the highest infection likelihood will maximize case identification (Table 3).

We found that the likelihood of SARS-CoV-2 infection in community-based people can be accurately estimated using relatively easily attained information. Expected SARS-CoV-2 infection probabilities have two potential applications. First, they would permit healthcare workers to accurately risk-stratify people regarding infection likelihood. This would improve utilization of potentially scarce testing resources. For example, we found that limiting testing to those with an expected infection probability equal to or exceeding the recent case detection rates increased case identification by 63% (Table 3A). Second, the accuracy of SARS-CoV-2 testing is uncertain. SCRiPS could help clinicians reliably process testing results by incorporating infection likelihood based on SCRiPS. For example, a negative test result in a person deemed at high-risk of infection due to an elevated SCRiPS would suggest a false-negative test result and prompt retesting.

Based on PROBAST criteria, we would rate our model as having a low risk of bias16: we used an appropriate and complete patient population; all predictors were available and determined prior to test results; the model outcome was clearly defined and determined independent of all predictors; and our analysis included a reasonable number of participants, handled covariables appropriately, avoided variable selection based on univariable testing, included extensive model assessment measures, and avoided overfitting given our lack of variable selection and large sample size. However, several issues should be kept in mind when the SCRiPS is used. First, our model was derived on a single community-based sample that predominantly met basic COVID-19 screening criteria. Its applicability to other communities and in hospitalized patients or people not meeting the testing criteria used at the screening clinic needs examination. Second, several SCRiPS components were based on historical information and were not explicitly defined a priori or applied by the screening personnel in a standardized fashion. This could have limited the model’s performance. Third, the model requires recent case detection rates for SARS-CoV-2. To accurately use SCRiPS, centers will require quick laboratory turn-around times as well as the capacity to analyze these data quickly. Fourth, it is noteworthy that the SCRiPS performed well in a bootstrap sample. Further evaluation of the SCRiPS model in other populations will be needed to truly evaluate its utility. Fifth, we found that healthcare workers were significantly less likely to test positive for SARS-CoV-2 infection (Table 2A, Appendix B). We do not believe that this has any biological basis (i.e., healthcare workers are not less likely to become infected with SARS-CoV-2). Instead, this likely reflects our screening clinic’s more inclusive testing criteria for healthcare workers or they having a lower threshold to seek testing. This highlights that SARS-CoV-2 infection likelihood, and therefore, components of its predictive model can be significantly influenced by testing behavior. Sixth, the SCRiPS model highlights the importance of disease prevalence when estimating infection likelihood (Table 2A). Inaccurate estimates for this parameter could bias probability estimates from SCRiPS. Seventh, the outcome for our model—i.e., results from a standard test for SARS-CoV-2—is clinically relevant but could have some limitations. While the testing methodology used for our study returned results almost identical to those from reference laboratories, our test might have missed people who were truly infected because viral shedding kinetics varies with disease severity and sampling time from symptom onset. Finally, while the SCRiPS appears to be an effective model, further research will be needed to determine how its utility could be enhanced. This might be achieved by incorporating additional historical data identified by Menni et al.8 to be strongly associated with infection likelihood including anosmia and anorexia or incorporating symptom duration or severity.

In summary, SCRiPS accurately estimates the likelihood of SARS-CoV-2 positivity in community-based people. This model could be used to risk-stratify people being screened for infection and improve COVID-19 identification.