Introduction

Following exposure and recovery from SARS-CoV-2 infection, convalescent patients develop antigen specific adaptive T- and B-cell immune responses including binding and neutralizing antibodies (Ab). The antibody levels, in response to SARTS-CoV-2, rise in the first 2 weeks, with serum levels of SARS-CoV-2 specific IgM rising within the first week after infection and about a week later for IgG with the seroconversion times around 10–14 days respectively1,2. Although it is not perfectly clear how long a protective antibody response will typically last, it has been reported that the antibody titers seem to wane down as soon as 12 weeks after symptom onset reaching with baseline levels after 52 weeks1.

The genome of SARS-CoV-2, is composed of a single strand RNA that expresses structural and non structural proteins. As shown on previous studies, the structural proteins Spike (S) and nucleocapsid (NP) are able to elicit specific antibody production and are often targets for serological diagnostic tests3. The S protein is composed of two subunits (S1 and S2) cleaved during the infection as a necessary step for infection. The receptor binding domain (RBD) is a critical domain of the S1 subunit, responsible for binding to the host cell ACE receptor and is the main target for neutralizing antibodies4.

These immune responses may prevent reinfection or blunt the clinical consequences of future infectious exposures to the virus. Administration of COVID-19 convalescent plasma (CCP) from recovered patients is being employed for therapeutic use based on the belief that factors, including neutralizing Ab against SARS-CoV-2, present in the plasma may inhibit virus replication and improve patient outcomes. As the use of CCP evolved, it became apparent that not all CCP contained effective neutralizing efficacy, and anti-SARS CoV2 activity correlated with clinical outcomes5,6.

Numerous clinical trials are underway aimed at understanding how best to administer CCP and to test the hypothesis that it is an effective treatment. A significant clinical trial involving more than 35,000 CCP treated COVID-19 patients has shown an efficacy signal from plasma with elevated Ab levels in patients treated within a few days after symptom onset7. Previous efforts to use convalescent plasma for treatment of infectious diseases without knowledge of activity8 have produced mixed results, possibly associated with differences in the antibody profile from different donors. Differences in the breadth and level of SARS-CoV-2-specific Ab in CCP may correlate with therapeutic efficacy. Consequently, we investigated the performance of a coronavirus antigen microarray (COVAM) in order to characterize and classify Ab reactivity in CCP to optimize therapy before it is administered to patients.

COVAM is a multiplex assay platform for high-throughput serological studies. The microarrays are produced by printing validated and purified recombinant antigens on nitrocellulose-coated slides. The COVAM has 11 SARS-CoV-2 antigens, 5 SARS-1, 5 MERS and 12 seasonal coronavirus antigens, as well as 35 antigens from 5 other viruses that cause acute respiratory infections9,10. A complete list of the COVAM antigens can be found in Supplementary Table 1. The arrays are designed to determine the Ab profile in serum or plasma samples to confirm prior exposure following suspected infection with the viruses and to monitor Ab changes over time. The arrays are designed for high throughput and low-cost testing and hence are suitable for epidemiology studies as a serologic surveillance tool to determine the prevalence and levels of Abs indicating viral exposure within individuals over the course of an epidemic; and to monitor post-vaccination immune response to guide recipient risk behavior.

Here we probed the COVAM with CCP collected from 99 donors between 4/18/2020 and 5/6/2020 from 8 regions across the US. Although the donors were all recovered SARS-CoV-2 PCR+ COVID-19 cases, their Ab response profiles against 11 SARS-CoV-2 fell into distinct groups with different Ab levels and breadth of the responses particularly to 4 SARS-CoV-2 antigens. If these classification groups correlate with CCP efficacy, a SARS-CoV-2 antigen microarray may be useful to qualify and select CCP for treatment of acute COVID-19.

Results

COVAM, RVPNT and VITROS CoV2T chemiluminescent immunoassay assay concordance

The COVAM has 11 SARS-CoV-2 antigens, 5 SARS-1, 5 MERS and 12 seasonal coronavirus antigens. The complete list of the COVAM antigens, in the order displayed in the figures, is available in Supplementary Table 1. We probed COVAM with CCP collected from 99 US PCR confirmed COVID-19 plasma donors who recovered from the infection 1–2 months prior to CCP collection. The multiplex COVAM results were analyzed using two computational approaches, either a generalized linear model (glm) or random forest (RF) to classify individual specimens as either reactive or non-reactive against the SARS-CoV-2 antigens. The binary prediction results were compared with the FDA EUA VITROS CoV2T chemiluminescence immunoassay from Ortho (reactive: S/C ≥ 1.0) and with an RVPNT assay (reactive: neutralization titer with > 5) inhibition of infection [NT50] > 40) developed at Vitalant and tabulated in Table 1.

Table 1 Reactivity classification of the CCP samples.

Compared to the VITROS CoV2T assay, there is 93%, and 91% concordance with the RVPNT, and COVAM predictions, respectively. Three specimens that were negative by the VITROS CoV2T assay were also negative by RVPNT and two of these for COVAM assays. Ninety percent of the specimens are completely concordant among all comparisons, including 88 seropositive by all tests and two seronegative by all tests. All of the discordances are found in ten samples that showed low positive titer on the VITROS CoV2T immunoassay. Among these, eight were considered non-reactive on the RVPNT test and three considered non-reactive on the COVAM prediction (using a 60% RF probability cutoff).

The COVAM data are summarized in the heatmap in Fig. 1. There are 99 specimen IDs along the x-axis and 65 antigens along the y-axis (Supplementary Table 1). The antigens are grouped according to the virus from which they are expressed. The top 11 antigens are from SARS-CoV-2 and the quantitative COVAM IgG Ab results from these antigens were used to cluster the specimens into 3 groups as shown in the dendrogram on top of the heatmap (Fig. 1).

Figure 1
figure 1

IgG COVAM reactivity heatmaps of 99 sera from recovered coronavirus convalescent cases. The heatmaps show the IgG reactivity levels of patients to antigens printed onto the COVAM array. Each column is the representation of a specific specimen while each row is represented as the mean reactivity of 4 replicates of each antigen. The COVAM array shows clear IgG reactivity to SARS-CoV-2 antigens that are clustered into three groups. A complete list of the antigens, in the order displayed, is available in the Supplemental Table 1. This figure was generated using the R programing language, version 4.0.2 (https://www.r-project.org/).

The normalized signal intensity on the COVAM for each antigen was compared with either the VITROS CoV2T S/C levels or the RVPNT NT50 titers. The antibody reactivity to SARS-CoV-2 S1 and RBD containing antigens present on the COVAM correlates with the VITROS CoV2T or the RVPNT titers. Most notably, the SARS.CoV.2.S1.HisTag antigen correlates to the VITROS CoV2T S/Cs and to the RVPNT NT50 titers with Spearman’s r = 0.65 and r = 0.655, respectively (Fig. 2a). The results of correlation analysis for the S1 Spike antigen in the COVAM array, are shown in Fig. 2. A summary of Spearman r values for each of the 11 SARS-CoV-2 COVAM are plotted in Fig. 2b. Overall, antigens that contain the RBD domain and lack the S2 domain show a high correlation between the assays, and the COVAM reactivity against the nucleocapsid protein NP and the S2 domain of spike show low correlation with the VITROS S1 based CoV2T or the RVPNT assays. This observation is not surprising due to S1 specificity of the VITROS CoV2T assay and the Ab-mediated neutralization of virus entry which is primarily based on blocking interactions between the RBD domain of S1 and ACE-2 receptors expressed on target cells in the assay.

Figure 2
figure 2

Correlation of the RVPNT assay NT50 values or VITROS CoV2T assay S/C values with the COVAM reactivity (RF values) to SARS-CoV-2 antigens. (A) Scatterplots representative of the correlation between the vsn normalized COVAM reactivities and the RVPNT or VITROS CoV2T assays to the S1 (HisTag) antigen. (B) The Spearman correlation (r) of the vsn normalized COVAM reactivities and RVPNT or VITROS CoV2T Assay to the COVAM SARS-CoV-2 antigens. This figure was generated using the R programing language, version 4.0.2 (https://www.r-project.org/).

CCP specimens cluster into 3 distinct groups according to Ab levels against SARS-CoV-2 antigens

Although concordance was observed in the binary predictions of the three assays, the multiplex COVAM assay allows for a multivariant cluster analysis metric to classify CCP into different groups based on their reactivity patterns against several antigens. The VITROS CoV2T or RVPNT assays produce a single value for each specimen and a binary (reactive or non-reactive) result when that value is above or below a reactivity threshold, with quantitation of Ab intensity above the S/C = 1.0 and NT50 = 40 categorical thresholds. The COVAM interrogates IgG levels against 11 SARS-CoV-2 antigens and uses a computational algorithm to determine a binary outcome by comparing the results to the training set of COVID positive and negative controls.

Unsupervised cluster analysis can sort the specimens according COVAM antigen specific IgG reactivity patterns into at least three distinct groups based on the level of antibodies against a collection of NP, Spike, S1, S2 and RBD antigens (Fig. 1). Group 1 specimens have high Ab levels against many of the SARS-CoV-2 antigens, most markedly high reactivity to nucleocapsid protein (NP), S1 domain of spike protein, and the full spike S1+S2 protein. Group 3 specimens have very low antibody levels to NP and S1 compared to the Group 1 specimens. The Group 2 specimens show an intermediate reactivity level against spike antigens. The horizontal bars on top of the heatmap indicate reactive (red) and non-reactive (blue) predictions for each specimen for the VITROS CoV2T, RVPNT and COVAM assays. All of the predicted non-reactive specimens from all three assays are in group 3. All of the group 1 and 2 specimens are reactive and concordant by all three assays.

The results in Fig. 3a show the IgG heatmap clustered (k-means), using the reactivity of the SARS-CoV-2 antigens. These can also be visualized in the PCA analysis (Fig. 3c) which shows 3 distinct groups. The Ab levels against each antigen for each group is shown on Fig. 3b. The heatmap and its dendrogram (Fig. 3) indicate that the specimens in each group can be further split into subgroups. Figure 4 separates Group 1 specimens into 2 subgroups (1.1 and 1.2) evident from the heatmap (Fig. 4a) and confirmed by the PCA (Fig. 4c). The bar graph (Fig. 4b) highlights a significant difference between subgroups 1.1 and 1.2 in reactivity against antigen S2 which is elevated in Group 1.2. Figure 5 separates Group 2 specimens into 2 subgroups (2.1 and 2.2) evident from the heatmap (Fig. 5a) and confirmed by PCA (Fig. 5c). The bar graph (Fig. 5b) highlights a significant difference between subgroups 2.1 and 2.2 against the nucleocapsid protein and S2 which are both elevated in Group 2.2). Figure 6 separates Group 3 specimens into two groups (3.1 and 3.2) evident from the heatmap (Fig. 6a) and bar graph (Fig. 6b) and confirmed by PCA (Fig. 6c). From the bar graph and heat map it is evident that the reactivity of group 3 specimens is less than that of the other two groups, but the subgroup 3.2 is particularly lacking in antibodies to SARS-CoV-2 antigens.

Figure 3
figure 3

Clustering analysis of the IgG reactivity to the COVAM SARS-CoV-2 antigens. On (A), heatmap showing the reactivity to the SARS-CoV-2 antigens. Samples were clustered using the hierarchical Clustering analysis. The dendrogram was cut (and color coded) to a final cluster number equal 3. On (B), a bar graph showing the mean reactivity and the standard error of each cluster to each individual SARS-CoV-2 antigen. On (C), principle component analysis (PCA) showing the spatial distribution of the samples for the first and second principal components that explain, combined, over 74% of the variance. This figure was generated using the R programing language, version 4.0.2 (https://www.r-project.org/).

Figure 4
figure 4

Group 1 Cluster Analysis and PCA demonstrates two subgroups. (A) Heatmap showing the reactivity to the SARS-CoV-2 antigens. Samples were clustered using the hierarchical clustering analysis. The dendrogram was cut (and color coded) to a final cluster number equal 3. On (B), a bar graph showing the mean reactivity and the standard error of each cluster to each individual SARS-CoV-2 antigen. (C) PCA analysis showing the spatial distribution of the samples classified as 1.1 and 1.2 for the first and second principal components. This figure was generated using the R programing language, version 4.0.2 (https://www.r-project.org/).

Figure 5
figure 5

Group 2 Cluster Analysis and PCA demonstrates two subgroups. (A) Heatmap showing the reactivity to the SARS-CoV-2 antigens. Samples were clustered using the hierarchical clustering analysis. The dendrogram was cut (and color coded) to a final cluster number equal 3. (B) A bar graph showing the mean reactivity and the standard error of each cluster to each individual SARS-CoV-2 antigen. (C) PCA analysis showing the spatial distribution of the samples categorized as subgroups 2.1 and 2.2 for the first and second principal components. This figure was generated using the R programing language, version 4.0.2 (https://www.r-project.org/).

Figure 6
figure 6

Group 3 Cluster Analysis and PCA demonstrates two subgroups. (A) Heatmap showing the reactivity to the SARS-CoV-2 antigens. Samples were clustered using the hierarchical clustering analysis. The dendrogram was cut (and color coded) to a final cluster number equal 3. (B) A bar graph showing the mean reactivity and the standard error of each cluster to each individual SARS-CoV-2 antigen. (C) PCA analysis showing the spatial distribution of the samples categorized as subgroups 3.1 and 3.2 for the first and second principal components. This figure was generated using the R programing language, version 4.0.2 (https://www.r-project.org/).

Multiplex antigen classification of CCP

The efficacy of CCP may vary depending on the donor, and there are different ways to classify the plasma before it is administered. One criterion is whether the donor was a PCR+ confirmed case. Factors such as clinical disease severity and the time from symptom onset or recovery to collection of CCP units may contribute to the classification. Antibody level is a measurement that can be used to qualify CCP and antibodies against different antigen targets including the total spike S1/S2, S1, S2, RBD and NP proteins can be considered. Virus neutralization assay titers are another metric and pseudoviruses have been developed for this purpose to make it more convenient than live SARS-CoV-2 virus neutralization assays for routine use. In order to use these quantitative measurements to qualify CCP for clinical administration, a threshold can be established, to qualify a unit of plasma as acceptable for clinical use or rejected.

A multiplex antigen classification of CCP can also be considered as a criterion to qualify and accept or reject donor plasma for transfusion to COVID-19 patients. For example, all of the highly reactive COVAM Group 1 specimens (Fig. 3) could be accepted and all of the low reactive Group 3 specimens rejected. Or subsets of any of the groups (Figs. 4, 5, 6) could be accepted or rejected based on knowledge of the clinical efficacy of each major group or subgroup.

The results plotted in Fig. 7 show how multiplex antigen classification differs from criteria based on changing the binary threshold of a single assay. In Fig. 7A the RVPN titer threshold was moved from the canonical standard 40, to 160, 320 and 640. Group 3 specimens increasingly fall below the cutoff as the threshold increases. Group 2 specimens are mostly above threshold at the 160 threshold and increasingly fall below the threshold as it is increased. Interestingly, all but one of the Group 1 specimens are below the highest RVPNT threshold of 640.

Figure 7
figure 7

SARS-CoV-2 Reactivity Classification. In both (A) (comparison to RVPNT NT50 thresholds) and (B) (comparison to VITROS CoV2T S/C thresholds), the bars across the top represent the classification for each given sample. The color red represents samples classified as reactive and the color blue, samples classified as non-reactive. The top 2 bars represent the COVAM reactivity predictions with the top bars the prediction based on the random forest model and the second bars the prediction from the logistic regression model. The bar graphs represent the % of samples classified as reactive on each cutoff value. This figure was generated using the R programing language, version 4.0.2 (https://www.r-project.org/) and MICROSOFT Excel version 2016, www.microsoft.com).

The VITROS CoV2T assay threshold can also be adjusted to select the specimens with higher signals (Fig. 7B). Moving the S/C threshold from 1 to 10 preferentially removes a high proportion of the low-level reactive Group 3 specimens. At the highest threshold of 150 all of the Group 3 specimens are removed while 40% and 30% of the Group 1 and 2 specimens, respectively, are above the threshold.

These assay-dependent and threshold-dependent differences in the classification of plasma specimens is further evident from the results in Fig. 8. Here the group 1, 2 and 3 specimens were separately sorted by the RVPNT titer (Fig. 8A). The Titer Cutoff lines on Fig. 8A shows how most of the Group 1 samples are above the threshold even at the highest 640 titer. The Group 2 and Group 3 specimen classifications are affected more by increasing the Cutoff titer. Similarly, the VITROS CoV2T data is plotted in Fig. 8B and three different assay cutoff values are shown. Random forest probabilities (for classification as “Reactive”) are plotted in Fig. 8C and three different assay and 0.6 (60%) as well as 0.8 (80%) cutoff values are shown as horizontal lines. On Fig. 8C, it is evident that the random forest prediction is capable of correctly classifying all samples from the groups 1 and 2 (higher and medium reactivity levels, however, as the reactivity level falls (Group 3) the random forest prediction, as well as the other methods, loses sensitivity. It is evident that the VITROS CoV2T intensity is not well correlated to the RVPNT titer, and scatterplots (not show) give R2 = 0.037. Increasing the VITROS CoV2T threshold cuts through all the groups. These results indicate that RVPNT titer, VITROS CoV2T titer, and the COVAM multiplex cluster analysis patterns select different collections of plasma specimens.

Figure 8
figure 8

SARS-CoV-2 RVPNT and VITROS-CoV2T titers. In both (A) (RVPNT NT50 titers) and (B) (VITROS CoV2T S/C titers), the samples were sorted by their cluster classification and the RVPNT titers. The red, horizontal lines represent different cutoffs being, for RVPNT 1:40; 1:160; 1:320 and 1:640; for VITROS CoV2T S/C > 1, 10, 50 and 150 and for the random forest prediction probability 0.6 and 0.8. This figure was generated using the R programing language, version 4.0.2 (https://www.r-project.org/).

Discussion

Passive antibody transfer using convalescent plasma has previously been used to treat infectious diseases that involve the respiratory system including influenza11,12. Prior experience in epidemics with convalescent plasma (CP) containing antibodies to viruses has demonstrated variable indications of therapeutic efficacy against Influenza, Argentine Hemorrhagic Fever, and SARS. The variation in efficacy was impacted by lack of ability to assess antibody properties13,14. Characterizing antibody titers in plasma to viruses has indicated a correlation with therapeutic efficacy. An expanded access protocol (EAP) clinical trial led by the Mayo Clinic has been completed resulting in > 65,000 units of apheresis plasma collected from convalescent COVID-19 patients and treating > 35,000 acutely infected COVID-19 in-patients. Anti-SARS-CoV-2 antibody levels in donor plasma were variable, and a small statistically significant (p < 0.05) improvement in the clinical outcome was reported in a subset of patients who were treated early after symptom onset and received the plasmas with higher Ab titers compared to patients who received plasma later in the disease course. The results indicated that the SARS-CoV-2 specific Ab content of CCP may be associated with improved clinical outcomes in the appropriate patient subset. Definitive safety and efficacy of CCP for the treatment of COVID-19 is awaiting completion of placebo controlled, prospective and randomized clinical trials (RCTs).

Clinical serodiagnostic tests usually measure antibody levels against a single antigen or epitope and a single Ig isotype. The output of current EUA SARS-CoV-2 Ab assays are a binary positive/negative, above or below a threshold with or without reporting intensity of reactivity levels above the threshold. The assay cutoff threshold is determined by probing a well-characterized collection of positive and negative control samples. Some tests of this type mix multiple antigens together and use secondary Abs that recognize multiple isotypes to produce an aggregate value with a binary positive/negative result.

Methodologies like COVAM that quantitatively interrogate multiple specific signal intensity levels independently to classify specimens were called “chemometrics” in the pre-genomic period. Modern multivariant statistical approaches that interrogate and classify a “fingerprint” of individual genomic measurements have been developed to classify genome sequence and expression data, and these computational methods can also be applied to clinical diagnostics. The test results from the COVAM assay described here take advantage of multivariant machine learning, pattern recognition, and clustering to classify plasma samples that have many antibodies at different levels against numerous SARS-CoV-2 antigens.

The COVAM was constructed to independently measure immunoglobulin isotype and subtype levels in serum or plasma samples against each of the individual antigens printed on the array. We use two computational approaches, either a generalized linear model (glm) or a random forest (RF) prediction model to classify individual specimens as either reactive or non-reactive against the SARS-CoV-2 antigens. We use a training set of 88 PreCoV specimens collected in July–September 2019 and positive specimens from PCR+ confirmed COVID-19 cases. We evaluated 99 coded CCP plasma samples and produced binary positive/negative results that were 96% concordant with the FDA EUA Ortho VITROS CoV2T immunoassay and 93% with the RVPNT assay. All of the discordances were weak responders around the threshold of the comparators.

Although the binary classification of COVAM, VITROS CoV2T and RVPNT assays are highly concordant, multivariant COVAM analysis of quantitative results reveals distinct differences among the 99 CCP specimens analyzed in this study. Unsupervised K-means analysis, as well as hierarchical clustering reveal three main clusters with distinct reactivity intensities and patterns. The dendrogram analysis reveals at least two additional subgroups within each of the major groups that separate based on reactivity patterns against the antigens on the array. We could not recapitulate these group classifications by simply moving the reactivity thresholds of either the VITROS CoV2T or RVPNT assays.

The variable reactivity patterns between individuals observed in this study is suggestive of polyclonal immune responses with different antibody specificities which should be examined in light of clinical outcomes in recipients of CCP transfusions; if predictive, the COVAM assay could be employed to qualify CCP for administration as a treatment for acute COVID. An analysis to identify antibody patterns associated with treatment efficacy can be done retrospectively on aliquots from plasma that have already been used for treatment in ongoing clinical trials of CCP and vaccines.

In summary, the COVAM antigen array have multiple applications that allow for the interrogation of the antibody response toSARS-CoV-2 and is a strategic tool that can be used to characterize the antibody profile to qualify CCP samples as well as immunogenicity and performance of vaccine preclinical and clinical studies, evaluate exposure, study vaccine breakthrough infections as well as guide recipient social behavior, and the need for additional immunizations.

Methods

Specimen testing on coronavirus antigen microarray

COVID-19 convalescent plasma (CCP) were collected by Vitalant April 18 through May 5 2020. Donors were qualified following FDA guidance. Evidence of COVID-19 was required in the form of a documented positive SARS-CoV-2 molecular or serologic test, and either complete resolution of symptoms at least 28 days prior to donation or complete resolution 14 days prior to donation with negative molecular test result. Donors were required to meet traditional allogeneic blood donor criteria per the Code of Federal Regulations (21 CFR 630.10 and 630.15). At time of plasma collection, donors consented to use of de-identified donor information and test results for research purposes. Apheresis plasma collection was performed with the Alyx (Fresenius Kabi USA LLC, Lake Zurich, IL, USA) and Trima (Terumo BCT, Lakewood, CO, USA) apheresis systems following manufacturers’ recommendations. Donor serum was prepared at the time of donation, frozen < − 18 °C, and maintained in the sample repository. Prospective CCP release criteria using the spike specific Ortho VITROS CoV2T SARS-CoV-2 antibody detection assay was implemented after 5/13/2020. Serum samples for this study were selected from the central Vitalant Research Institute’s repository without regard to any previous test results, donor medical history, collection location or other criteria representing collections from 7 regional collection sites across the United States.

The SARS-CoV-2 positive controls and negative controls are composed by 102 specimens collected from PCR positive individuals, collected from different sources and 88 specimens collected from healthy individuals prior to November 2019, before the beginning of the pandemic, as future research use authorized specimens from blood banks. A complete description of the control samples (here referred as reference or training set) can be found at Assis et al.15.

The COVAM slides9,10 were probed with human plasma and analyzed as described elsewhere16,17,18. Briefly, the specimens were diluted 1:100 in 1X Protein Array Blocking Buffer (GVS Life Sciences, Sanford, ME), transferred to the slides and incubated overnight at 4 °C. The slides were then washed 3 times for 5 min each with t-TBS buffer (20 mM Tris–HCl, 150 mM NaCl, 0.05% Tween-20 in ddH2O adjusted to pH 7.5 and filtered) at room temperature (RT). Then, anti-human IgG and anti-human IgA Secondary Ab were added to each pad at a dilution of 1:100 (in Protein Array blocking Buffer) and incubated for 2 h at RT under agitation. Pads were then washed with t-TBS 3 times for 5 min each and dried. The slides were imaged using ArrayCam imager (Grace Bio-Labs, Bend, OR, USA). In order to measure non-specific binding of the secondary Ab, pads were incubated without the previous addition of human sera. The data acquisition and spot quantification were performed using the Scan ArrayExpress (V 3.0, PerkinElmer) software16,17,18.

Specimen sources and prior characterization

All CCP subjects have consented to the use of blood donor Information, blood, and blood samples in research. This was determined as exempt by ADVARRA, Columbia, MD. samples were collected in accordance to the January 19, 2017 Final Common Rule, Federal Policy for the Protection of Human Subjects. All methods and protocols were approved by the University of California Irvine Human Research Protections (IRB/HRP). Informed consent was obtained from all participants and all samples were unidentified.

In order to build the prediction models, available data from samples with known exposure status to SARS-CoV-2 was used as a training set10,19. This data set was built from 88 PreCoV specimens derived from frozen plasma components collected by Vitalant in July–September 2019, and 102 specimens from PCR confirmed cases was used for these analyses. These confirmed positive cases include 45 serum samples from the University of California Irvine collected between 7 and 25 days after the symptom onset; 30 samples from the University of California San Francisco collected between 2 and 38 days after the symptoms onset; 13 samples from Basel collected between 13 and 50 days after the symptoms onset and 14 samples from Ortho Clinical Diagnostics collected between 7 and 22 days after the symptoms onset. The CCP samples were a collection of 99 SARS-CoV-2 convalescent plasma samples from the Vitalant system collected between 4/18/2020 and 5/6/2020 from 8 regions across the US.

Neutralization titers were measured as 50% neutralization (NT-50) by endpoint titration using a recombinant viral-particle neutralization test (RVPNT) and a cutoff equal to 1:40. The VITROS CoV2T chemiluminescent immunoassay assay was performed using either serum or plasma samples from Vitalant Research Institute (San Francisco, CA, USA) according to the manufacturer instructions10. The test targets the spike protein and applies a predefined threshold value of 1.0 signal-to-cutoff (S/C) for IgG seropositivity but has a broad dynamic range with S/C values as high as 1000.

Data analysis and normalization

All data analysis was performed and figures generated in the R programing environment (Version 4.0.2, https://www.r-project.org/). Figure 7 was generated both using the R programing environment and MICROSOFT EXCEL (Version 2016, https://www.microsoft.com).

After data acquisition, data normalization was performed by the Quantile Normalization method using the “normalize.quantiles.use.target” function of the “preprocessCore” package (Version 1.50.0). As reference for normalization, a collection of known positive and known negative samples (training set comprised of serial samples from recovered COVID-19 patients and plasma collected in the July–September 2019, respectively) was used.

For vsn normalization, first a model was created with the training set using the function “vsn” then, this model was used to normalize the CCP samples (package vsn version 3.56.0).

Cluster analysis

To investigate the different reactivity profiles, the data were clustered and divided based on the reactivity to the 11 SARS-CoV-2 antigens. For the clustering, first the optimal number of clusters was estimated by the gap statistics on the K-means clustering analysis (“fviz_nbclust” function from the factoextra package (version 1.0.7). Then, a Hierarchical Clustering analysis (“hclust” function, method “ward.D2”, from the “stats” package, version 4.0.2) was performed and the dendrogram cut in order to obtain a number of cluster defined by the previous calculated gap statistics. The assigned groups were obtained by cutting the dendrogram to obtain K = n groups, where n is the optimal number of clusters obtained from the gap statistics analysis.

Principal component analysis (PCA) was also performed (“prcomp” function from the “stats” package version 4.0.0 as well as the “fviz_nbclust” function from the “factoextra” package, version 1.0.7).

Reactivity classification

With goal of predicting exposure to SARS-CoV-2, the samples were first classified based on their overall reactivity profile. The prediction was performed using two main computational methods, a logistic regression model and a random forest prediction model.

For the logistic regression, a generalized linear model (“logit” family) was first generated (“glm” function from the “stats” package version 4.0.2) using the training set of. Then a receiver operating characteristic curve (ROC) was generated in order to obtain all the curve coordinates. This allows estimation of the specificity and sensitivity for each point of the curve and therefore defines an optimal cutoff point of the regression analysis fitted values. For the predictions described in this work, the model was built using four antigens: SARS-CoV-2 S1, SARS-CoV-2 S1.HisTag, SARS-CoV-2.S1.RBD, SARS.CoV.2.S1+S2. The test samples are then submitted to the logistic regression analysis and the fitted values compared to the defined cutoff for classification (positive or negative).

For the random forest analysis, like the logistic regression method, a model fit is generated (random forest version 4.6-14). For the random forest model, seven antigens were used: SARS-CoV-2 NP, SARS-CoV-2 S1.HisTag, SARS-CoV-2 Spike.RBD.rFc, SARS.CoV.2.S1.mFcTag, SARS.CoV.2.S1.RBD, SARS.CoV.2.S1+S2, SARS.CoV.2.S2.