Dear Editor,

While studies have established risk factors for clinical deterioration in patients with coronavirus disease 2019 (COVID-19) [1, 2] or attempted to identify phenotypes based on expert opinion [3], identifying sub-phenotypes from more easily obtained data could help identify patients at highest risk of clinical deterioration and refine the inclusion of more homogeneous subpopulations in clinical trials. Here, we applied an unsupervised, multivariate clustering algorithm to easy-to-obtain clinical variables to identify COVID-19 sub-phenotypes and examined their association with clinical deterioration.

This retrospective cohort study was performed among adult patients who tested positive for COVID-19 by real-time reverse transcriptase–polymerase chain reaction assay and had a hospital visit between February 28 and March 26, 2020, at eight teaching hospitals of the Assistance Publique-Hôpitaux de Paris. The Institutional Review Board (IRB) of Ile-de-France VII approved the study and waived the need for informed consent from individual patients (DC 2009/CO-15-000). We selected 22 candidate variables for the clustering analysis, including demographic information, disease history, major clinical symptoms, and medications on the day of the positive diagnostic test. The 608 patients with available candidate variables represent the final cohort (Supplementary File). We applied an unsupervised consensus clustering method and determined the optimal number of clusters, which we also refer to as sub-phenotypes (Supplementary File) [4, 5]. We evaluated the association between the sub-phenotypes and clinical deterioration, defined as ICU admission and/or death within 28 days. Eight hundred and ninety-three patients were enrolled (Supplementary File): 50% required hospital admission, 104 (11.6%) were treated in the ICU, and 100 (11.2%) died.
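For readers unfamiliar with consensus clustering, the general idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the base clusterer (k-means with farthest-point initialisation), the 80% resampling fraction, the proportion of ambiguous clustering (PAC) stability score, and the synthetic two-dimensional toy points are all choices made for the example. The actual method is described in the Supplementary File.

```python
import random
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=10):
    """Lloyd's k-means with farthest-point initialisation; one label per point."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus_matrix(points, k, n_resamples=50, frac=0.8, seed=0):
    """Fraction of resamples in which each co-sampled pair shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        idx = rng.sample(range(n), int(frac * n))  # subsample without replacement
        labels = kmeans([points[i] for i in idx], k)
        for (i, li), (j, lj) in combinations(zip(idx, labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if li == lj:
                together[i][j] += 1
                together[j][i] += 1
    return [[1.0 if i == j
             else (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
             for j in range(n)] for i in range(n)]


def pac(matrix, lo=0.1, hi=0.9):
    """Proportion of pairs with ambiguous consensus; lower means more stable."""
    pairs = list(combinations(range(len(matrix)), 2))
    return sum(lo < matrix[i][j] < hi for i, j in pairs) / len(pairs)


# Synthetic toy data (NOT patient data): three well-separated groups of 9 points.
offsets = (0.0, 0.5, 1.0)
points = [(cx + dx, cy + dy)
          for cx, cy in ((0, 0), (10, 0), (0, 10))
          for dx in offsets for dy in offsets]

matrix = consensus_matrix(points, k=3)
# Compare stability across candidate numbers of clusters.
scores = {k: pac(consensus_matrix(points, k)) for k in (2, 3, 4)}
print(scores)
```

In this sketch, a candidate number of clusters is considered stable when most pairs of observations are either almost always or almost never grouped together across resamples, which corresponds to a low PAC.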

We identified three distinct sub-phenotypes of patients seen at the hospitals participating in this study. Major sub-phenotype determinants are illustrated in Fig. 1 and Supplementary Fig. 2. Biological results within each sub-phenotype are presented in Supplementary Table 2.

Sub-phenotype #1 (n = 179) included mostly younger (median age 44 [IQR = 23.4] years) women (74.9%) with few or no comorbidities (mean 0.5/patient) who were rarely on renin–angiotensin–aldosterone system inhibitors (RAASi) (8%). They presented with fever (56%), dyspnea (42%), or cough (78%) and numerous non-respiratory symptoms (mean 3.2/patient), including myalgia (82%), headaches (71%), and gastrointestinal symptoms (54%).

Fig. 1

Chord diagrams highlighting differences in clinical variables across sub-phenotypes. The distributions of age (A), sex (B), symptoms (C), and comorbidities (D) across the sub-phenotypes are shown.

Sub-phenotype #2 (n = 279) included both men (54.1%) and women, with a median age of 53 [IQR = 26.4] years and few or no comorbidities (mean 0.66/patient). Patients were rarely on RAASi (97% were not receiving them). While some had respiratory symptoms (dyspnea 35%, cough 57%), few had non-respiratory symptoms (mean 0.8/patient, i.e., myalgia 15%, headaches 15%, gastrointestinal symptoms 13%).

Sub-phenotype #3 (n = 150) included mostly male (70.7%) older patients (median age 73 [IQR = 19.3] years) with more comorbidities (mean 2.2/patient), pervasive chronic hypertension (94%), and frequent treatment with RAASi (67%). A minority of patients in sub-phenotype #3 presented with fever (23%) or pulmonary symptoms (dyspnea 45%, cough 42%), and they rarely had other systemic symptoms (mean 0.65/patient, i.e., myalgia 13%, headaches 7%, gastrointestinal symptoms 19%).

ICU admission and/or death occurred in 8%, 18%, and 43% of the patients in sub-phenotypes #1, #2, and #3, respectively (Supplementary Fig. 1). ICU admission was required in 7%, 13%, and 29% of patients, and 3%, 9%, and 22% of patients died, in sub-phenotypes #1, #2, and #3, respectively.
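To make the size of this gradient concrete, the reported figures can be combined in a few lines of arithmetic. This is an illustrative sketch only: it uses the rounded percentages quoted in the text, so the results are approximate, and the unadjusted risk ratios it prints are an assumption of this example rather than estimates reported in the study.

```python
# Sub-phenotype sizes and composite-outcome percentages (ICU admission
# and/or death) as quoted in the text; percentages are rounded.
sizes = {"#1": 179, "#2": 279, "#3": 150}
composite_pct = {"#1": 8, "#2": 18, "#3": 43}

# The three sub-phenotypes should add up to the final cohort of 608 patients.
total = sum(sizes.values())

# Size-weighted composite-outcome rate across the final cohort (approximate).
overall_pct = sum(sizes[k] * composite_pct[k] for k in sizes) / total

# Crude (unadjusted) risk ratios relative to sub-phenotype #1.
risk_ratio = {k: composite_pct[k] / composite_pct["#1"] for k in sizes}

print(f"final cohort: {total} patients")
print(f"approximate overall composite-outcome rate: {overall_pct:.1f}%")
for name, rr in risk_ratio.items():
    print(f"sub-phenotype {name}: crude risk ratio vs #1 = {rr:.2f}")
```

On these rounded figures, the crude risk of ICU admission and/or death in sub-phenotype #3 is roughly five times that in sub-phenotype #1.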

In conclusion, we identified three sub-phenotypes, mostly determined by a history of chronic hypertension, the presence of fever, respiratory and non-respiratory symptoms, and age. These sub-phenotypes were strongly associated with clinical deterioration. The results of this clustering analysis should now be validated in other cohorts.