Dear Editor,

Coronavirus disease 2019 (COVID-19) presents in various ways [1]. Recently in the journal, Legrand and colleagues identified three distinct clinical sub-phenotypes of COVID-19, which may help recognize patients at high risk of deterioration [2]. Earlier work in sepsis has shown that clinical phenotypes may help understand the heterogeneity in disease presentation and inform trial design [3, 4]. The retrospective cohort study of Legrand et al. consisted of 893 patients, of whom 608 were used for cluster analysis after excluding patients with missing data. Their thorough selection yielded 22 candidate variables for cluster analysis, including disease history, demographics, symptoms and concomitant medication.

We aimed to validate the findings by Legrand et al. in our Dutch CovidPredict cohort. This cohort consisted of COVID-19-positive patients admitted to ten teaching hospitals across the Netherlands. COVID-19 was defined as a positive SARS-CoV-2 PCR or a CO-RADS score of at least four [5]. Patients were included between 27 February and 4 December 2020. Approval was granted by the Institutional Review Board of the Amsterdam University Medical Centers (20.131).

We included 2019 patients and used similar candidate variables and the same number of clusters as Legrand et al. [2] (see variables in the supplementary file). In total, 657 patients were treated in the intensive care unit (ICU) or died during the following 21 days of COVID-19. Three sub-phenotypes were identified, which are presented in Fig. 1 (see supplementary Tables 1 and 2 for baseline characteristics, and supplementary Figures 1–3 for cluster characteristics).

Fig. 1

Chord diagrams of the distributions of traits within the three sub-phenotypes in hospitalized patients with COVID-19. In these chord diagrams, the ribbons connect each phenotype to the variables described. The proportion on the circle represents which group is more likely to have these characteristics (age, sex, symptoms, comorbidities and medication). GI: gastro-intestinal, CCD: chronic cardiovascular disease, RAASi: renin–angiotensin–aldosterone system inhibitor

Sub-phenotype 1 (n = 592) mainly included younger (median age 63 [IQR = 53–74]) females (74.5%), characterized by a high prevalence of gastro-intestinal complaints (84.3%) and sputum production (63%). Comorbidities and medication usage were scarce. The rate of the composite outcome of ICU admission or death was relatively low compared to the other groups (24.7%).

Sub-phenotype 2 (n = 876) included more males (80.4%) with a median age of 63 [IQR = 53–73.1] years, few comorbidities and the lowest medication usage of all three groups. Patients presented with fewer symptoms than those in sub-phenotype 1, but the rate of ICU admission or death was higher (31.2%).

Sub-phenotype 3 (n = 551) mostly consisted of older (median age 76 [IQR = 69.1–81.1]) males (80.4%) with multiple comorbidities, mainly diabetes (62.4%), hypertension (87.7%) and other cardiovascular diseases (71.5%), and consequent medication usage. Patients reported fewer symptoms such as dyspnea (67%), headache (8.7%) and myalgia (11.6%). ICU admission and/or 21-day mortality occurred in 43.2% of patients.
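As a rough consistency check on the figures above, the group sizes and composite outcome rates can be combined to recover the cohort totals. The following is a minimal sketch that uses only the numbers reported in this letter. The per-group event counts are back-calculated from rounded percentages and are therefore approximate, and the variable names are illustrative.

```python
# Group size and composite outcome rate (ICU admission or death within
# 21 days, in percent) for each sub-phenotype, as reported in the text.
groups = {
    "sub-phenotype 1": (592, 24.7),
    "sub-phenotype 2": (876, 31.2),
    "sub-phenotype 3": (551, 43.2),
}

# The three groups should add up to the full cohort.
total_patients = sum(n for n, _ in groups.values())

# Approximate event counts: the published percentages are rounded,
# so these are estimates rather than exact patient counts.
estimated_events = {name: n * pct / 100 for name, (n, pct) in groups.items()}
total_events = sum(estimated_events.values())

# Overall composite outcome rate across the whole cohort, in percent.
overall_rate = 100 * total_events / total_patients

for name, events in estimated_events.items():
    print(f"{name}: about {events:.0f} events")
print(f"total patients: {total_patients}")
print(f"estimated total events: {total_events:.1f}")
print(f"overall composite outcome rate: {overall_rate:.1f}%")
```

The group sizes sum to the 2019 included patients, and the estimated number of events lies within about one patient of the 657 reported, which corresponds to roughly one in three patients reaching the composite outcome.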

In line with Legrand et al., sub-phenotype 1 was characterized by a large percentage of women and had the most favorable outcome. Sub-phenotype 3 differentiated itself by an older age together with a higher prevalence of comorbidities and the most unfavorable outcome. The distributions of clinical characteristics were largely comparable to the original study across all sub-phenotypes. Notable differences from Legrand et al. were the relatively low age and percentage of women in sub-phenotype 2. We speculate that some female patients who were clustered as sub-phenotype 2 in the original study were clustered into sub-phenotype 1 in our study, perhaps due to slight differences in the prevalence of baseline characteristics in our more severely ill population.

We believe the main value of these sub-phenotypes lies not in their ability to discriminate between clinical outcomes, but in their potential to help understand disease heterogeneity and to identify more homogeneous patient subgroups that may respond more similarly to certain treatments.

In conclusion, our large multicenter cohort of hospitalized COVID-19 patients showed distributions of characteristics largely similar to those found by Legrand et al., albeit in a more severely ill population. We validated the robustness of these three clinical phenotypes, which are strongly related to clinical outcomes.