Introduction

Obesity is associated with increased mortality and is a well-known risk factor for chronic conditions, such as diabetes, hypertension, cardiovascular disease, and cancer [1, 2]. Due to its proinflammatory state that impairs the immune response, obesity has also been related to an increased risk of viral infections [3]. The novel coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), emerged in December 2019 in Wuhan, China, and rapidly spread around the world [4]. This new virus causes a respiratory tract infection with clinical manifestations ranging from asymptomatic/mild symptoms to severe illness requiring intensive services. Partly due to its similarities with other viral infections such as seasonal influenza or H1N1, people with obesity were soon labeled as “at-risk” individuals [5]. Since obesity is a worldwide public health priority, granular information on patients with COVID-19 and obesity is needed to guide preventive strategies as well as to generate hypotheses for etiological studies [6].

A review and meta-analysis of 75 studies reported that obesity is a risk factor for testing positive for SARS-CoV-2, for severe COVID-19 and for COVID-19 related mortality [7]. While undoubtedly relevant to the field, these studies mainly focused on exploring multiple risk factors related to COVID-19 and thus did not offer a detailed characterization of patients with COVID-19 living with obesity. For instance, an exhaustive description of the medical conditions and COVID-19 related outcomes, such as thromboembolic events, among these patients is lacking. Other current limitations include the susceptibility to collider bias of studies reporting “risk factors” of COVID-19 infection and progression due to sampling mechanisms (e.g., subsamples of tested or hospitalized populations) [8]. A large characterization study focussing exclusively on patients with COVID-19 living with obesity using real-world data from different health settings and countries could address the limitations of the previous evidence.

In this study, we aimed to describe and compare the demographics, medical conditions, and outcomes of COVID-19 patients living with obesity (PLWO) to those of COVID-19 patients living without obesity, in inpatient or outpatient settings.

Methods

Study design, setting, and data sources

We conducted a multinational cohort study using routinely collected healthcare data from January to June 2020 from Spain, the United Kingdom (UK), and the United States (US). This study was part of the “Characterizing Health Associated Risks, and Your Baseline Disease In SARS-COV-2 (CHARYBDIS)” study (protocol available for download at https://www.ohdsi.org/wp-content/uploads/2020/07/Protocol_COVID-19-Charybdis-Characterisation_V5.docx) designed by the Observational Health Data Sciences and Informatics (OHDSI) community. All data were standardized to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) [9]. The OHDSI network maintains the OMOP-CDM, along with a wide range of tools developed by its members to facilitate analyses of mapped data [10]. The results for this study were extracted on July 16, 2020.

We included primary, outpatient and inpatient care data from electronic health records (EHRs) and health insurance claims data from six databases. Data from Spain included the Information System for Research in Primary Care (SIDIAP), which includes primary linked to inpatient care data covering approximately 80% of the population in Catalonia, Spain [11]. The UK data covered the Clinical Practice Research Datalink (CPRD), with patients from over 600 general practices in the UK [12]. Data from the US included: Columbia University Irving Medical Center (CUIMC), covering New York-Presbyterian Hospital and its affiliated physician practices; IQVIA Open Claims, which are pre-adjudicated claims collected from office-based physicians and specialists covering over 300 million lives (~80% of the US population); the Stanford Medicine Research Data Repository (STARR-OMOP), with data from Stanford Health Care [13], and the United States Department of Veterans Affairs (VA-OMOP), covering the national Department of Veterans Affairs health care system which serves more than 9 million enrolled Veterans (of whom 93% are male). A more detailed description of the included data sources is available in Supplementary Appendix 1.

Study participants

We included two non-mutually exclusive cohorts of patients: (1) all patients diagnosed with COVID-19 (clinical diagnosis and/or positive test for SARS-CoV-2), and (2) all patients hospitalized with a COVID-19 diagnosis. We considered clinical diagnoses in the definition of COVID-19 cases due to testing restrictions during the first months of the pandemic (e.g., in Spain) [14]. The diagnostic codes used are described in Supplementary Appendix 2. Patients hospitalized with COVID-19 were identified as those having a hospitalization episode along with a clinical diagnosis or positive SARS-CoV-2 test within a time window from 21 days prior to admission up to the end of their hospitalization. We chose this time window to include patients with a diagnosis prior to hospitalization and to allow for a record delay in test results or diagnoses [15]. We included individuals with at least one year of observation time prior to the index date to capture observed baseline characteristics. In the diagnosed cohort, the index date was defined as the date of the COVID-19 clinical diagnosis or the earliest test day registered within seven days of a first positive test, whichever occurred first. In the hospitalized cohort, the index date was the day of hospitalization. Patients were followed from the index date to the earliest of death, end of the observation period, or 30 days [16].
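The cohort rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the CHARYBDIS study code: the function names and arguments are hypothetical, and the window boundaries are assumed to be inclusive, which the text does not state.

```python
from datetime import date, timedelta
from typing import Iterable, Optional

LOOKBACK_DAYS = 21   # window before admission, as stated in the text
FOLLOW_UP_DAYS = 30  # maximum follow-up, as stated in the text


def qualifies_as_covid_hospitalization(
    admission: date, discharge: date, covid_event_dates: Iterable[date]
) -> bool:
    """True if any COVID-19 diagnosis or positive test falls between
    21 days before admission and the end of the hospitalization.
    Inclusive boundaries are an assumption of this sketch."""
    window_start = admission - timedelta(days=LOOKBACK_DAYS)
    return any(window_start <= d <= discharge for d in covid_event_dates)


def follow_up_end(
    index_date: date, observation_end: date, death_date: Optional[date] = None
) -> date:
    """Earliest of death, end of the observation period, or 30 days after index."""
    candidates = [index_date + timedelta(days=FOLLOW_UP_DAYS), observation_end]
    if death_date is not None:
        candidates.append(death_date)
    return min(candidates)
```

For instance, a positive test recorded 16 days before admission would place the admission in the hospitalized cohort, whereas one recorded 26 days before would not.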

Both the diagnosed and hospitalized COVID-19 cohorts were stratified by obesity status: PLWO vs patients living without obesity (hereafter referred to as patients without obesity). Obesity was defined as having an ever-recorded obesity diagnosis (Supplementary Appendix 3) and/or a body mass index (BMI) measurement between 30 and 60 kg/m² and/or a bodyweight measurement between 120 and 200 kg prior to or at the index date. We included upper cut-off thresholds to discard implausible observations. Patients without obesity were those who did not fulfill the obesity definition.
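As a hedged illustration of this definition, the following Python sketch classifies a patient from records dated prior to or at the index date. The `PatientRecord` structure and its field names are hypothetical, and treating the thresholds as inclusive is an assumption; the actual concept sets are those in Supplementary Appendix 3.

```python
from dataclasses import dataclass, field
from typing import List

BMI_RANGE = (30.0, 60.0)         # kg/m², thresholds from the text
WEIGHT_RANGE_KG = (120.0, 200.0)  # kg, thresholds from the text


@dataclass
class PatientRecord:
    """Hypothetical container for records dated prior to or at the index date."""
    has_obesity_diagnosis: bool = False
    bmi_values: List[float] = field(default_factory=list)
    weight_values_kg: List[float] = field(default_factory=list)


def has_obesity(record: PatientRecord) -> bool:
    """Diagnosis and/or a plausible BMI and/or a plausible bodyweight measurement.
    Values above the upper cut-offs are ignored as implausible."""
    bmi_hit = any(BMI_RANGE[0] <= v <= BMI_RANGE[1] for v in record.bmi_values)
    weight_hit = any(
        WEIGHT_RANGE_KG[0] <= v <= WEIGHT_RANGE_KG[1] for v in record.weight_values_kg
    )
    return record.has_obesity_diagnosis or bmi_hit or weight_hit
```

Under this sketch, a patient whose only measurement is a BMI of 75 kg/m² and who has no recorded diagnosis would fall into the group without obesity, because the value exceeds the plausibility cut-off.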

Baseline characteristics and outcomes of interest

Demographics (sex and age) were obtained at the index date. More than 15 000 medical conditions from up to one year prior to the index date were identified based on the Systematized Nomenclature of Medicine (SNOMED) hierarchy, with all descendant codes included [15]. Specific definitions for comorbidities of particular interest were created; the detailed definitions of these variables can be consulted in Supplementary Appendix 3. We report here a list of key comorbidities based on their prevalence in the cohorts of the participating sites, as well as on their clinical relevance to obesity and the COVID-19 research field [17].

Our 30-day outcomes of interest for the diagnosed cohort were hospitalization and fatality. For the hospitalized cohort, the 30-day outcomes were the requirement of intensive services (IS) (identified by a recorded mechanical ventilation and/or tracheostomy and/or extracorporeal membrane oxygenation procedure); respiratory, cardiovascular, thromboembolic, and other events; and fatality.

Data analysis

We described the number of patients included and the prevalence of obesity in each database as well as the demographics, comorbidities, and outcomes as proportions (calculated by the number of persons within a given category, divided by the total number of persons) with their respective 95% confidence intervals (CIs) for each database, by obesity status. To calculate these proportions in each database, we required a minimum count of five individuals to minimize the risk of re-identification of patients. To compare medical conditions across groups, we calculated standardized mean differences (SMDs) [18], which we summarized in Manhattan-style plots. The SMD can be used to compare the prevalence of a dichotomous variable between two groups and is independent of sample size [19]. An |SMD| > 0.1 indicates a meaningful difference in the prevalence of a given condition; in the context of this study, an SMD > 0.1 indicates a higher prevalence in PLWO, whereas an SMD < −0.1 indicates a higher prevalence among patients without obesity. This study was descriptive by nature and, therefore, statistical modeling was out of scope. Differences across the groups compared should not be interpreted as causal effects.
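The quantities described above can be illustrated with a short Python sketch. Two points are assumptions rather than details given in the text: the CI uses the normal (Wald) approximation, since the interval method is not specified, and the SMD uses the commonly cited formula for a dichotomous variable, (p1 − p0) / sqrt((p1(1 − p1) + p0(1 − p0)) / 2). The function names are hypothetical.

```python
import math
from typing import Optional, Tuple

MIN_CELL_COUNT = 5  # minimum count stated in the text


def proportion_with_ci(
    count: int, total: int, z: float = 1.96
) -> Optional[Tuple[float, float, float]]:
    """Return (proportion, lower, upper), or None when the count is below
    the minimum cell count. Wald interval: an assumption of this sketch."""
    if total <= 0 or count < MIN_CELL_COUNT:
        return None
    p = count / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def smd_binary(p_plwo: float, p_without: float) -> float:
    """Standardized mean difference for a dichotomous variable.
    Positive values mean a higher prevalence among PLWO."""
    diff = p_plwo - p_without
    pooled_var = (p_plwo * (1 - p_plwo) + p_without * (1 - p_without)) / 2
    if pooled_var == 0:
        # Both prevalences are exactly 0 or 1; the SMD is 0 if they are equal.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(pooled_var)
```

As a usage example, a prevalence of 32% among PLWO against 16% among patients without obesity gives an SMD of about 0.38 under this formula, well above the 0.1 threshold.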

To ensure data privacy at all times, we employed a federated analysis approach [16]. Following a pre-specified analysis plan, a common analytical code for the whole CHARYBDIS study was developed for the OHDSI Methods library, available at https://github.com/ohdsi-studies/Covid19CharacterizationCharybdis, and was run locally in each database. Individual-level data remained within host institutions and only aggregate results from each database were provided to the research team and publicly shared. All the results reported in this paper and additional data are available for consultation at a dynamic and interactive website, which changes over time as new databases are added and/or results are updated to CHARYBDIS (https://data.ohdsi.org/Covid19CharacterizationCharybdis/).
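The federated pattern described above can be sketched as follows. This is a simplified illustration and not the code in the CHARYBDIS repository: the function names and the `obesity` field are hypothetical. The point is only that the first function runs inside each host institution and returns counts, while the second sees nothing but those counts.

```python
from typing import Dict, Iterable, List


def summarize_locally(patients: Iterable[dict]) -> Dict[str, int]:
    """Runs inside a host institution on individual-level rows.
    Only aggregate counts leave the site, never the rows themselves."""
    n_total = 0
    n_obesity = 0
    for patient in patients:
        n_total += 1
        if patient["obesity"]:  # hypothetical boolean field
            n_obesity += 1
    return {"n_total": n_total, "n_obesity": n_obesity}


def pool(site_summaries: List[Dict[str, int]]) -> Dict[str, int]:
    """Runs at the coordinating center on the aggregate results only."""
    keys = ("n_total", "n_obesity")
    return {key: sum(summary[key] for summary in site_summaries) for key in keys}
```

Because the same function is executed at every site, the aggregates are directly comparable across databases.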

We used R version 3.6 for data visualization. All the data partners obtained Institutional Review Board (IRB) approval or exemption to conduct this descriptive study.

Results

Prevalence of obesity

We included 627 044 diagnosed and 160 013 hospitalized patients with COVID-19 (Table 1). The diagnosed cohort consisted of 122 058 patients from Spain (SIDIAP), 2336 from the UK (CPRD), and 502 650 from the US (CUIMC: 8519; IQVIA-OpenClaims: 466 191; STARR-OMOP: 3328; VA-OMOP: 24 612). The hospitalized cohort included 18 197 patients from Spain (SIDIAP) and 141 816 from the US (CUIMC: 2600; IQVIA-OpenClaims: 133 091; STARR-OMOP: 615; VA-OMOP: 5510). Among diagnosed and hospitalized patients, 207 859 (33.1%; 95%CI: 33.0−33.2) and 63 866 (39.9%, 95%CI: 39.8−40.0) had obesity, respectively. In all databases, the prevalence of obesity was lower among diagnosed patients than among those hospitalized, with differences ranging from 5 (IQVIA-OpenClaims) to 16% (SIDIAP).
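The pooled figures in this paragraph follow directly from the database-level counts. The short Python check below uses only the numbers reported above; the variable names are illustrative.

```python
# Cohort sizes per database, as reported in the text.
diagnosed = {
    "SIDIAP": 122_058,
    "CPRD": 2_336,
    "CUIMC": 8_519,
    "IQVIA-OpenClaims": 466_191,
    "STARR-OMOP": 3_328,
    "VA-OMOP": 24_612,
}
hospitalized = {
    "SIDIAP": 18_197,
    "CUIMC": 2_600,
    "IQVIA-OpenClaims": 133_091,
    "STARR-OMOP": 615,
    "VA-OMOP": 5_510,
}

total_diagnosed = sum(diagnosed.values())
total_hospitalized = sum(hospitalized.values())

# Pooled prevalence of obesity, in percent, from the reported counts.
prevalence_diagnosed = 100 * 207_859 / total_diagnosed
prevalence_hospitalized = 100 * 63_866 / total_hospitalized

print(total_diagnosed, total_hospitalized)
print(round(prevalence_diagnosed, 1), round(prevalence_hospitalized, 1))
```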

Table 1 Demographic characteristics of patients diagnosed and hospitalized with COVID-19 in each database, stratified by obesity status.

Baseline demographics

The sex distribution (proportions and 95% CIs) of the patients is reported in Table 1. Aside from VA-OMOP, in the diagnosed cohort, patients with and without obesity were mostly female. The proportion of females was higher among PLWO compared to patients without obesity in SIDIAP (63% vs 56%), CUIMC (61% vs 56%), and IQVIA-OpenClaims (61% vs 52%), while in VA-OMOP the opposite was observed (13% vs 19%). No differences were observed in CPRD and STARR-OMOP. In the hospitalized cohort, patients without obesity were predominantly male (female ranged from 40 to 49%, VA-OMOP: 4%) but PLWO were still more commonly female in all databases aside from VA-OMOP (range: 51−55%, VA-OMOP: 7%). Differences in the proportion of females between PLWO and patients without obesity ranged from 3 (VA-OMOP) to 15% (CUIMC).

The age distribution in each database is summarized in Table 1 with proportions and their respective 95% CIs and in Fig. 1 with histograms. In the diagnosed cohort, PLWO were slightly older than those without obesity (i.e., the age distribution for PLWO was slightly skewed to the left compared to patients without obesity). This was particularly marked in SIDIAP, where 40% of the PLWO were aged above 65 years and only 20% were so without obesity. Hospitalized patients were older than those diagnosed. In the hospitalized cohorts, PLWO were fairly consistently younger than those without obesity (except for SIDIAP). The proportion of patients aged above 65 ranged from 36 to 63% for PLWO and from 43 to 73% for those without obesity.

Fig. 1: Distribution of age among patients living with and without obesity in each database, stratified by COVID-19 cohort type (diagnosed and hospitalized).

CPRD Clinical Practice Research Datalink, COVID-19 coronavirus disease 2019, CUIMC Columbia University Irving Medical Center, SIDIAP Information System for Research in Primary Care, STARR-OMOP Stanford Medicine Research Data Repository, VA-OMOP United States Department of Veterans Affairs.

Baseline medical conditions

We compared baseline medical conditions of PLWO to those of patients without obesity in the diagnosed and hospitalized cohorts using SMDs, which are summarized in Fig. 2. We depicted the SMDs of 485 (CPRD) to 5050 (VA-OMOP) medical conditions in the diagnosed cohort, and 529 (STARR-OMOP) to 5240 (IQVIA-OpenClaims) in the hospitalized cohort. In both cohorts, medical conditions were largely more frequent among PLWO than patients without obesity.

Fig. 2: Standardized mean differences in conditions among patients living with obesity compared to patients living without obesity in each database, stratified by COVID-19 cohort type (diagnosed and hospitalized).

SMD < 0 means the prevalence was greater in COVID-19 patients living without obesity, SMD > 0 means the prevalence was greater in COVID-19 patients living with obesity. COVID-19 coronavirus disease 2019, CPRD Clinical Practice Research Datalink, CUIMC Columbia University Irving Medical Center, SIDIAP Information System for Research in Primary Care, SMD standardized mean difference, STARR-OMOP Stanford Medicine Research Data Repository, VA-OMOP United States Department of Veterans Affairs.

The distribution of the selected key comorbidities is shown in Fig. 3, and the proportions with their respective 95% CIs and SMDs between PLWO and patients without obesity are available in Supplementary Appendices 4 and 5. In the diagnosed cohorts, PLWO consistently had a higher prevalence of comorbidities compared to those without obesity; these differences were meaningful (i.e., with a SMD > 0.1, which indicates a meaningfully higher prevalence among PLWO) for the majority of comorbidities across databases. For example, while the prevalence of hypertension for PLWO ranged from 30 to 32% in Europe (SIDIAP and CPRD) and from 55 to 81% in the US, in those without obesity it ranged from 12 to 16% and from 26 to 53%, respectively. The SMD for hypertension was above 0.1 in all databases. As in the diagnosed cohort, PLWO hospitalized with COVID-19 had a higher prevalence of comorbidities than those without obesity, and these differences were meaningful for the majority of comorbidities. However, the differences between groups were less obvious. For example, heart disease differed by 20% among those diagnosed in VA-OMOP (PLWO: 60%, without obesity: 40%) and by 9% among those hospitalized (PLWO: 74%, without obesity: 65%); although the SMD was still above 0.1 in all databases.

Fig. 3: Comorbidities at baseline among patients living with obesity compared to patients living without obesity in each database, stratified by COVID-19 cohort type (diagnosed and hospitalized).

Prevalence of comorbidities for COVID-19 patients living with obesity (red) and without obesity (blue) are depicted in overlapped horizontal bars. The gray color is the overlap between groups. E.g., in CPRD, 32% of COVID-19 patients living with obesity and 16% living without obesity have hypertension. Comorbidities with a meaningful difference (|SMD| > 0.1) between patients living with and without obesity are marked with an asterisk (*). COPD chronic obstructive pulmonary disease, COVID-19 coronavirus disease 2019, CPRD Clinical Practice Research Datalink, CUIMC Columbia University Irving Medical Center, SIDIAP Information System for Research in Primary Care, SMD standardized mean difference, STARR-OMOP Stanford Medicine Research Data Repository, VA-OMOP United States Department of Veterans Affairs.

30-day outcomes of interest

The distribution of 30-day outcomes is shown in Fig. 4, and the proportions with their respective 95% CIs and SMDs between PLWO and patients without obesity are available in Table 2. In the diagnosed cohorts, hospitalization rates were higher among PLWO than among those without obesity in all databases. For example, in SIDIAP the proportion of patients hospitalized was 20% for PLWO and 10% for patients without obesity. However, these differences were meaningful (SMD > 0.1) only in three databases: SIDIAP, CUIMC, and STARR-OMOP. In PLWO, fatality ranged from 5 to 12% and was higher than in patients without obesity in SIDIAP and CUIMC (7% vs 3% and 8% vs 5%, respectively), while in CPRD and VA-OMOP it was similar in both groups. SIDIAP was the only database with a meaningful difference in the proportion of fatality.

Fig. 4: A comparison of 30-day events among patients living with and without obesity in each database, by COVID-19 cohort type (diagnosed and hospitalized).

Proportion of outcomes for COVID-19 patients living with obesity (red) and without obesity (blue) are depicted in overlapped horizontal bars. The gray color is the overlap between groups. E.g., in the diagnosed cohort, 20 and 10% of patients living with and without obesity in SIDIAP, respectively, were hospitalized. Outcomes with a meaningful difference (|SMD| > 0.1) between patients living with obesity and patients without obesity are marked with an asterisk (*). ARDS acute respiratory distress syndrome, COVID-19 coronavirus disease 2019, CPRD Clinical Practice Research Datalink, CUIMC Columbia University Irving Medical Center, SIDIAP Information System for Research in Primary Care, SMD standardized mean difference, STARR-OMOP Stanford Medicine Research Data Repository, VA-OMOP United States Department of Veterans Affairs.

Table 2 Occurrence of 30-day events, in % (95%CI), among patients living with and without obesity in each database, by COVID-19 cohort type (diagnosed and hospitalized).

Overall, in the hospitalized cohort, PLWO more frequently had adverse events occurring in the 30 days after the index date than patients without obesity. For example, PLWO required IS and presented with acute respiratory distress syndrome (ARDS) more frequently than patients without obesity in the largest databases: IQVIA-OpenClaims (IS: 13% vs 10%; ARDS: 35% vs 31%) and VA-OMOP (IS: 22% vs 15%; ARDS: 46% vs 41%), whereas in CUIMC and STARR-OMOP percentages were similar. VA-OMOP was the only database with a meaningful difference in the proportion of IS. Similarly, heart failure was more frequent among PLWO than among patients without obesity (CUIMC: 7% vs 3%, IQVIA-OpenClaims: 7% vs 5%, STARR-OMOP: 16% vs 9%, and VA-OMOP: 23% vs 17%); these differences were meaningful in CUIMC and STARR-OMOP. Sepsis, cardiac arrhythmia, and cardiovascular disease events were slightly more frequent among PLWO, although SMDs were below 0.1 in all databases. Acute kidney injury was the only outcome that was more frequent among patients without obesity; however, this difference was not meaningful in any database. As for fatality, there were neither consistent nor meaningful differences between PLWO and patients without obesity in the hospitalized cohort: while it was higher for PLWO in SIDIAP (14% vs 11%), there were no differences in CUIMC (20% vs 21%) or in VA-OMOP (16% vs 18%).

Discussion

In this large cohort study including 627 044 COVID-19 patients from Spain, the UK, and the US, we found that the prevalence of obesity was higher among COVID-19 patients hospitalized (40%) compared to those diagnosed (33%). PLWO diagnosed and hospitalized with COVID-19 were more commonly female, and those hospitalized were younger than patients without obesity. The extraction of more than 15 000 medical conditions revealed that PLWO were more prone not only to obesity-related comorbidities, such as hypertension, heart disease, and type 2 diabetes, but also to more than a thousand other health conditions. After 30 days of follow-up, PLWO presented with higher hospitalization rates and intensive services requirements, although these differences were only meaningful in some databases.

Our study has several strengths, such as its large amount of data. By bringing together harmonized data using a federated approach, we have conducted a large-scale study while respecting the confidentiality of patient records. The international approach of this study is a strong asset given that we are investigating the intersection of two major global threats, namely the obesity epidemic and the COVID-19 pandemic. This international scope, together with the diverse healthcare settings and populations described in this study, increases the generalizability of our findings. Further, we provide a wide overview of the characteristics and outcomes of patients with and without obesity, using data visualization tools to summarize large amounts of medical data. This exhaustive characterization goes far beyond prior studies reporting few comorbidities and supports the generation of new hypotheses that can be tested in future studies. In addition, for the sake of transparency and reproducibility, we have made methods, tools, and all results publicly available. As CHARYBDIS is an ongoing study, results (including longer follow-up time) will be updated and new studies focussing on obesity could be conducted. All of the above has been accomplished through the coordinated efforts of the OHDSI community to provide a rapid response to the COVID-19 pandemic.

Our study also has limitations. First, we cannot exclude a selection bias of COVID-19 cases due to underreporting in the context of testing restrictions and asymptomatic or paucisymptomatic cases that usually do not seek medical care. Additionally, testing policies have varied across countries and time depending on the course of the pandemic. Nevertheless, the inclusion of patients clinically diagnosed (not tested) in different settings likely provided consistency to our data, although it might have introduced false positives. Second, we did not have information on BMI as a continuous variable, which prevented us from investigating the impact of different categories of obesity on COVID-19 outcomes. This might explain the higher proportion of comorbidities and outcomes observed in the US databases, as PLWO from the US might have higher BMIs than those from Europe [20]. In addition, our definition of obesity included diagnoses and measurements recorded at any time prior to or at the index date, and therefore some individuals might have been misclassified due to changes in BMI since the most recently recorded status. However, previous evidence shows that BMI trajectories in adults are relatively stable, with a tendency to increase with age [21]. Therefore, adults with obesity are likely to still have obesity over time. Finally, this study was underpinned by routinely collected data, which can raise concerns about the quality of the data. Some databases are prone to oversampling certain groups of people as a result of how these data are captured (e.g., the Veterans Affairs system historically serves more men than women, routine claims data may only reflect health outcomes in commercially insured populations, etc.). Obesity, comorbidities, and outcomes were assessed based on having a record of a condition/measurement and may therefore be underestimated. In addition, outcomes such as hospitalization or intensive services requirements are also influenced by factors external to the patient’s condition (e.g., bed availability, criteria for admission), which might differ across databases. Even so, the consistency of our findings across databases that differ by setting and country lends credence to the generalizability of our findings.

Given the prevalence of obesity in Spain (24%), the UK (27%), and the US (37%), a high proportion of PLWO among COVID-19 cases was expected [20]. However, the prevalence of obesity among diagnosed COVID-19 patients was higher than in the general population in four databases: SIDIAP (Spain): 30%; CPRD (UK): 42%; CUIMC and VA-OMOP (US): 41 and 47%, respectively, which is suggestive of an increased risk of diagnosis in PLWO. In addition, the prevalence of obesity was higher in hospitalized COVID-19 patients, with an overall prevalence of obesity of 40%, which is in line with three cohort studies from the US that reported that 40, 42, and 48% of inpatients were living with obesity [17, 22, 23]. A large meta-analysis of observational studies reported that obesity is associated with a higher risk of testing positive for SARS-CoV-2 or being diagnosed with COVID-19 as well as of being hospitalized with COVID-19 [7]. While this could be due to an increased vulnerability to SARS-CoV-2 in PLWO, other hypotheses should be considered in future studies. On the one hand, individuals with obesity could be more likely to seek care and be tested for SARS-CoV-2 since they are (presumably) a high-risk population, have multiple comorbidities, and are more prone to respiratory symptoms due to their compromised pulmonary function [2, 7]. On the other hand, given the fact that obesity disproportionately affects disadvantaged populations, potential differential exposures across subpopulation groups should also be explored (e.g., differential occupational risks) [2].

Women predominated among hospitalized patients with obesity, even though obesity rates are similar in both sexes in the three countries [20]. Although male sex is a well-established risk factor for COVID-19-related hospitalization and death, little is known about the role of obesity in COVID-19 outcomes stratified by sex [14, 23, 24, 25]. Recent studies addressing this issue in secondary analyses have reported inconsistent results. A study conducted among UK Biobank participants found that the impact of BMI on COVID-19-related death was higher among females than among males, while others have found a higher effect among males, opposite effects of sex in different age strata, or null differences [26, 27, 28, 29]. Thus, the intersection between sex/gender and obesity in relation to COVID-19 outcomes warrants further investigation. Because sex-stratification was beyond the pre-specified analysis plan of our study, we were unable to report our results by sex, which could have provided valuable insights on the matter. We intend, however, to address this issue in upcoming studies from the CHARYBDIS project.

We also found that hospitalized PLWO were younger than those without obesity. Although younger individuals have a lower risk of infections and complications than older people due to having fewer comorbidities and a stronger immune system, this is not the case for those with obesity [2, 7, 30, 31, 32]. Some authors have postulated that PLWO younger than 60 years could have a greater risk of severe COVID-19 outcomes [33]. PLWO also had many more comorbidities than patients without obesity. Unsurprisingly, the largest differences were observed in obesity-related conditions, such as hypertension, diabetes, and heart disease, which have been identified as risk factors for severe COVID-19 outcomes [14, 17, 25, 34, 35]. However, as our findings revealed, PLWO with COVID-19 differ from patients without obesity in a wider range of medical conditions than previously described. Future etiological studies aiming to disentangle the effect of obesity on COVID-19 outcomes should take this information into account and consider data-driven techniques to account for confounding, such as propensity score estimation and its adjustment methods [18].

Finally, PLWO experienced adverse events more frequently than those without obesity, particularly hospitalization and the requirement of intensive services. Certainly, our results must be interpreted carefully considering the differences in demographics and comorbidities between these groups. Interestingly, in patients hospitalized, we did not observe clear differences in fatality between patients with and without obesity. While two meta-analyses reported that obesity is associated with a higher risk of COVID-19-related mortality, other large observational studies from the US and the UK using finer categories of BMI only found an association with mortality for morbid obesity (BMIs ≥ 35 kg/m² or ≥ 40 kg/m²) [7, 25, 28, 29, 34]. Given the scarcity of evidence regarding the frequency of specific adverse events during hospitalization among PLWO, our findings are of special interest to the field and should be addressed in upcoming etiological studies.

In this large international cohort, we showed that among COVID-19 cases, PLWO were more likely to be female, to have more comorbidities, and to have worse outcomes than patients without obesity. The prevalence of obesity was higher among hospitalized patients with COVID-19 compared to patients diagnosed with COVID-19. Our results may be useful in guiding clinical practice and aiding future preventative strategies for patients living with obesity, as well as in providing useful data to support subsequent etiological studies focussed on obesity and COVID-19.