Medicine
Public Health, Environmental and Occupational Health
Background
As COVID-19 vaccines are developed, methods for identifying vulnerability within groups for prioritized vaccination remain unestablished. This paper presents a novel approach, using a population-based analysis of viral pneumonia vulnerability as an example.
Methods
The analysis employed an anonymous, 16-year population dataset (n = 768,460) consisting of International Classification of Diseases (ICD-9) diagnoses, demographics, and visit dates. The dataset identified those with viral pneumonia and permitted linkage of these individuals to all their associated diagnoses, from which odds ratios and the proportions of disorders before and after the index viral pneumonia diagnosis were calculated.
Results
Females and males had results of differing magnitude. For those with viral pneumonia, the mean number of diagnoses was greater in both the subsample and the whole sample, with associated diagnoses arising about 4 years on average before the viral pneumonia index diagnosis. Within the subsample, compared to those without, the temporal analysis revealed distinct over-representation for those with viral pneumonia at visit one and over the first fifty visits. Further, those with viral pneumonia had diagnoses not represented in the group without viral pneumonia.
Conclusions
The population-based analysis of temporal hyper-morbidity may be a viable and economical approach to identifying viral pneumonia vulnerability. The approach presented in this paper may provide an economical means of identifying vulnerability to COVID-19 in regions where comparable data are available for analysis. Rational approaches may optimize vaccination and help to limit the spread of the disease and to some extent alleviate the health service burden.
Corresponding author: David Cawthorpe, cawthord@ucalgary.ca
With the advent of COVID-19 vaccines and a relatively limited initial supply, decisions are required regarding who should first receive vaccination to best curb the spread of the pandemic. Vulnerable groups, such as healthcare workers and the elderly, are identified based on infection rates and mortality. Yet within groups across populations, there are presently no clearly defined criteria on which to base optimal rationing of the vaccine. This paper presents for consideration one possible set of criteria not yet described in the literature.
Most models predicting viral pneumonia tend to focus on surveillance data.[1][2][3][4][5] Others tend to focus on aspects of the virus species[6][7] or variants within species.[8] This paper presents a novel model based on the analysis of diagnosis data readily available in many healthcare catchments. The model derives from the population-based examination of temporal hyper-morbidity. This approach to morbidity analysis is an emerging field of investigation.[9] For example, the World Psychiatric Association has recently established a comorbidity section based, in part, on work describing the population-based, temporal hyper-morbidity of psychiatric disorder in relation to other disorders, such as cancer and ulcerative colitis.[10][11] This analysis employed a similar standardized, population-based approach to model development. The present model describes viral pneumonia-associated morbidity and provides an example of the temporal hyper-morbidity for all observed disorders in a subsample of those aged less than one year who subsequently did or did not develop viral pneumonia.
In Alberta, Canada, all physicians must bill the provincial government for reimbursement of each patient visit and record at minimum a patient identifier, date of birth, a diagnosis, and visit date. This study employed an anonymous 16-year population dataset (April 1993-November 2010) consisting of International Classification of Diseases (ICD version 9) diagnoses, age, sex, and visit date for all health-seeking individuals in the Calgary Health Zone, Alberta, Canada (ethics ID REB15-1057). The sample (Table 1) was stratified on the basis of sex and age (overall and those less than one year of age), and grouped on the presence (+) or absence (-) of any viral pneumonia (VP). Further, the encrypted unique identifier permitted linking as a group all diagnoses of those with viral pneumonia (VP+).
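The grouping and linkage step described above can be sketched as follows. This is a minimal illustration rather than the study's actual procedure: the record layout, the toy records, and the function names are hypothetical, and the use of ICD-9 category 480 (viral pneumonia) to flag VP+ individuals is an assumption, since the text does not list the codes used.

```python
from collections import defaultdict
from datetime import date

# Hypothetical billing records: (encrypted person id, visit date, ICD-9 code).
records = [
    ("p1", date(1994, 3, 1), "382"),
    ("p1", date(1998, 5, 9), "480"),
    ("p2", date(1995, 7, 4), "493"),
    ("p3", date(1996, 1, 2), "480.1"),
    ("p3", date(1996, 2, 2), "786"),
]


def is_viral_pneumonia(code):
    # Assumption: VP is flagged by ICD-9 category 480 (viral pneumonia).
    return code.split(".")[0] == "480"


def link_diagnoses(records):
    """Group all diagnoses by person, then split people into VP+ and VP-."""
    by_person = defaultdict(list)
    for person_id, visit_date, code in records:
        by_person[person_id].append((visit_date, code))

    groups = {"VP+": {}, "VP-": {}}
    index_dates = {}
    for person_id, visits in by_person.items():
        visits.sort()  # chronological order of visits
        vp_dates = [d for d, code in visits if is_viral_pneumonia(code)]
        if vp_dates:
            groups["VP+"][person_id] = visits
            index_dates[person_id] = vp_dates[0]  # first VP diagnosis = index
        else:
            groups["VP-"][person_id] = visits
    return groups, index_dates


groups, index_dates = link_diagnoses(records)
```

In this toy example, `p1` and `p3` fall in the VP+ group with all of their diagnoses linked, and `index_dates` holds the date of each person's first viral pneumonia diagnosis.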
To examine the overall relationship between the presence or absence of viral pneumonia (VP+ or VP-) and all other main classes of ICD-9 diagnoses in the study population, odds ratios with upper and lower 95% confidence intervals (CI) were calculated separately for females and males (Tables 2a and 2b).
To provide an example for representation, a subsample of those under the age of one year was isolated (Table 1). This subsample was ordered by the date of each visit and diagnosis, with the frequency of each diagnosis tallied by order of visit. For visit #1, Figures 1a and 1b show, for each sex respectively, the ratio of proportions for each diagnosis: the total frequency of the diagnosis divided by the sample size (Table 1) for those with viral pneumonia (numerator: VP+), divided by the same proportion for those without (denominator: VP-). The horizontal line at the y-value of one demarcates equal proportions, with values greater than one indicating a greater proportional frequency of the ICD diagnosis in the numerator (VP+). Figures 2a and 2b show the same calculation in three dimensions for the first 50 visits of those under the age of one year. The x-z lines at the value one similarly demarcate equal ratios of proportions of diagnoses between the groups, with values above one indicating greater frequency proportions within the linked VP+ diagnoses. Figures 2a and 2b truncate at the value 4 for ease of peak and trough comparison; the plateaus signify diagnoses where the [VP+/VP-] ratio is greater than 4, and the full distributions are described in the text. Note that V codes were sequentially ordered in the 1200 range, with the value 12 replacing the letter V. Further, visits without diagnoses, representing pathology, laboratory, or procedures, were coded with the value 1300 for graphical representation, with the total noted in the footnote (*) of Table 1.
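The per-visit ratio of proportions can be sketched as below. This is an illustrative reading of the description, not the study's code: the record layout, the toy data, and the function names are hypothetical. The V-code mapping (V20 becomes 1220) is one interpretation of "the value 12 replacing the letter V", and same-day visits are ordered by code only to keep the example deterministic.

```python
from collections import Counter, defaultdict

NON_DIAGNOSIS = 1300  # plotting value for visits without a diagnosis


def numeric_code(code):
    """Map a 3-digit ICD-9 code to a plotting value (assumed: V20 -> 1220)."""
    code = code.strip().upper().split(".")[0]
    if not code:
        return NON_DIAGNOSIS
    if code.startswith("V"):
        return 1200 + int(code[1:])
    return int(code)


def visit_ratios(visits, group_of, visit_number):
    """Return {code: (VP+ proportion) / (VP- proportion)} for the n-th visit.

    visits: (person_id, ISO date string, ICD-9 code) tuples.
    group_of: person_id -> "VP+" or "VP-".
    Only codes present in both groups get a ratio.
    """
    per_person = defaultdict(list)
    for person_id, when, code in visits:
        per_person[person_id].append((when, code))

    counts = {"VP+": Counter(), "VP-": Counter()}
    for person_id, rows in per_person.items():
        rows.sort()  # ISO dates sort chronologically
        if len(rows) >= visit_number:
            code = numeric_code(rows[visit_number - 1][1])
            counts[group_of[person_id]][code] += 1

    sizes = Counter(group_of.values())  # group sample sizes
    ratios = {}
    for code in counts["VP+"].keys() & counts["VP-"].keys():
        p_pos = counts["VP+"][code] / sizes["VP+"]
        p_neg = counts["VP-"][code] / sizes["VP-"]
        ratios[code] = p_pos / p_neg
    return ratios


# Hypothetical toy data: two VP+ and four VP- individuals.
visits = [
    ("a", "1994-01-05", "382"), ("a", "1994-02-10", "493"),
    ("b", "1994-03-01", "382"), ("c", "1994-01-20", "382"),
    ("d", "1994-04-02", "493"), ("e", "1994-05-15", "V20"),
    ("f", "1994-06-30", "493"),
]
group_of = {"a": "VP+", "b": "VP+", "c": "VP-",
            "d": "VP-", "e": "VP-", "f": "VP-"}
first_visit = visit_ratios(visits, group_of, 1)
```

Here code 382 appears at visit #1 for both VP+ individuals (proportion 2/2) and for one of four VP- individuals (proportion 1/4), giving a ratio of 4. Codes seen in only one group receive no ratio, which mirrors how unmatched diagnoses are reported separately in the results.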
Table 1 describes, for males and females of all ages and for those less than one year of age, the counts of unique VP+ and VP- individuals and linked diagnosis frequencies, as well as means with standard deviations (SD).
Age group | Group | Measure | Female | Male |
---|---|---|---|---|
All Ages | VP- | UID | 304505 | 267498 |
All Ages | VP- | Diagnoses* | 36510566 | 20255378 |
All Ages | VP- | mean (SD) | 120 (125) | 76 (99) |
All Ages | VP+ | UID | 111821 | 84636 |
All Ages | VP+ | Diagnoses* | 25082428 | 13998139 |
All Ages | VP+ | mean (SD) | 224 (196) | 165 (181) |
< 1 Year | VP- | UID | 12775 | 20326 |
< 1 Year | VP- | Diagnoses* | 139689 | 249063 |
< 1 Year | VP- | mean (SD) | 11 (9) | 13 (10) |
< 1 Year | VP+ | UID | 5125 | 8867 |
< 1 Year | VP+ | Diagnoses* | 69178 | 138234 |
< 1 Year | VP+ | mean (SD) | 14 (12) | 16 (19) |
* Included 11,629,494 unspecified non-diagnosis counts for Pathology/Laboratory/Procedures
Tables 2a and 2b show, by sex, the cell sizes (a, b, c, d) and odds ratio (OR), along with the lower and upper 95% confidence intervals for each ICD main class in descending order. In each ICD-9 main class, the lower 95% confidence interval is greater than the value one, indicating over-representation of VP+-linked disorders within ICD classes for each sex. The lower limit of the 95% confidence intervals was greater in males than in females in the following main ICD classes: neoplasms, blood/blood organs, complications of pregnancy, congenital anomalies, perinatal, and HIV. The upper limit of the 95% confidence intervals in males was less than that of females in the following main ICD classes: endocrine, etc., mental disorders, nervous system/sense organs, digestive system, skin and subcutaneous tissue, musculoskeletal system connective tissue, ill-defined conditions, injury and poisoning, V codes, mental disorder combined with relevant V codes, and other respiratory diseases. The overall order of main ICD classes was similar for females and males, with the greatest magnitude being in males for diseases and disorders of blood/blood organs, and with males having had greater complications of birth and pregnancy (e.g., fetal distress).
ICD Main Class | a | b | c | d | OR* | Lower 95%CI | Upper 95%CI |
---|---|---|---|---|---|---|---|
Ill Defined Conditions | 33209 | 271296 | 2175 | 109646 | 6.17 | 5.91 | 6.45 |
Respiratory System | 58683 | 245822 | 4600 | 107221 | 5.56 | 5.4 | 5.74 |
Other Respiratory Diseases | 65105 | 239400 | 6778 | 105043 | 4.21 | 4.11 | 4.33 |
Nervous System/Sense Organs | 90345 | 214160 | 12098 | 99723 | 3.48 | 3.41 | 3.55 |
Injury And Poisoning | 78576 | 225929 | 10508 | 101313 | 3.35 | 3.28 | 3.43 |
V Codes | 31778 | 272727 | 3794 | 108027 | 3.32 | 3.21 | 3.43 |
Skin And Subcutaneous Tissue | 93041 | 211464 | 15042 | 96779 | 2.83 | 2.78 | 2.88 |
Musculoskeletal System Connective Tissue | 93513 | 210992 | 15388 | 96433 | 2.78 | 2.73 | 2.83 |
Digestive System | 163436 | 141069 | 34957 | 76864 | 2.55 | 2.51 | 2.58 |
Infectious/Parasitic | 131878 | 172627 | 26536 | 85285 | 2.46 | 2.42 | 2.49 |
Mental Disorder with associated V Codes | 118534 | 185971 | 23488 | 88333 | 2.4 | 2.36 | 2.44 |
Mental Disorders | 124260 | 180245 | 25358 | 86463 | 2.35 | 2.31 | 2.39 |
Circulatory System | 189254 | 115251 | 48397 | 63424 | 2.15 | 2.12 | 2.18 |
Genitourinary System | 75375 | 229130 | 14783 | 97038 | 2.16 | 2.12 | 2.2 |
Endocrine, Nutritional, Metabolic Immune System | 192767 | 111738 | 50810 | 61011 | 2.07 | 2.04 | 2.1 |
Blood/Blood Organs | 258772 | 45733 | 82741 | 29080 | 1.99 | 1.96 | 2.02 |
HIV | 198777 | 105728 | 58261 | 53560 | 1.73 | 1.7 | 1.75 |
Neoplasms | 193759 | 110746 | 57012 | 54809 | 1.68 | 1.66 | 1.71 |
Congenital Anomalies | 287414 | 17091 | 102593 | 9228 | 1.51 | 1.47 | 1.55 |
Perinatal | 283998 | 20507 | 102294 | 9527 | 1.29 | 1.26 | 1.32 |
Complications of Pregnancy | 224084 | 80421 | 80766 | 31055 | 1.07 | 1.06 | 1.09 |
*Odds Ratio = [(a*d)/(c*b)]
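As a worked check of the footnote formula, the sketch below computes the odds ratio and a 95% confidence interval from the four cell counts. The text does not state how the intervals were derived; the log-based (Woolf) method used here is an assumption, although it reproduces the first row of the table above. The function name is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio (a*d)/(c*b) with a log-based (Woolf) confidence interval.

    Assumption: the standard error of ln(OR) is sqrt(1/a + 1/b + 1/c + 1/d);
    the source does not name the interval method it used.
    """
    odds_ratio = (a * d) / (c * b)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Cell counts from the "Ill Defined Conditions" row of the table above.
odds_ratio, lower, upper = odds_ratio_ci(33209, 271296, 2175, 109646)
print(f"{odds_ratio:.2f} ({lower:.2f}, {upper:.2f})")  # 6.17 (5.91, 6.45)
```

Because the interval is symmetric on the log scale, the lower and upper limits are not equidistant from the odds ratio itself.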
ICD Main Class | a | b | c | d | OR* | Lower 95%CI | Upper 95%CI |
---|---|---|---|---|---|---|---|
Respiratory System | 68105 | 199393 | 5135 | 79501 | 5.29 | 5.13 | 5.45 |
Ill Defined Conditions | 45616 | 221882 | 3479 | 81157 | 4.8 | 4.63 | 4.97 |
Other Respiratory Diseases | 75742 | 191756 | 7657 | 76979 | 3.97 | 3.87 | 4.07 |
Nervous System/Sense Organs | 96971 | 170527 | 13395 | 71241 | 3.02 | 2.96 | 3.09 |
V Codes | 65758 | 201740 | 8531 | 76105 | 2.91 | 2.84 | 2.98 |
Injury and Poisoning | 61682 | 205816 | 8410 | 76226 | 2.72 | 2.65 | 2.78 |
Blood/Blood Organs | 248851 | 18647 | 70507 | 14129 | 2.67 | 2.61 | 2.74 |
Skin and Subcutaneous Tissue | 103086 | 164412 | 16386 | 68250 | 2.61 | 2.56 | 2.66 |
Infectious/Parasitic | 135413 | 132085 | 24452 | 60184 | 2.52 | 2.48 | 2.57 |
Digestive System | 155592 | 111906 | 30573 | 54063 | 2.46 | 2.42 | 2.5 |
Musculoskeletal System Connective Tissue | 97838 | 169660 | 17146 | 67490 | 2.27 | 2.23 | 2.31 |
Mental Disorder with Associated V codes | 138296 | 129202 | 27905 | 56731 | 2.18 | 2.14 | 2.21 |
Mental Disorders | 143604 | 123894 | 29660 | 54976 | 2.15 | 2.11 | 2.18 |
Genitourinary System | 186020 | 81478 | 43818 | 40818 | 2.13 | 2.09 | 2.16 |
Circulatory System | 183415 | 84083 | 43226 | 41410 | 2.09 | 2.06 | 2.12 |
Endocrine, Nutritional, Metabolic Immune System | 188328 | 79170 | 46009 | 38627 | 2 | 1.97 | 2.03 |
Complications of Pregnancy | 264140 | 3358 | 82511 | 2125 | 2.03 | 1.92 | 2.14 |
HIV | 187453 | 80045 | 47525 | 37111 | 1.83 | 1.8 | 1.86 |
Neoplasms | 202394 | 65104 | 53390 | 31246 | 1.82 | 1.79 | 1.85 |
Perinatal | 259121 | 8377 | 79994 | 4642 | 1.79 | 1.73 | 1.86 |
Congenital Anomalies | 255067 | 12431 | 78360 | 6276 | 1.64 | 1.59 | 1.7 |
Figure 1 shows, for females (upper) and males (lower), the visit #1 [VP+/VP-] ratios of sample proportions for those with and without VP for each diagnosis. Diagnoses over-represented within the linked diagnoses of the VP+ group are those whose dropline ends above the horizontal line at the y-value of one.
There were ICD diagnoses for which the VP+ group was assigned a diagnosis on visit #1 but for which there was no corresponding VP- diagnosis on which to base a comparison. Ratios were not calculated for ICD-9 diagnoses absent from either the VP+ or the VP- group. The female VP- group for visit #1 contained 49 diagnoses not assigned in the VP+ group (not listed). Similarly, the female VP+ group contained 10 diagnoses not assigned in the VP- group. The ICD codes unique to the VP+ group were as follows: 345, 385, 426, 629, 705, 772, 793, 882, 953, and V code 5.
The male VP- group for visit #1 contained 130 diagnoses not assigned in the VP+ group (not listed). The male VP+ group contained 36 diagnoses not assigned in the VP- group. The ICD codes unique to the VP+ group were as follows: 1, 41, 190, 210, 230, 239, 259, 308, 323, 335, 345, 365, 377, 385, 426, 432, 455, 514, 523, 537, 579, 629, 653, 668, 698, 705, 772, 793, 820, 882, 945, 953, 999, V code 3, V code 5, and V code 50.
Figures 2a and 2b show, for each sex, the three-dimensional variations in the ratios over time for all [VP+/VP-] linked diagnosis frequencies, represented in temporal order by all ICD diagnoses for the first 50 visits for those under the age of one year. The Figure 2 graphics extend the visit #1 ratios shown in Figure 1 across the first 50 visits for the less than one year age group (i.e., no visits for those aged 1 or older are included). As noted in the methods section, the upper limit of the ratio for each diagnosis is truncated at the value four for ease of comparison. The visits were truncated at 50 for ease of visualizing variations across the range of represented ICD diagnoses.
The means, standard deviations (SD), and ranges of the full distributions of diagnosis ratios are described as follows. In the female group aged less than one year, 175 ICD diagnoses had a ratio greater than one (mean 1.43, SD 1.31; upper range limit 16) and arose on average 3.97 (SD 3.65) years before VP+; 70 ICD diagnoses had a ratio greater than 1.25 (mean 1.91, SD 2.0). For females of all ages, ICD disorders preceded VP+ on average by 4.26 (SD 3.60) years.
In the male group aged less than one year, 310 ICD diagnoses had a ratio greater than one (mean 1.4, SD 0.75; upper range limit 9) and arose on average 3.76 (SD 3.56) years before VP+; 166 ICD diagnoses had a ratio greater than 1.25 (mean 1.74, SD 0.93). For males of all ages, ICD disorders preceded VP+ on average by 4.19 (SD 3.58) years.
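The threshold summaries reported above (count, mean, SD, and upper range limit of ratios above one and above 1.25) could be produced along the following lines. The ratio values below are invented for illustration, the function name is hypothetical, and the choice of the sample standard deviation is an assumption because the text does not specify which form was used.

```python
from statistics import mean, stdev


def summarize_above(ratios, threshold):
    """Summarize the [VP+/VP-] ratios that exceed a threshold.

    ratios: {ICD code: ratio}. Returns count, mean, sample SD, and maximum.
    """
    selected = [r for r in ratios.values() if r > threshold]
    if not selected:
        return {"n": 0, "mean": None, "sd": None, "max": None}
    return {
        "n": len(selected),
        "mean": mean(selected),
        # Assumption: sample SD; undefined for a single value.
        "sd": stdev(selected) if len(selected) > 1 else None,
        "max": max(selected),
    }


# Hypothetical ratios for four diagnoses.
toy_ratios = {382: 4.0, 493: 1.5, 786: 0.8, 466: 1.1}
above_one = summarize_above(toy_ratios, 1.0)
above_1_25 = summarize_above(toy_ratios, 1.25)
```

With these toy values, three diagnoses exceed one and two exceed 1.25, so the second summary always describes a subset of the first.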
To summarize, and not surprisingly, those with viral pneumonia have greater morbidity on average than those without, both at all ages and when only those under the age of one year are considered (Table 1). The odds ratio calculations shown in Tables 2a and 2b indicate an overall relationship between viral pneumonia and the main ICD classes in the study population. The odds ratio analysis was based on counts of individuals and did not provide information about the frequency (intensity) of the unique diagnoses within individuals making up each group. For both age groups, the ICD-9 diagnoses arising before the index viral pneumonia diagnosis did so by about 4 years on average. As such, those with this profile of diagnoses in the less than one year of age group would not, on average, have developed viral pneumonia until age four to five.
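The interval by which earlier diagnoses precede the index viral pneumonia diagnosis can be illustrated with a short sketch. The dates are invented, the function name is hypothetical, and the conversion of days to years using 365.25 is an assumption, as the text does not describe how the intervals were measured.

```python
from datetime import date
from statistics import mean


def lead_times_years(visits, index_date):
    """Years between each earlier diagnosis and the index VP diagnosis.

    visits: (visit date, ICD code) pairs for one individual.
    Only diagnoses dated before the index date are counted.
    """
    return [
        (index_date - when).days / 365.25  # assumed day-to-year conversion
        for when, _code in visits
        if when < index_date
    ]


# Hypothetical individual with an index VP diagnosis on 1 January 2000.
toy_visits = [
    (date(1996, 1, 1), "382"),
    (date(1998, 1, 1), "493"),
    (date(2000, 6, 1), "786"),  # after the index date, so excluded
]
leads = lead_times_years(toy_visits, date(2000, 1, 1))
average_lead = mean(leads)
```

In this toy case, the two earlier diagnoses precede the index date by about four and two years, so the average lead time is about three years.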
Examining the sequential diagnoses of those less than one year of age provided evidence of distinct profiles of diagnosis intensity, comparing those with and without viral pneumonia when diagnoses were represented in both groups. When diagnoses were not represented in both groups, the analysis identified additional diagnoses unique to those with and without viral pneumonia.
The population-based comparative analysis demonstrated an overall relationship of viral pneumonia across all ICD classes of disease. Analysis of the temporal order revealed an age-specific range of ICD diagnoses distinguishing those who subsequently developed viral pneumonia. The temporal analysis focused on ICD diagnoses preceding viral pneumonia, identifying the ICD diagnosis profiles of those vulnerable to viral pneumonia. This vulnerability, exemplified in a subsample less than one year of age on visit one (diagnosis) and over the first 50 visits (diagnoses), likely persists across the full range of age groups in the dataset given the greater average frequency of diagnoses in the VP+ group and the consistent period of time that associated ICD disorders precede viral pneumonia in the whole sample.
Where similar data exist at local, regional, and national levels, applying this population-based comparative analysis of temporal hyper-morbidity provides a standardized means of characterizing the vulnerability of those who have to date become infected with COVID-19. Identifying this vulnerability in those not yet infected may assist in prioritizing the most vulnerable for vaccination, an action that may more rapidly alleviate the social burden brought on by the COVID-19 pandemic.
There were several limitations in the present study. Diagnostic precision is a limitation, as not all physicians have the same level of practice competence with respect to the diagnostic formulation. For brevity, the subsample analysis focused on the ICD diagnoses preceding viral pneumonia. Focusing on associated hyper-morbidity following the index viral pneumonia may assist health systems in planning for increased demand for specific service types typical of this viral pneumonia. Lastly, while the sequential order of the first fifty visits was described, this order did not take into account the conditional orders of diagnoses within sets of visits (e.g., given first diagnosis X and second diagnosis Y, etc.). Compared to past analyses, the present analysis somewhat advances the method in terms of representing sequential diagnoses related to visits in time. However, taking into account the varying conditional orders of the temporal order of diagnoses leading to viral pneumonia within clusters of individuals may reveal more specific profiles of vulnerability with much greater precision.
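The conditional ordering described in this limitation (given first diagnosis X and second diagnosis Y) was not part of the present analysis. Purely as an illustration of what such an extension might look like, the sketch below tallies, for each ordered pair of first and second diagnoses, the proportion of individuals who were VP+. All names and data are hypothetical.

```python
from collections import Counter


def pair_profiles(people):
    """Proportion of VP+ individuals for each (first, second) diagnosis pair.

    people: iterable of (ordered list of ICD codes, is_vp_positive).
    Individuals with fewer than two diagnoses are skipped.
    """
    totals, positives = Counter(), Counter()
    for codes, is_vp_positive in people:
        if len(codes) < 2:
            continue
        key = (codes[0], codes[1])  # order matters: (X, Y) differs from (Y, X)
        totals[key] += 1
        if is_vp_positive:
            positives[key] += 1
    return {key: positives[key] / totals[key] for key in totals}


# Hypothetical diagnosis sequences with VP status.
toy_people = [
    (["382", "493"], True),
    (["382", "493"], False),
    (["382", "493"], True),
    (["493", "382"], False),
    (["786"], True),  # only one diagnosis, so not counted
]
profiles = pair_profiles(toy_people)
```

Keeping the pair ordered is the point of the exercise: in the toy data, the sequence 382 then 493 is associated with VP+ in two of three individuals, while the reverse sequence is not.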
In conclusion, there are many pathways of vulnerability to human viral pneumonia in addition to simple transmission from a vector. Human genetic vulnerability[12], while presently rare, is one pathway. Discovering and understanding the complex relationships within and between fields of genetics and metabolomics research hold great potential to prevent and cure, as these relate to viral infection in general and COVID-19 infection specifically.[13][14] However, much integration of existing knowledge into practical application from across these fields of study into health care remains to be accomplished, even in respect to the 1918 influenza pandemic.[15] The present work presents COVID-19-relevant information structured in a rapidly reproducible and universally applicable form to potentially help to curb the COVID-19 pandemic.
This study, involving analysis of anonymous administrative health data, was reviewed and approved by the University of Calgary Conjoint Health Research Ethics Board (CHREB), ID REB15-1057. As the study utilized fully anonymized, pre-existing administrative data, the requirement for individual informed consent was waived by the ethics committee.
The administrative health dataset analyzed for this study was obtained from Alberta Health Services under specific data sharing agreements. Due to privacy regulations and the terms of the data sharing agreement, the raw dataset cannot be made publicly available. Inquiries regarding data access should be directed to the relevant provincial health authority or data custodian.
David Cawthorpe conceptualized, developed, and executed the study and wrote this paper.
No conflicts to declare.
Unfunded.