Biomedical Journal of Scientific & Technical Research

April, 2021, Volume 35, 2, pp 27460-27468

Research Article

Spatial Variability of COVID-19 First Wave Severity and Transmission Intensity in Spain: The Influence of Meteorological Factors

Beatriz Hervella1, M Yolanda Luna1*, Julio Díaz2, Cristina Linares2 and Fernando Belda1

Author Affiliations

1Spanish Meteorological Agency (AEMET), Madrid, Spain

2National School of Public Health, Carlos III Institute of Health (ISCIII), Madrid, Spain

Received: April 06, 2021 | Published: April 20, 2021

Corresponding author: M Yolanda Luna, Spanish Meteorological Agency (AEMET), C/Leonardo Prieto Castro 8, 28040 Madrid, Spain

DOI: 10.26717/BJSTR.2021.35.005667

Abstract

Within the same country, Spain, with the same cultural aspects and containment policies (without lockdown), why did the disease, given a significant number of infections, spread more intensely in some areas than in others at the start of the COVID-19 first wave? The hypothesis is that meteorological factors, that is, the weather conditions during the outbreak, are relevant and could be used as early indicators of the COVID-19 first wave severity and transmission intensity. This paper presents a model for predicting COVID-19 first wave severity and transmission intensity in Spain from early weather information. The weather explanatory variables were the threshold average temperature and the threshold average absolute humidity, defined as the daily average temperature and daily average absolute humidity averaged over the day on which the number of infections began to grow exponentially and the previous 13 days. Socioeconomic factors were also employed as independent variables. The dependent variables are the maximum daily incidence rate and the outbreak incidence rate doubling speed, defined as the speed at which the daily incidence rate doubles once the number of infections begins to grow exponentially. A principal component analysis and a linear regression approach confirmed the existence of a correlation between the variables. Temperature is the most important driver, followed by absolute humidity, and the correlation found in both cases is negative. A 0.1 ºC / 1 g/m3 increase of threshold average temperature / absolute humidity is associated with a reduction in the natural logarithm of the outbreak incidence rate doubling speed of 0.219 and 0.193, respectively. A 0.1 ºC / 1 g/m3 increase of threshold average temperature / absolute humidity is associated with a reduction in the natural logarithm of the maximum daily incidence rate of 0.253 and 0.222, respectively. The results show that the virus had a harder time intensifying and spreading at warmer temperatures and higher absolute humidity during the first wave.

Keywords: SARS-CoV2; Incident Rate; Outbreak Incident Rate Doubling Speed; Correlation Analyses; Linear Regression

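Abbreviations: WHO: World Health Organization; MIR: Maximum daily Incidence Rate; RDS: outbreak incidence Rate Doubling Speed; AH: Absolute Humidity; GDP: Gross Domestic Product; AGE: Percentage of Population equal to or Older than 60 years; PD: Population Density of most Populated Municipality; MOV: pre-lockdown intra-provincial Movements; PCA: Principal Component Analysis; TT: Threshold average Temperature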

Introduction

All of us are immersed in one of the greatest challenges that humanity has faced in recent years. At the end of December 2019, a new disease appeared in Wuhan, Hubei province, China, and changed everything. Named COVID-19 by the World Health Organization (WHO), this new respiratory infectious disease is caused by a novel coronavirus, SARS-CoV-2, that had not previously been identified in humans. On March 11th, 2020 the World Health Organization declared that COVID-19 could be characterized as a pandemic [1,2] and on March 13th Europe was defined as the epicenter of the pandemic [3]. Spain was one of the most affected countries in the world during the COVID-19 first wave, that is, from March to June [4]. The National Center of Microbiology of the Carlos III Institute of Health declared the first official COVID-19 case in Spain on January 31st in La Gomera, Canary Islands [5]. At the beginning of March the situation worsened with a significant increase in infections, so a nationwide lockdown was imposed on March 14th [2]. Despite the abundance of articles that try to investigate and understand the evolutionary dynamics of the virus, there are still many unknowns. The one that has caught our attention and justifies this study is this: within the same country, Spain, with the same cultural aspects and containment policies (without lockdown), why did the disease, given a significant number of infections, spread more intensely in some areas than in others at the start of the COVID-19 first wave? The proposed hypothesis is that the weather conditions during the outbreak are relevant factors which could be used as early indicators of the COVID-19 first wave severity and transmission intensity. This hypothesis agrees with previous studies reporting that cities with significant COVID-19 outbreaks have very similar climate patterns, with relatively cool and dry environments [6,7], and with others showing that outbreaks of other SARS viruses were significantly associated with temperature and its variations [8,9].

Therefore, this study focuses on the initial moments of the disease spread and, more specifically, on the moment at which infections began to grow exponentially. In Spain this happened under pre-lockdown conditions, that is, without active social distancing measures and with limited mask use. This could reinforce the hypothesis that weather factors matter at this specific moment, because at that time they did not have to compete with more relevant factors, such as policy or sanitary measures, which become more significant once the pandemic is under way [10]. The initial hypothesis is also supported by evidence that COVID-19 spread is favored by four mechanisms that involve meteorological factors. First, low temperatures and low absolute humidity weaken the immune system, favoring the proliferation of infections [11,12]. Second, they promote the persistence of the virus on surfaces and therefore its spread through fomites or direct contact [13,14], although this is now considered a minor mode of transmission [15]. Third, cold temperatures shift habits towards less healthy routines, favoring indoor places where COVID-19 transmission rates are nearly 20 times higher than outdoors [16-18]. Finally, aerosols are one of the main COVID-19 spread modes [16,19] and their dispersion in the air is affected by both variables [20].

However, the existence of a correlation between temperature and humidity and COVID-19 transmission is not yet clear [21], although most studies point towards a negative one [22,23]. Three studies carried out in specific regions of Spain obtained different results: a negative correlation [24,25] and no significant association [26] between COVID-19 and temperature. For this reason, the objective of the present study is to analyze whether the transmission intensity and severity of the COVID-19 first wave are correlated with the average temperature and average absolute humidity in the early moments of the disease outbreak. The final goal is to design a model that allows us to predict the importance of a COVID-19 outbreak from early weather information. Some socioeconomic variables have also been used as controlling factors, in line with other works [27,28], in order to consolidate the results. The analysis covers the 50 Spanish provinces in order to investigate and summarize what happened across the whole country during the COVID-19 first wave.

Data and Methods

The health data were extracted from the National Epidemiological Surveillance Network (RENAVE), provided by the National Center of Microbiology of the Carlos III Institute of Health, from February 1st to May 31st, 2020 for the 50 Spanish provinces. Specifically, the data used are the provincial daily incidence rate per 100,000 inhabitants, that is, the number of new daily positive COVID-19 cases in each province divided by the population at risk of the disease and multiplied by 100,000. Positive COVID-19 cases were confirmed by a positive PCR test in 99.74% of the data. The remainder were diagnosed by symptoms compatible with the disease. With these data, the Maximum daily Incidence Rate (MIR) reached in each of the 50 provinces during the first wave was calculated. This variable, which can be understood as a measure of the COVID-19 first wave severity, showed a highly differentiated spatial distribution (Figure 1). Some provinces had a MIR more than twenty times higher than others, for example Soria (127.5) compared with Huelva (5.4). This map was the starting point of the study, since MIR was our first dependent variable.
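The incidence-rate definition above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the case counts are hypothetical and are not taken from the RENAVE data.

```python
def daily_incidence_rate(new_cases, population):
    """Daily incidence rate per 100,000 inhabitants for a series of daily case counts."""
    return [cases / population * 100_000 for cases in new_cases]


def maximum_daily_incidence_rate(new_cases, population):
    """MIR: the largest daily incidence rate reached over the period."""
    return max(daily_incidence_rate(new_cases, population))


# Hypothetical province with 200,000 inhabitants (illustrative numbers only).
example_cases = [0, 5, 20, 10]
example_mir = maximum_daily_incidence_rate(example_cases, 200_000)
```
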

Figure 1:

a) Spatial distribution and

b) Values of the Maximum daily incidence rate reached in each of the 50 Spanish provinces during the COVID-19 first wave.

The second dependent variable was the speed at which the daily incidence rate doubles once the number of infections begins to grow exponentially. This variable, related to the COVID-19 first wave transmission intensity, was calculated for each province (Figure 2) and is called the outbreak incidence Rate Doubling Speed (RDS). The Spanish Meteorological Agency (AEMET) provided daily weather data, namely daily average temperature and daily average relative humidity, at 50 weather stations, each taken as the reference station for one of the 50 provinces studied. The outbreak average temperature was calculated for each province. The outbreak was taken to begin at the moment the number of infections began to grow exponentially, defined as the day on which the daily incidence rate per 100,000 inhabitants exceeded the value of 5 [29]. The temperature was averaged over that day and the 13 preceding days in order to take into account the longest COVID-19 incubation period [30]. This averaged value is called the threshold average temperature (TT) and is the first explanatory weather variable. The outbreak threshold was exceeded at different times in each of the Spanish provinces: the earliest was Álava on February 28th and the latest Murcia on March 24th. None of them fell after March 27th (the first day of lockdown plus 13 days), which guarantees that our study was carried out under pre-lockdown conditions. The process was repeated with the daily average absolute humidity. For this purpose, the Clausius-Clapeyron equation was used to calculate the daily average Absolute Humidity (AH) in g/m3 from the daily average temperature and relative humidity values [31,32].
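As a minimal sketch of these two steps, the Python below converts temperature and relative humidity to absolute humidity and then averages a daily series over the threshold day and the 13 preceding days. The humidity formula is a widely used Magnus-type approximation of the Clausius-Clapeyron relation; its constants are an assumption and are not taken from this article, and the function names are hypothetical.

```python
import math


def absolute_humidity(temp_c, rel_humidity_pct):
    """Absolute humidity in g/m3 from temperature (ºC) and relative humidity (%).

    Assumed Magnus-type approximation; constants are NOT from the article.
    """
    saturation_vapour_pressure_hpa = 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))
    return saturation_vapour_pressure_hpa * rel_humidity_pct * 2.1674 / (273.15 + temp_c)


def threshold_average(daily_incidence, daily_values, threshold=5.0, window=14):
    """Average daily_values over the first day incidence exceeds threshold plus the prior 13 days.

    Returns None if the threshold is never exceeded.
    """
    for day, rate in enumerate(daily_incidence):
        if rate > threshold:
            start = day - (window - 1)
            if start < 0:
                raise ValueError("not enough days before the threshold day")
            block = daily_values[start:day + 1]
            return sum(block) / len(block)
    return None
```

Applying `threshold_average` to a daily temperature series gives a value analogous to TT, and applying it to the series produced by `absolute_humidity` gives the humidity counterpart.
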

Figure 2:

a) Spatial distribution and

b) Values of the Outbreak incidence rate doubling speed reached in each of the 50 Spanish provinces during the COVID-19 first wave.

In this calculation, T is the daily average temperature in ºC and RH is the daily average relative humidity in %. Once obtained, the daily average absolute humidity was averaged over 14 days, in the same way as the daily average temperature, to obtain the second independent weather variable, called threshold average Absolute Humidity (AH). The National Institute of Statistics provided several socioeconomic factors, which were used as independent variables: Gross Domestic Product (GDP), Percentage of Population equal to or Older than 60 years (AGE), Population Density of most Populated Municipality (PD) and pre-lockdown intra-provincial Movements (MOV), obtained from mobile phone positioning estimates.

Several simple linear regression models were constructed to explore the individual relationships between the independent weather variables (threshold average temperature and threshold average absolute humidity) and the maximum daily incidence rate and outbreak incidence rate doubling speed. The correlation coefficient gave a measure of the linear association obtained. Subsequently, multiple linear regression models were built incorporating all the factors, both meteorological and socioeconomic. A backward elimination technique was applied: all the variables were initially included, and the regressors were then removed one at a time, starting with the one that contributed least, until every remaining regressor was significant enough to be retained.

In order to corroborate the multiple linear regression findings, a Principal Component Analysis (PCA) was carried out. This analysis makes it possible to control and avoid multicollinearity among the predictors and to drop the least important variables. Finally, linear regression models of the dependent variables against the reduced set of principal components were fitted in order to obtain the best and most stable final model. All data were analyzed using the statistical program Statgraphics©.
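The simple regressions described above, with the natural logarithm of the dependent variable as target, can be sketched in plain Python as follows. The authors used Statgraphics; this is only an illustrative ordinary least squares fit, the function name is hypothetical, and the data in the usage lines are synthetic values generated from a known line rather than measurements.

```python
import math


def fit_log_linear(x, y):
    """Least-squares fit of ln(y) = intercept + slope * x.

    Returns (intercept, slope, r), where r is the Pearson correlation
    between x and ln(y).
    """
    ln_y = [math.log(value) for value in y]
    n = len(x)
    mean_x = sum(x) / n
    mean_ln_y = sum(ln_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((li - mean_ln_y) ** 2 for li in ln_y)
    sxy = sum((xi - mean_x) * (li - mean_ln_y) for xi, li in zip(x, ln_y))
    slope = sxy / sxx
    intercept = mean_ln_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r


# Synthetic illustration only: points generated from ln(y) = 5 - 0.2 x.
synthetic_x = [5.0, 8.0, 11.0, 14.0, 17.0]
synthetic_y = [math.exp(5.0 - 0.2 * xi) for xi in synthetic_x]
intercept, slope, r = fit_log_linear(synthetic_x, synthetic_y)
```

The percentage of variability explained by such a model is 100 times the square of `r`.
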

Results

Maximum Daily Incidence Rate

Simple Linear Regression Model: Table 1 shows the regression and correlation coefficients of the two proposed simple linear regression models with threshold average temperature as explanatory variable and two targets: the maximum daily incidence rate and its natural logarithm. Both models show a very strong negative correlation, that is, a higher threshold average temperature is associated with a lower maximum daily incidence rate and vice versa. The best option is the one obtained after applying the natural logarithm (Figure 3a); this model explains 65.58% of the maximum daily incidence rate variability. Threshold average temperatures above 13.02 ºC (99% CI, 11.23 to 14.82 ºC) are associated with a low maximum daily incidence rate of the pandemic, that is, with a maximum daily incidence rate lower than 20. The model with the natural logarithm of the maximum daily incidence rate (Table 1) was used to determine this threshold value. We repeated the same analysis with threshold average absolute humidity as explanatory variable. The best model, again the one for the natural logarithm of the maximum daily incidence rate (Figure 3b), explains 55.53% of its variability (Table 1). A strong negative correlation is again found between the variables. Threshold average absolute humidity values above 7.37 g/m3 (99% CI, 6.41 to 8.33 g/m3) are associated with a low maximum daily incidence rate of the pandemic, that is, with a maximum daily incidence rate lower than 20. The model with the natural logarithm of the maximum daily incidence rate (Table 1) was used to determine this threshold value.
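A threshold value of this kind follows from inverting a log-linear model: if ln(MIR) = intercept + slope × predictor, the predictor value at which the model predicts a given MIR is (ln(MIR) - intercept) / slope. The sketch below shows that inversion. The function name is hypothetical and the coefficients in the usage line are placeholders, not the values of Table 1.

```python
import math


def predictor_threshold(intercept, slope, target_rate):
    """Predictor value at which ln(rate) = intercept + slope * predictor equals ln(target_rate)."""
    if slope == 0:
        raise ValueError("slope must be non-zero to invert the model")
    return (math.log(target_rate) - intercept) / slope


# Placeholder coefficients for illustration only (not the fitted values of the paper).
example_threshold = predictor_threshold(intercept=5.0, slope=-0.2, target_rate=20.0)
```

With a negative slope, predictor values above the returned threshold correspond to predicted rates below the target.
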

Table 1: Regression coefficients and results of Simple Linear Regression models (99% confidence interval, CI) with threshold average temperature (TT) and with threshold average absolute humidity (AH) as explanatory variables, and maximum daily incidence rate (MIR) and its natural logarithm as targets. B0 is the slope and B1 is the intercept.

Figure 3: Regression line (blue), prediction interval (purple) and confidence interval (red) of Maximum daily incidence rate as a function of

a) Threshold average temperature (in 0.1 ºC) and

b) Threshold average absolute humidity (in g/m3) and of the Outbreak incidence Rate Doubling Speed as a function of

c) Threshold average temperature and

d) Threshold average absolute humidity.

Multiple Linear Regression Model: In order to confirm and refine the results obtained, a multiple linear regression model incorporating all the variables was built. The dependent variable is the natural logarithm of the maximum daily incidence rate. The first thing to note is that none of the coefficients for the provincial socioeconomic variables was statistically significant, so these variables (GDP, % of population equal to or older than 60 years, population density of most populated municipality and pre-lockdown intra-provincial movements) were excluded from the final model. A second result is the confirmation of a negative correlation between the maximum daily incidence rate and both threshold average temperature and threshold average absolute humidity, the only two variables that the final model contains. This model explains 68.72% of the maximum daily incidence rate variation and is statistically significant at the 99% confidence level, since the p-value returned by ANOVA is less than 0.01. Table 2 contains the regression coefficients of the obtained model. The model equation obtained has the form Ln(MIR) = B0 + B1 TT + B2 TAH, with the coefficients given in Table 2.

Table 2: Regression coefficients of a Multiple Linear Regression model with threshold average temperature (TT) and threshold average absolute humidity (AH) as explanatory variables and natural logarithm of maximum daily incidence rate as target (MIR). The equation is Ln (MIR) = B0 + B1 TT + B2 AH.

Here, MIR is the maximum daily incidence rate, TT is the threshold average temperature in 0.1 ºC and TAH (written AH in Table 2) is the threshold average absolute humidity in g/m3. The model obtained is an improvement on the simple linear regression models and makes it possible to anticipate, with moderate efficiency, the severity of an outbreak.

Principal Component Analysis: A principal component analysis was carried out to check for multicollinearity between the predictor variables, and thus either to confirm the validity of the model found or to look for an alternative one. The correlation matrix between the model variables (Table 3) clearly indicates the existence of multicollinearity (correlation values greater than 0.5). PCA indicates that there are two significant components (eigenvalue greater than 1) and that these components explain 72.54% of the variability of the data (Table 4). Figure 4 displays the spatial distribution of the two principal components. In the first component (Figure 4a), two areas of different behavior can be observed: the north and center of the country, characterized by low temperature and absolute humidity, and the south, characterized by higher temperature and absolute humidity. This component does not indicate any characteristic related to the socioeconomic variables, whereas the second component (Figure 4b) is related to them, showing different behavior in populated areas in the center (Madrid and its area of influence), the north east (including Barcelona and its area of influence, Valencia and the Balearic Islands), the north (Guipúzcoa and Vizcaya), and two small centers around Sevilla and A Coruña. This interpretation of the components can be corroborated by the coefficients of the equations that define the first and second principal components (Table 5). It is important to highlight that the components are mathematically orthogonal, so the correlation between them is zero, that is, they are uncorrelated. PCA shows that the meteorological variables are almost independent of the socioeconomic variables.
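The component-selection rule used here (keep the components of the correlation matrix whose eigenvalue is greater than 1, then report the share of variance they explain) can be sketched as below. This is not the Statgraphics procedure used by the authors: the eigenvalues are obtained with a textbook cyclic Jacobi rotation scheme, the function names are hypothetical, and the 4 × 4 correlation matrix is a made-up toy example, not Table 3.

```python
import math


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < tol:
                    continue
                # Rotation angle that zeroes the off-diagonal entry a[p][q].
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q (A <- A J)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q (A <- J^T A)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def retained_components(eigenvalues):
    """Number of components with eigenvalue > 1 and the % of variance they explain."""
    kept = [value for value in eigenvalues if value > 1.0]
    return len(kept), 100.0 * sum(kept) / sum(eigenvalues)


# Toy correlation matrix (hypothetical): two pairs of mutually correlated variables.
toy_correlation = [
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]
toy_eigenvalues = symmetric_eigenvalues(toy_correlation)
n_kept, explained_pct = retained_components(toy_eigenvalues)
```

In the toy matrix each correlated pair collapses into one retained component, which mirrors how correlated predictors are grouped into a smaller set of uncorrelated ones.
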
Finally, the set of 6 initial variables is reduced to 2. The first component is the one in which the meteorological factors have the greatest weight; it divides the peninsula into two differentiated areas, with temperature as the most relevant factor. The second component carries the socioeconomic factors and therefore has greater weight in large cities. We explored the relationships between these two new explanatory variables and our target. The results indicate that there is a moderately strong correlation with the first principal component and no evidence of a relationship with the second principal component (Table 6). Our final model, which corrects the effect of multicollinearity and explains 60.72% of the maximum daily incidence rate variability, is:

Table 3: Correlation matrix with the correlation coefficients for our model independent variables.

Table 4: Eigenvalues with variance percentage and accumulated percentage associated.

Table 5: Coefficients of the equations that define the first and second principal components.

Table 6: Regression coefficients and results of Simple Linear Regression models (99% confidence interval) with maximum daily incidence rate, MIR, as target and the first component (PCA 1) and the second component (PCA 2) as explanatory variables.

Figure 4: Spatial distribution of the principal components extracted in the principal components analysis:

a) First one,

b) Second one.

Here, MIR is the maximum daily incidence rate, PD is the population density of most populated municipality, MOV is pre-lockdown intra-provincial movements, AGE is the % of population equal to or older than 60 years, TT is the threshold average temperature in 0.1 ºC and TAH is the threshold average absolute humidity in g/m3. The factors with the greatest weight are threshold average temperature and threshold average absolute humidity, and those with the least are population density of most populated municipality and pre-lockdown intra-provincial movements. A 0.1 ºC increase of threshold average temperature is associated with a reduction in the natural logarithm of the maximum daily incidence rate of 0.253 (99% CI, 0.301 to 0.205). A 1 g/m3 rise in threshold average absolute humidity is associated with a reduction in the natural logarithm of the maximum daily incidence rate of 0.222 (99% CI, 0.270 to 0.174).

Outbreak Incidence Rate Doubling Speed

Simple Linear Regression Model: Table 7 shows the regression and correlation coefficients obtained with threshold average temperature (TT) as explanatory variable and outbreak incidence Rate Doubling Speed (RDS) and its natural logarithm as targets. The best model (Figure 3c), the one obtained after applying the natural logarithm to the outbreak incidence rate doubling speed, explains 57.35% of its variability. It indicates a strong negative correlation between the variables. Threshold average temperatures above 13.69 ºC (99% CI, 11.75 to 15.63 ºC) are associated with a low outbreak incidence rate doubling speed, that is, with an outbreak incidence rate doubling speed equal to or lower than 1. The model with the natural logarithm of the outbreak incidence rate doubling speed (Table 7) was used to determine this threshold value. When threshold average absolute humidity is employed as explanatory variable, the best model (Figure 3d) is again the one obtained after applying the natural logarithm to the outbreak incidence rate doubling speed: it explains 32.85% of its variability (Table 7) and shows a moderately strong negative correlation between the variables. It is important to note that this is the model with the lowest correlation obtained.

Table 7: Regression coefficients and results of Simple Linear Regression models (99% confidence interval, CI) with threshold average temperature (TT) and with threshold average absolute humidity (AH) as explanatory variables, and outbreak incidence rate doubling speed (RDS) and its natural logarithm as targets. B0 is the slope and B1 is the intercept.

Principal Components Regression: As demonstrated above, there is multicollinearity, so the relationship between our target, the outbreak incidence rate doubling speed, and the two principal components was established through simple linear regression models. The results obtained were very similar to those of the previous case: there was a moderately strong correlation with the first principal component and no relationship was found with the second principal component (Table 8). Our final model, which corrects the multicollinearity effect and explains 52.55% of the outbreak incidence rate doubling speed variability, is:

Table 8: Regression coefficients and results of Simple Linear Regression models (99% confidence interval) with outbreak incidence rate doubling speed (RDS) as target and the first component (PCA 1) and the second component (PCA 2) as explanatory variables.

Here, RDS is the outbreak incidence rate doubling speed, PD is the population density of most populated municipality, MOV is pre-lockdown intra-provincial movements, AGE is the % of population equal to or older than 60 years, TT is the threshold average temperature in 0.1 ºC and TAH is the threshold average absolute humidity in g/m3. A 0.1 ºC increase of threshold average temperature is associated with a reduction in the natural logarithm of the outbreak incidence rate doubling speed of 0.219 (99% CI, 0.249 to 0.190). A 1 g/m3 rise in threshold average absolute humidity is associated with a reduction in the natural logarithm of the outbreak incidence rate doubling speed of 0.193 (99% CI, 0.219 to 0.167).
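Because the models are fitted to natural logarithms, each reported reduction can be read as a multiplicative change in the original variable: a reduction of d in ln(Y) multiplies Y by exp(-d). For example, the reported reduction of 0.253 in ln(MIR) corresponds to MIR being multiplied by exp(-0.253), roughly 0.78, that is, about 22% lower. The short Python sketch below applies this conversion to the four reported coefficients; the function and dictionary names are hypothetical.

```python
import math


def percent_change_from_log_shift(delta_ln):
    """Percentage change in Y implied by a shift of delta_ln in ln(Y)."""
    return (math.exp(delta_ln) - 1.0) * 100.0


# Reductions in the natural logarithm as reported in the text, per unit of predictor.
reported_log_reductions = {
    "MIR per unit TT": 0.253,
    "MIR per unit TAH": 0.222,
    "RDS per unit TT": 0.219,
    "RDS per unit TAH": 0.193,
}

percent_changes = {
    label: percent_change_from_log_shift(-reduction)
    for label, reduction in reported_log_reductions.items()
}
```
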

Conclusion

This work has presented a statistical analysis to evaluate whether outbreak average temperature and average absolute humidity could be used as early indicators of the severity and transmission intensity of the COVID-19 first wave in Spain. The existence of a correlation between the two dependent variables and both meteorological and socioeconomic factors has been confirmed. Nevertheless, the socioeconomic factors employed are less important than the weather factors, particularly population density of most populated municipality and pre-lockdown intra-provincial movements. Temperature is the most important driver, followed by absolute humidity, and the correlation found in both cases is negative. A 0.1 ºC / 1 g/m3 increase of threshold average temperature / absolute humidity is associated with a reduction in the natural logarithm of the outbreak incidence rate doubling speed of 0.219 (99% CI, 0.249 to 0.190) and 0.193 (99% CI, 0.219 to 0.167), respectively. A 0.1 ºC / 1 g/m3 increase of threshold average temperature / absolute humidity is associated with a reduction in the natural logarithm of the maximum daily incidence rate of 0.253 (99% CI, 0.301 to 0.205) and 0.222 (99% CI, 0.270 to 0.174), respectively. The correlations obtained agree with the majority of studies carried out [33]. Correlation does not imply causality, but there is some evidence that in Spain the virus had a harder time intensifying and spreading at warmer temperatures and higher absolute humidity during the first wave. These results could also suggest a possible seasonal pattern of the COVID-19 disease. This is the first work presenting a model that allows predicting COVID-19 first wave severity and transmission intensity for the whole of Spain based on early average temperature and absolute humidity. However, this study does not imply that these variables were a primary driver of COVID-19 transmission; more factors must be analyzed. This methodology can be extrapolated to other mid-latitude countries and can help to show why certain areas have had more intense COVID-19 first wave episodes than others. The model obtained could be a useful supplement to help authorities act quickly, taking preventive measures and defining their strategy against COVID-19. Its use is nevertheless limited to future situations in which meteorological factors become relevant again [34,35], that is, when the current political and social restrictions and health measures disappear, the disease becomes endemic and its seasonal pattern shows clearly.

Acknowledgement

The authors gratefully acknowledge Project ENPY 221/20 grant from the Carlos III Institute of Health. The authors also wish to thank the Spanish Meteorological Agency and the Spanish Health Ministry for providing the datasets.

Disclaimer

The researchers declare that they have no conflict of interest that would compromise the independence of this research work. The views expressed by the authors do not necessarily coincide with those of the institutions they are affiliated with.

References
