Introduction

COVID-19 is an infectious disease caused by SARS-CoV-2.1 About 80% of people infected with SARS-CoV-2 experience mild to moderate respiratory illness, with fever, cough, and dyspnea as the most common symptoms.1,2 However, growing evidence also indicates significant nonrespiratory complications of COVID-19 (the so-called “post-COVID syndrome”).3,4 COVID-19 is transmitted mainly through the respiratory route after an infected person coughs, sneezes, sings, talks, or breathes.5 Transmission through aerosols and indirect transmission through contaminated fomites have also been documented.6 The risk of COVID-19 transmission is especially high in indoor, poorly ventilated, or crowded environments.6 Social distancing, hand hygiene, wearing a facemask, and improving airflow and ventilation in the living space are the basic preventive measures for limiting COVID-19 transmission.7 Moreover, the transmission dynamics of the COVID-19 pandemic can be slowed down by reducing individual mobility as well as the number of social contacts.5-7

An epidemiologic study reported that there had been 3 waves of the COVID-19 pandemic in Poland by the end of April 2021.8,9 The peak of the first wave is dated June 8, 2020, with a documented number of 599 laboratory-confirmed COVID-19 cases. The peak of the second wave of infections, with 27 875 laboratory-confirmed cases, was reported on November 7, 2020.9 The third wave started in March 2021, with a peak of 34 843 laboratory-confirmed cases reported on April 1, 2021,9,10 making it the most severe wave of the pandemic to date. In the summer of 2020, a relatively low number of daily COVID-19 cases was observed. Between July and August 2020, despite high mobility of people and numerous interpersonal contacts, an average of only 530 new laboratory-confirmed cases of COVID-19 were recorded daily.9 However, the reason for this low incidence has not been investigated so far.

Epidemiologic data on the COVID-19 pandemic in Poland indicate significant regional differences in the number of laboratory-confirmed COVID-19 cases and COVID-19–related deaths. Within a year of reporting the first case of COVID-19, Warmińsko-Mazurskie and Kujawsko-Pomorskie voivodeships became the regions with the highest numbers of new COVID-19 cases and related deaths per 100 000 residents. On the other hand, the lowest number of COVID-19 cases per 100 000 inhabitants was reported in the south-eastern regions of Poland, namely, in Świętokrzyskie, Podkarpackie, and Małopolskie voivodeships.10 It was suggested that regional variations in the dynamics of COVID-19 transmission may be due to differences in population density, urbanization and industrialization levels, as well as the intensity of tourist traffic between voivodeships.2,5,6 Pandemic measures aimed at containing the spread of COVID-19 implemented in individual voivodeships also influence the transmission dynamics.6-8 The same pandemic measures were in force across the country from March to July 2020. On August 8, 2020, Poland was divided into green, yellow, and red zones, and the regionalization was maintained until October 23, 2020. Thus, regional variations in the pandemic measures as well as the level of compliance with these measures in different regions may also have influenced the dynamics of COVID-19 transmission across Poland. Finally, a growing number of reports emphasize the role of environmental factors, such as meteorologic conditions and air pollution, in the spread of the COVID-19 pandemic.11,12

Previous studies indicated that meteorologic conditions may influence the dynamics of COVID-19 transmission both directly, by affecting the viability of SARS-CoV-2,13 and indirectly, by changing social behavior (eg, staying outside, airing the premises, increasing or decreasing social interactions).14 In our previous study, we demonstrated that, at the national level, several meteorologic parameters are associated with increased COVID-19 transmission in Poland, including low temperature, limited sunshine duration, and high relative humidity.15 Considering the differences in meteorologic conditions between the northern and southern parts of Poland (eg, a high temperature amplitude between Dolnośląskie and Podlaskie voivodeships), it can be hypothesized that the impact of these conditions on the dynamics of COVID-19 transmission varies between individual voivodeships. A better understanding of these relationships might considerably facilitate further development of pandemic forecasting models by national and international scientific institutions.15,16 Therefore, in the current study, we aimed to assess the correlation between 6 different meteorologic parameters and the dynamics of COVID-19 transmission in 16 Polish voivodeships. Moreover, we examined the possibility of using meteorologic parameters to forecast the development of the COVID-19 pandemic in Poland.

Patients and methods

Data on the COVID-19 pandemic in Poland

Data on the daily number of laboratory-confirmed COVID-19 cases (total and per voivodeship) as well as related data on the daily number of hospitalizations due to COVID-19 in Poland were collected from epidemiologic reports published on the official website of the Polish Ministry of Health.10 The definition of a COVID-19 case for surveillance in Poland was in line with the definition proposed by the European Centre for Disease Prevention and Control.17 In this study, only epidemiologic data from April to the end of October 2020 were included. This time limit was chosen because, until October 31, 2020, the detection of unique sequences of SARS-CoV-2 RNA by real-time reverse transcription polymerase chain reaction was the only criterion for determining a laboratory-confirmed COVID-19 case. In November 2020, rapid antigen tests were approved for public use (initially in emergency departments), which significantly improved the diagnosis of SARS-CoV-2 infections. The shortening of the time from the collection of nasopharyngeal or oropharyngeal swabs to obtaining and reporting the results of rapid antigen tests might have significantly affected the statistics on the dynamics of COVID-19 transmission. Therefore, data reported after October 31, 2020 were excluded from the study. Moreover, because data on the number of COVID-19–related deaths in individual voivodeships were not reported from April to October 2020, these data were not assessed.

Data on meteorologic conditions

For each of the 16 voivodeships, a single synoptic station of the Institute of Meteorology and Water Management – National Research Institute was selected to determine the effect of weather conditions on regional differences in the transmission dynamics of the COVID-19 pandemic in Poland. For each voivodeship, we selected a representative synoptic station located near the city with the highest number of residents and population density. Such an approach is typically used in climate analyses in Poland. From each station, data on the following 6 meteorologic parameters were obtained for further analysis: daily maximum temperature, daily minimum temperature, variability of daily temperature, sunshine duration, relative humidity, and wind speed.

Due to the large geographic span of Poland, various meteorologic conditions are observed across the country. Therefore, the Białystok synoptic station was selected to represent the meteorologic conditions in the north-eastern part of the country, and the Kraków-Balice synoptic station was chosen to reflect the synoptic situation in southern Poland. These stations were used as representative for both Polish regions in previous climate analyses.18,19 Moreover, both stations have a long record of meteorologic measurements and observations.

Statistical analysis

To investigate correlations between meteorologic parameters and the number of new COVID-19 cases, the cross-correlation function (CCF) was applied separately for each of the 16 voivodeships. A simple correlation describes the relationship between 2 variables: a change in one variable is accompanied by a specific change in the other variable. The direction of change may be the same or opposite, resulting in a positive or an inverse correlation, respectively. Every correlation is characterized by a sign and a form. The correlation coefficient ranges from –1 (perfect inverse correlation) to 1 (perfect positive correlation). In addition to the value of the correlation, the CCF provides information on the time lag at which the correlation reaches its maximum. In this study, we used the CCF method based on a fast Fourier transform with a 95% confidence interval for all meteorologic parameters.
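The idea of a lagged correlation can be illustrated with a minimal Python sketch. It is not the implementation used in this study: for brevity, it computes the Pearson correlation directly for each lag instead of using a fast Fourier transform, the function names (pearson, lagged_correlations, best_lag) and the synthetic series are hypothetical, and the confidence limit of ±1.96/√n is the usual large-sample approximation, assumed here rather than taken from the study.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlations(weather, cases, max_lag):
    """Correlate weather on day t with cases on day t + lag, for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = weather[: len(weather) - lag]  # weather observed `lag` days earlier
        y = cases[lag:]                    # cases reported `lag` days later
        result[lag] = pearson(x, y)
    return result


def best_lag(ccf):
    """Lag at which the absolute value of the correlation is highest."""
    return max(ccf, key=lambda lag: abs(ccf[lag]))


# Synthetic data (hypothetical): cases respond inversely to a weather
# parameter with a built-in delay of 12 days.
weather = [10 + 8 * math.sin(2 * math.pi * t / 30) for t in range(120)]
cases = [0.0] * 12 + [100 - 3 * w for w in weather[:-12]]

ccf = lagged_correlations(weather, cases, max_lag=20)
lag = best_lag(ccf)
limit = 1.96 / math.sqrt(len(weather))  # assumed approximate 95% limit
print(lag, round(ccf[lag], 2), abs(ccf[lag]) > limit)
```

In this synthetic example, the sketch recovers the built-in 12-day delay as the lag with the strongest inverse correlation.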

The relationship between weather conditions and the incidence of new COVID-19 cases was modeled using multiple linear regression. To build the statistical model, a moving 14-day window with a 1-day step was used, and the time lags provided by the CCF method were applied for all meteorologic parameters and voivodeships. The quality of the model was assessed using the absolute value of a correlation coefficient (R) calculated between the modeled and observed values. The greater the R value, the better the model fits the real data and the better its prediction. R takes values between 0.0 (lack of fit) and 1.0 (highest fit). We also adopted the F-test and a 0.05 significance level to assess the statistical significance of the results. Based on R and F-test results, we selected the best data sets for the regression model separately for each voivodeship. A detailed description of the multiple linear regression methods applied for the modeling of COVID-19 incidence by voivodeship is presented in Supplementary material.
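The moving-window procedure can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used in the study (the actual methods are described in Supplementary material): the helper names (solve, fit_window, moving_window_fits), the 2 example predictors, their lags, and the synthetic series are all hypothetical. The sketch fits an ordinary least-squares model with an intercept in each 14-day window and returns R together with the F statistic; comparing F with the critical value at the 0.05 level is left to a statistical package.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_window(rows, y):
    """Least-squares fit with intercept; returns (coefficients, R, F statistic)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    y_mean = sum(y) / len(y)
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    r2 = 1 - ss_res / ss_tot
    p, n = k - 1, len(y)
    f_stat = (r2 / p) / ((1 - r2) / (n - p - 1))
    # With an intercept, R between modeled and observed values equals sqrt(R^2).
    return beta, math.sqrt(max(r2, 0.0)), f_stat


def moving_window_fits(predictors, cases, lags, window=14):
    """Fit one model per window start day, shifting each predictor by its lag."""
    names = sorted(predictors)
    first = max(lags.values())  # earliest day with all lagged values available
    fits = []
    for start in range(first, len(cases) - window + 1):
        days = range(start, start + window)
        rows = [[predictors[n][d - lags[n]] for n in names] for d in days]
        _, r, f_stat = fit_window(rows, [cases[d] for d in days])
        fits.append((start, r, f_stat))
    return fits


# Synthetic data (hypothetical): 60 days, 2 predictors with lags of 10 and 12 days.
tmax = [20 + 5 * math.sin(t / 5) for t in range(60)]
humidity = [70 + 10 * math.cos(t / 3) for t in range(60)]
cases = [50 - 2 * tmax[max(t - 10, 0)] + 0.5 * humidity[max(t - 12, 0)]
         + 0.3 * math.sin(1.7 * t) for t in range(60)]

fits = moving_window_fits({"tmax": tmax, "humidity": humidity}, cases,
                          {"tmax": 10, "humidity": 12})
```

Each element of fits holds the first day of a window, the R value, and the F statistic, which corresponds to the type of information used to select the best data sets for each voivodeship.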

We used a random forest model (ie, an ensemble machine learning method based on constructing multiple decision trees) to estimate the overall effect of different parameters on the number of hospitalizations due to COVID-19. A database with COVID-19 hospitalizations, meteorologic parameters from synoptic stations, mobility measurements, and the level of administrative restrictions was randomly split into a training set (70%) and a testing set (30%). The random forest model was described in detail in our previous study.15
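The random 70%/30% partition can be sketched in Python as shown below. The function name train_test_split, the fixed seed, and the placeholder records are hypothetical and serve only to illustrate the partitioning step; the random forest model itself is not reproduced here.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and cut it into training and testing parts."""
    rng = random.Random(seed)  # fixed seed (assumption) for a reproducible split
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Placeholder records (hypothetical): one entry per day of observation.
records = [{"day": d} for d in range(200)]
train, test = train_test_split(records)
print(len(train), len(test))
```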

Data were analyzed using IBM SPSS Statistics v 25 (IBM, Armonk, New York, United States), R (R Core Team, Vienna, Austria), and MATLAB (MathWorks, Inc, Natick, Massachusetts, United States).

Ethics

The study was conducted in accordance with the principles of the Declaration of Helsinki. Epidemiologic reports on COVID-19 are publicly available on the website of the Polish Ministry of Health.10 The need for patient consent was waived because this study constituted a secondary statistical analysis. Moreover, as we used anonymous datasets, it was not possible to identify any individual study subject by the research team at any stage of the study.

Results

The number of new hospitalizations due to COVID-19 was correlated with daily maximum temperature, daily minimum temperature, variability of daily temperature, sunshine duration, relative humidity, and wind speed (Supplementary material, Figure S1). The random forest model was trained to predict new hospitalizations in Poland based on meteorologic parameters measured at synoptic stations, mobility data, and the level of restrictions (Figure 1). The analysis indicated that the 3 most important parameters are variability of daily temperature, relative humidity, and daily maximum temperature. Lower importance was observed for mobility and stringency index (level of restrictions). The least important parameter was wind speed. The daily maximum temperature and sunshine duration showed an inverse cross-correlation coefficient with a time lag of 10 days (CCF = –0.48 and CCF = –0.45, respectively). For daily minimum temperature and wind speed, the values of CCF were low and near the confidence interval limit (Supplementary material, Figure S1).

Figure 1. Random forest model showing variable importance for predicting hospitalizations due to COVID-19

Abbreviations: Tmax, daily maximum temperature; Tmin, daily minimum temperature; Tvar, variability of daily temperature

Cross-correlations between the number of COVID-19 cases in individual voivodeships and meteorologic parameters, April–October, 2020

The CCF method was used to investigate correlations between the meteorologic parameters and new cases of COVID-19 in each voivodeship. Correlations between the daily number of laboratory-confirmed COVID-19 cases and selected meteorologic parameters are shown in Figure 2. In general, the strongest cross-correlations across all voivodeships were found for relative humidity and daily maximum temperature (CCF = 0.41 and CCF = –0.41, respectively). The incidence of new COVID-19 cases was also correlated with variability of daily temperature (CCF = –0.40) and sunshine duration (CCF = 0.35). For all parameters, a similar time lag of 10 to 14 days was obtained. The CCF values for daily minimum temperature and wind speed were low and near the confidence interval limit. Of note, the correlation between wind speed and the incidence of new COVID-19 cases was significant only for Lubuskie, Wielkopolskie, and Podkarpackie voivodeships. The strongest inverse correlation between daily maximum temperature and the daily number of laboratory-confirmed COVID-19 cases was noted for Dolnośląskie (CCF = –0.46) and Śląskie (CCF = –0.47) voivodeships.

Figure 2. Cross-correlations between the number of new laboratory-confirmed COVID-19 cases and meteorologic parameters: daily maximum temperature (1), daily minimum temperature (2), variability of daily temperature (3), sunshine duration (4), relative humidity (5), and wind speed (6) for the 16 voivodeships of Poland.

Modeling of COVID-19 cases in voivodeships by meteorologic parameters

The distribution of R for significant models for each voivodeship is presented in Supplementary material, Figure S2. Days with exceptionally high correlation values during the study period were also identified, as presented in Figure 3.

Figure 3. Days with the highest agreement between the modeled and observed new cases of COVID-19 (indicated as red stripes for each voivodeship). Black crosses indicate days on which the model was statistically significant.

Our results indicated that weather conditions may have a significant impact on the incidence of new cases of COVID-19 in the different regions of Poland. Therefore, we also examined whether specific meteorologic conditions in one part of the country (ie, Małopolskie voivodeship) were related to an increase in the number of new COVID-19 cases in this region, as compared with another part of Poland (ie, Podlaskie voivodeship) with different weather conditions. Data on new daily laboratory-confirmed COVID-19 cases per 100 000 residents for Małopolskie and Podlaskie voivodeships are presented in Supplementary material, Figure S3. During the first wave of the COVID-19 pandemic, the increase in the daily number of laboratory-confirmed COVID-19 cases per 100 000 residents was comparable between the voivodeships. However, marked differences were observed during the second wave (September–October 2020), which may be related to differences in weather conditions between these voivodeships during this period. The weather conditions (sea-level pressure, temperature, relative humidity, precipitation, and wind speed) in September and October 2020 for the Kraków-Balice and Białystok stations are presented in Supplementary material, Figure S4.

The ratio of the number of new daily cases in area I (Małopolskie) to the number of new daily cases in area II (Podlaskie) is shown in Figure 4. From September 1, 2020 to September 25, 2020, the cumulative number of cases showed a similar trend for both areas. Therefore, the ratio of new cases was almost constant. From September 25, 2020 to October 6, 2020, the ratio of new cases decreased at a constant rate and then again remained constant until October 13, 2020. Subsequently, it increased until October 18, 2020, then remained constant until October 28, 2020, after which it started to decrease rapidly. The characteristic points in Figure 4 (indicated by red columns in the plot) are correlated with characteristic weather conditions that occurred 10 to 14 days earlier in only one of the areas selected in this study (Małopolskie or Podlaskie). First, on September 13, 2020, there was a cold front in area II, leading to a drop in temperature, a reduced amount of sunshine, and higher humidity, while area I remained under the influence of a high-pressure system with calmer weather conditions. These differences coincided with the first decrease in the daily and cumulative numbers of new COVID-19 cases 12 days later (Figure 4). The second example is a very active low-pressure system over southern Poland (area I) on September 27, 2020, which was related to a change in the presented curve 10 days later. Another active low-pressure system was present over area I on October 1, 2020, which correlated with an increase, 12 days later, in the ratio of new confirmed cases in area I to those in area II. The final example is a low-pressure system occurring only over area II on October 18, 2020, which was related to a decline in the curve 10 days later (Figure 4).

Figure 4. The ratio of the number of daily new cases per million citizens in area I (Małopolskie voivodeship) to the number of daily new cases in area II (Podlaskie voivodeship) together with weather conditions related to changes in the dynamics of COVID-19 transmission in these regions (blue arrows indicate the characteristic dates on the curves in the middle of the column for each synoptic map). Black line indicates the daily number of new COVID-19 cases; green line, the cumulative number of new COVID-19 cases from the beginning of the pandemic. Red columns indicate a time lag of 10 to 14 days between weather conditions and changes in the dynamics of COVID-19 transmission.

Assuming that the relative number of cases presented in Figure 4 was affected by local weather conditions, we can conclude that the low-pressure system that was active in area I on September 27, 2020 might have accounted for an increase in the number of new COVID-19 cases in this region, compared with area II. One week after that event, the relative number of new cases was equal to 1.75, whereas it would have been equal to 1.55 had the ratio continued to decrease as in the period from September 25, 2020 to October 6, 2020. Knowing how many new cases were confirmed in area II, we calculated that 1 week after the passage of the low-pressure system, the number of new cases in area I increased from 15 220 to 17 276, that is, by 13%. Given the exponential increase in the number of new confirmed cases, we believe that such an influence of weather conditions may be important in the subsequent weeks. In the presented example, after 2 weeks, it would account for a 28% increase in the total number of new confirmed cases in area I.
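The arithmetic behind these estimates can be retraced with a short Python sketch that uses only the figures quoted above. The variable names are hypothetical, and the final step is an assumption: compounding a 13% weekly excess over 2 weeks is one reading that is consistent with the reported 28%, but the text does not state how that figure was derived.

```python
# Figures quoted in the text.
observed_ratio = 1.75        # area I to area II ratio 1 week after the event
projected_ratio = 1.55       # ratio expected had the earlier decline continued
cases_without_event = 15220  # area I cases implied by the projected ratio
cases_with_event = 17276     # area I cases actually confirmed

# Relative excess estimated from the 2 ratios and from the 2 case counts.
ratio_based = observed_ratio / projected_ratio - 1
count_based = cases_with_event / cases_without_event - 1

# Assumption: the weekly excess of about 13% compounds over a second week.
weekly_excess = 0.13
two_week_excess = (1 + weekly_excess) ** 2 - 1

print(f"{ratio_based:.1%} {count_based:.1%} {two_week_excess:.1%}")
```

Both the ratio-based value (about 12.9%) and the count-based value (about 13.5%) are close to the reported 13%, and the compounded value (about 27.7%) is close to the reported 28%.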

Discussion

To our knowledge, this is the first study to assess the impact of meteorologic conditions on the transmission dynamics of the COVID-19 pandemic in 16 voivodeships in Poland. Of the 6 meteorologic parameters, relative humidity (CCF = 0.41 with a time lag of 10–13 days) and daily maximum temperature (CCF = –0.41 with a time lag of 10–14 days) had the strongest impact on the dynamics of COVID-19 transmission in all voivodeships. The impact of individual meteorologic parameters (value of correlation) on the daily number of laboratory-confirmed COVID-19 cases was comparable between voivodeships, except for wind speed. A comparable impact of meteorologic conditions on the dynamics of COVID-19 transmission was observed in the neighboring voivodeships. Moreover, the inverse correlation between daily maximum temperature and the daily number of laboratory-confirmed COVID-19 cases was the strongest in Dolnośląskie and Śląskie voivodeships. Our analysis also showed that significant meteorologic phenomena (eg, cold fronts or very active low-pressure systems that are typical for the autumn season) occurring in one part of the country can lead to an increase in the daily number of COVID-19 cases at the regional level. Results from the random forest analysis allowed us to assess the overall importance of different meteorologic parameters for the number of hospitalizations due to COVID-19. These findings may be used in pandemic forecasting, both at the national and regional levels. Moreover, we obtained detailed data on the days when meteorologic conditions were favorable for the transmission of the SARS-CoV-2 infection in each of the 16 voivodeships.

Global evidence suggests that COVID-19, like other viral respiratory infections, could be a seasonal disease.12,13,20-22 Human activity as well as seasonal variability in human immune function may also influence the seasonality of COVID-19.23

Most studies examining the associations between weather and COVID-19 focused on whole countries or selected cities.11,12,15,24 However, if relevant meteorologic data are available, the dynamics of COVID-19 transmission may also be assessed for different administrative regions within a country. Our study revealed that the impact of the meteorologic parameters on the incidence of new laboratory-confirmed infections varied between the voivodeships. Moreover, we observed delayed effects (10–14-day lag) of the meteorologic factors, which is in line with the incubation period of SARS-CoV-2 as well as the typical course of COVID-19.1,2

Climate may affect the spread of COVID-19.25,26 It is estimated that the cold season in Southern Hemisphere countries accounted for a 59.7% increase in the total number of COVID-19 cases, while the warm season in Northern Hemisphere countries contributed to a 46.4% reduction of total COVID-19 cases.25 Moreover, it was estimated that a country located 1000 km closer to the equator might record a 33% lower number of cases per million inhabitants.26 However, examples from countries with high average ambient temperatures (eg, Australia and the United Arab Emirates) showed that climate variables alone cannot mitigate the spread of COVID-19 in the absence of pandemic interventions. Globally, 41.2% of all COVID-19 cases and 45% of COVID-19–related deaths have been reported in high-income countries, with the population of high-income countries representing approximately 15% of the world’s population.9 On the other hand, less than 1% of all COVID-19 cases and deaths have been reported in low-income countries, while approximately 9% of the world’s population lives in these countries. This observation suggests that because of limited access to diagnostic tests (real-time reverse-transcription polymerase chain reaction tests) and poor healthcare development, the number of COVID-19 cases and COVID-19–related deaths in low-income countries may be significantly underestimated.

The impact of meteorologic conditions on the dynamics of COVID-19 transmission may result both from a direct effect on SARS-CoV-2 and from human behavior that determines viral transmission. Experimental data showed that SARS-CoV-2 is highly stable in cold environments but sensitive to increased temperature.27 Moreover, SARS-CoV-2 survives longer in environments with lower relative humidity.28 This is in line with the results of our previous study15 based on real-time meteorologic data. However, the relationship between the dynamics of COVID-19 transmission and weather is more complex, because weather also affects social behaviors such as mobility, interactions with other people, spending time indoors, and visits to shopping malls or restaurants.29 Meteorologic conditions may also determine the ventilation of indoor environments, which is crucial for limiting the indoor transmission of SARS-CoV-2. Moreover, socioeconomic differences between regions, for example, high urbanization and industrialization in Śląskie voivodeship or tourist traffic in voivodeships located by the Baltic Sea or the Tatra Mountains, may also affect the regional differences in the transmission dynamics of the COVID-19 pandemic.

Our current study has practical implications for policymakers and medical professionals. Epidemic forecasting is an important part of developing pandemic measures aimed at mitigating the spread of SARS-CoV-2 infections as well as of managing healthcare resources (hospital beds, medical equipment, oxygen). An exponential spread of the COVID-19 pandemic means that even a small increase in the number of new cases at the beginning of a new wave translates to a large difference in the number of patients at the end of that wave. Our data on the days when meteorologic conditions are favorable for the transmission of infections can be used by epidemiologists to forecast the development of the COVID-19 pandemic at the regional level. Moreover, information on weather conditions conducive to SARS-CoV-2 transmission might be included in weather forecasts in the media. The availability of such information for a given voivodeship might help develop specific recommendations for patients at high risk of COVID-19. In addition, predicting hospital bed occupancy depending on meteorologic conditions may provide important information for healthcare facility managers. Our results based on the machine learning model are in line with a similar study on the association between weather data and the COVID-19 pandemic that predicted the mortality rate.16

This study has several limitations. First, only laboratory-confirmed COVID-19 cases were reported in Poland during this time period, and we used data published by the Polish Ministry of Health. However, considering that there are asymptomatic cases and that a growing number of people do not get tested for COVID-19 despite respiratory symptoms, it is possible that the real number of COVID-19 cases in Poland was higher. Second, the recorded outbreaks of the infection in voivodeships (eg, among miners in Śląskie voivodeship) and the local implementation of population screening tests for COVID-19 may have affected the epidemiologic data on the pandemic (especially the number of laboratory-confirmed cases of COVID-19). Third, we did not assess the potential impact of pandemic measures on the dynamics of COVID-19 transmission due to the lack of national data on the effectiveness of the subsequent measures. Moreover, the role of socioeconomic factors and health status in the COVID-19–related mortality rate was not assessed due to the lack of public access to anonymized data on individual patients with COVID-19.

In conclusion, this study showed that weather has a significant impact on the transmission dynamics of the COVID-19 pandemic at the regional level. We estimate that up to 10% to 12% of COVID-19 cases may be related to weather conditions and their impact on the transmission of infectious diseases with a time lag of 10 to 14 days. Relative humidity and daily maximum temperature had the strongest impact on the dynamics of the COVID-19 pandemic in each of the 16 voivodeships in Poland. Moreover, the study revealed delayed effects (10–14-day lag) of the meteorologic factors on the incidence of COVID-19 and the risk of hospitalization for COVID-19. Finally, our study indicates that meteorologic parameters should be included in COVID-19 forecasting models as key factors for improving the accuracy of forecasts at the regional level.