Introduction

SARS-CoV-2, the virus that causes COVID-19, is spreading across the globe at an exponential rate, resulting in a pandemic that poses a great challenge to all of humanity. Much of the world is under lockdown, which is costing people their jobs and pushing the world towards catastrophe. World leaders, the scientific community, doctors and health workers are working relentlessly to handle this menace.

By the end of May 2020, there were a total of 6.05 million cases and 369 k deaths worldwide due to COVID-19 [37]. The pandemic, which started in the Huanan seafood wholesale market of Wuhan, has now spread across the world. The first case was reported on 1 December 2019 in the Hubei Provincial Hospital. The virus spreads through droplets produced when a COVID-19 infected person coughs or sneezes [39].

As W. Edwards Deming rightly commented, ‘You can’t manage what you can’t measure’. A metric that measures the pandemic penetration level in a specific geographic location is therefore an essential tool for handling any pandemic. Simple statistics such as the number of cases and the number of deaths, and comparisons of these statistics across countries, do not suffice. Scientific measurement of pandemic penetration is essential for formulating pandemic risk mitigation plans. Because pandemic penetration depends on many individual factors, a proper metric will help not only the administrators of individual countries but also global organizations such as the WHO, the World Bank and the IMF to provide country-specific advisories. We propose a metric for the severity of pandemic penetration, termed the Pandemic Penetration Index (PPI). The efficacy of such a metric will depend on in-depth exploration and identification of the individual components or factors, estimation of their values and, finally, assignment of suitable weights to these factors.

The objective of the current study is to explore the important causative factors for pandemic penetration and to establish the relevance of these factors based on the available COVID-19 data. Beyond the current COVID-19 pandemic, these findings may be extended to future pandemics faced by humanity. The rest of the paper is structured as follows. Sect. “Literature Survey” presents the details of the literature survey. Sect. “Factors for Pandemic Penetration” explores various factors that contribute to pandemic penetration and establishes their relevance based on statistical analysis of current COVID-19 pandemic data. Finally, Sect. “Summary and Future Work” summarizes the results.

Literature Survey

We analyzed the research papers and articles that have been published about pandemics. Modelling of data is essential to predict the percentage of the population that is in danger and can get infected during a pandemic. The SIR model is useful in predicting the intensity of disease spread [3]: it estimates how much of the population is at risk and how many people can get infected. To establish a metric, we need to identify the factors that contribute to it. The review “Modelling the global spread of diseases: a review of current practice and capability” suggests inclusion and exclusion criteria for such factors [4].
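To make the SIR idea concrete, the following is a minimal sketch of the standard three-compartment model integrated with simple forward Euler steps. It is not the implementation used in [3]; the function name `simulate_sir` and the parameter values (`beta`, `gamma`, population size, step size) are illustrative assumptions.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    beta  : transmission rate per day (assumed value, not from the paper)
    gamma : recovery rate per day (assumed value, not from the paper)
    Returns a list of (susceptible, infected, recovered), one entry per day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = int(round(1 / dt))
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical scenario: beta / gamma = 2.4 in a population of one million.
    run = simulate_sir(1_000_000, 10, beta=0.24, gamma=0.1, days=300)
    peak_day, peak = max(enumerate(run), key=lambda item: item[1][1])
    print(f"Peak infections on day {peak_day}: {peak[1]:.0f}")
    print(f"Share ever infected: {run[-1][2] / 1_000_000:.1%}")
```

The ratio `beta / gamma` plays the role of the basic reproduction number, so the same function can be rerun with a smaller `beta` to see how reduced contact changes the size of the outbreak.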

It is crucial to make use of the available data and to define the thresholds at which influenza is declared a pandemic. The study “Establishing thresholds and parameters for pandemic influenza severity assessment, Australia” set out a protocol for doing so [2]. Preparation for influenza is based on how deadly the virus is, which is measured in different ways, such as the number of people absent from their regular work who reported influenza-like illness (ILI) and the number of intensive care unit admissions per 100 confirmed influenza admissions. This method tracks the activity of the virus, compares it with past records to predict the impact the virus can cause, and alerts the authorities so that precautionary measures can be taken [2]. Numerous models are used to describe the spread of infectious diseases [6]. One of the well-defined models in this field is the “Susceptible-Exposed-Infectious-Recovered” (SEIR) model [42].

At present, governments have mostly relied on lockdowns and restrictions on international travel to tackle the pandemic. Studies suggest that the Basic Reproduction Number (BRN) of SARS-CoV-2 is around 2.4 and that implementing lockdowns can reduce it to 0.9. However, in the worst case, if the BRN rises to 3, it would be challenging to bring it down to 1 through lockdowns [7].
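The size of the reduction implied by these figures can be worked out with a short sketch. It assumes, as a common simplification that is not stated in [7], that the reproduction number scales in proportion to the effective transmission (contact) rate; the function name is hypothetical.

```python
def required_transmission_reduction(current_r, target_r):
    """Fraction by which transmission must fall to move current_r to target_r.

    Assumes the reproduction number is proportional to the transmission rate.
    """
    if current_r <= 0 or target_r < 0:
        raise ValueError("reproduction numbers must be positive")
    if target_r >= current_r:
        return 0.0  # no reduction needed
    return 1.0 - target_r / current_r


if __name__ == "__main__":
    for current, target in [(2.4, 0.9), (3.0, 1.0)]:
        cut = required_transmission_reduction(current, target)
        print(f"BRN {current} -> {target}: transmission must fall by {cut:.1%}")
```

Under this assumption, moving from 2.4 to 0.9 requires cutting transmission by 62.5%, while merely reaching 1 from a starting value of 3 already requires a cut of about two thirds, which illustrates why the worst case is described as challenging.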

Geographic Spreading Centrality (GSC) is a proposed metric of how influential a location is in the global spread of infectious disease through long-range transport, that is, air transport [1]. The variables considered are network topology, geography, traffic structure and individual mobility patterns. As the research is based on the spread of the virus through air transport, the authors collected detailed flight data from the Federal Aviation Administration (FAA) for three years and four months of individual travel itineraries [1]. Furthermore, the systematic review “Effectiveness of travel restrictions in the rapid containment of human influenza: a systematic review” concludes that widespread movement restrictions can delay the spread of the flu but cannot prevent it [5]. We conclude that contact tracing and analysis of mobility patterns are the most important parts of preventing the spread of the virus.

Factors for Pandemic Penetration

In our analysis, we explore various factors that have a significant impact on the spread of SARS-CoV-2 in particular and of any pandemic in general. These factors can be broadly categorized as static or dynamic: static factors are those whose values do not change significantly during the pandemic period, whereas dynamic factors change their values significantly during this period. These factors may further be classified as socio-economic, cultural, environmental and pandemic-specific. The WHO Commission on the Social Determinants of Health has detailed some of the important social determinants of health (see Fig. 1) [40]. Good health also enhances immunity, which in turn contributes to effective handling of the pandemic. Thus, all the social determinants of health are also relevant for pandemics, though these factors do not completely account for pandemic penetration.

Fig. 1 General socio-economic, cultural and environmental conditions [40]

Various possible factors for pandemic penetration were explored and are listed below.

  1. Socio-economic factors: Density of population, law & order, literacy rate, media and access to the internet, Gross Domestic Product (GDP), Global Healthcare Access and Quality Index (HAQ).

  2. Cultural aspects: People’s habits (superstitions, food habits, hygiene), travel (usage of public transport), social contact index (SCI).

  3. Environmental factors: Climatic conditions such as temperature and humidity.

  4. Virus characteristics: R Naught, availability of medicines and vaccine, incubation period of the virus.

  5. Other factors: Lead time for pandemic reach, economic and cultural proximity to the pandemic origin or hotspot location, geo-strategic importance of the pandemic-originating country or hotspot areas.

The significance of some of these individual factors has been established based on the currently available COVID-19 data. Data for the individual factors are sourced from relevant data available in the open domain. This study is a prerequisite for devising a suitable mathematical model for the measurement of pandemic penetration and its subsequent extrapolation. For this study, data from eight countries, namely the USA, Spain, Italy, Germany, Iran, India, China and South Korea, were considered. The chosen countries have dealt with the illness in different ways and have seen mixed results.

Density of Population

This factor is the number of people living per square kilometre of area. If its value is high, the spread of the virus can be high, since it is difficult for the population to maintain social distancing. The current trend shows that SARS-CoV-2 spreads more in metropolitan cities. For the analysis of this factor, we therefore selected the worst-hit cities of the countries under consideration. This gives a better value of the factor for comparison across countries.

In the case of the USA, most of the available data is given by state. We therefore took the ten worst-hit cities of each state [37] and averaged their population figures, which brings us closer to realistic values of the population density. The data for China show that approximately 85% of the virus spread was in Hubei Province [37], so the analysis must focus mainly on that region. There was no significant spread of the virus in the other provinces of the country. Therefore, the population density of the capital cities of the provinces was considered for analysis.

Within a single country, the number of cases varies linearly with the population density of the respective cities and states for most of the countries (see Fig. 2). Across countries, however, the number of SARS-CoV-2 cases does not vary linearly with population density (see Fig. 3). A linear trend can be seen when comparing the USA, Spain, Italy, Germany and Iran (see Fig. 3), but India and South Korea are not affected in proportion to their population density (see Fig. 3).

Fig. 2 Population density vs the selected cities (arranged in the decreasing order of the SARS COV-2 cases) [8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23]

Fig. 3 Population density vs countries (arranged in decreasing order of SARS COV-2 cases)

From this, we can conclude that population density is important to a certain extent. However, there must be other, more critical factors that have a significant influence on the spread of SARS-CoV-2.
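One way to quantify how linear the relationship between population density and case counts is would be a Pearson correlation coefficient. The sketch below is not the analysis performed in this study; the function name is hypothetical and the numbers in the usage example are made-up placeholders, not data from Figs. 2 or 3.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values for illustration only (NOT real measurements).
    density_per_km2 = [1500, 3000, 4500, 6000]
    confirmed_cases = [2000, 4100, 5900, 8000]
    print(f"r = {pearson_r(density_per_km2, confirmed_cases):.3f}")
```

A value of r close to 1 would support a linear relationship, whereas a value near 0, as might be expected when countries such as India and South Korea are included, would indicate that other factors dominate.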

By the number of SARS-CoV-2 cases up to mid-July, India had moved up to second position among the countries considered for analysis, with 1 million cases, followed by Iran (0.26 M cases). The USA remained in first position. This suggests that population density has an effect on the spread of the virus.

Global Healthcare Access and Quality Index (GDP & HAQ)

GDP does not have any direct effect on the spread of the virus. However, countries with high GDP can inject more money into the health sector at a time of crisis and procure equipment such as masks, PPE and ventilators (in the case of SARS-CoV-2), which is essential for fighting the virus. A high GDP is therefore an advantage for countries in reducing the impact of a virus on their population.

The HAQ index measures healthcare quality and ease of access to healthcare in various countries. It is published by the Institute for Health Metrics and Evaluation (IHME) [24]. From the data, it can be concluded that excellent healthcare facilities in a country cannot prevent SARS-CoV-2 from spreading. However, they can reduce the mortality rate.

In Germany, South Korea, Iran and China, the mortality rate is low compared with Spain and Italy, although the HAQ of all these countries is almost equal (see Fig. 4). This suggests that other factors play a more significant role in determining the mortality rate. In India, the mortality rate is lower because the country had enough time to prepare for the worst.

Fig. 4 HAQ index and mortality rate vs the selected countries (arranged in the decreasing order of the SARS COV-2 cases)

Average Age and Life Expectancy of the Population

The average age of a country’s population was found to be a potential factor in the high mortality rate due to SARS-CoV-2. The average age and life expectancy [26] of each country are compared with the mortality rate, calculated as the total number of deaths divided by the total number of confirmed SARS-CoV-2 cases [37].
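The mortality rate defined above is a simple ratio, sketched here as a small helper. The function name is hypothetical; the usage example applies it to the global totals quoted in the Introduction (6.05 million cases and 369 k deaths by the end of May 2020 [37]) rather than to any per-country figure.

```python
def mortality_rate_percent(deaths, confirmed_cases):
    """Deaths as a percentage of confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    if deaths < 0 or deaths > confirmed_cases:
        raise ValueError("deaths must lie between 0 and confirmed_cases")
    return 100.0 * deaths / confirmed_cases


if __name__ == "__main__":
    # Global totals quoted in the Introduction for the end of May 2020 [37].
    rate = mortality_rate_percent(deaths=369_000, confirmed_cases=6_050_000)
    print(f"Global mortality rate: {rate:.2f}%")
```

For these totals the ratio comes to roughly 6.1%. Because the denominator counts only confirmed cases, the value depends on how widely each country tests, which is worth keeping in mind when comparing countries.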

It can be observed that the life expectancy and the median age of the population affect the mortality rate (see Fig. 5). Italy and Spain in particular have a high mortality rate, as their life expectancy and median age are higher than those of the other countries. In the case of South Korea, however, government policies were very successful in containing the virus, and hence it is the least affected country.

Fig. 5 Mortality rate, average age and life expectancy vs countries (arranged in decreasing order of SARS COV-2 cases)

Usage of Public Transport

Usage of public transport is an essential factor to be considered for modelling [25]. The higher the usage of public transport, the higher the expected spread of the virus, because it is difficult to maintain social distance in any public transportation system. People come into close contact with each other, which makes transmission of the virus likely.

However, the data do not directly support this prior assumption (see Fig. 6). The usage of public transport in the USA is low compared with the other countries, although it has the greatest number of SARS-CoV-2 cases (see Fig. 6) [25].

Fig. 6 Percentage of population which uses public transport vs countries (arranged in decreasing order of SARS COV-2 cases)

In contrast, the usage of public transport is quite high in countries like Spain and Germany (see Fig. 6) [25], where the number of infected cases is also high. The number of SARS-CoV-2 cases is not wholly dependent on public transportation, however; the spread of the virus depends on many other factors. The policies adopted by the government at the initial stage of the spread matter a great deal: if lockdowns are implemented properly and in a timely fashion, the weight of public transport usage is almost nullified.

In the case of India, the rate of increase of SARS-CoV-2 cases showed a sudden spike once the lockdown was lifted. Unlock 1 in India started on 1 June, and the cases reported since then account for 90% of the total [41]. India has the highest usage of public transport (see Fig. 6), and hence, when the lockdown was lifted and people started travelling, this contributed to the rise in SARS-CoV-2 cases.

People Travelling to the Affected Country and the Residents of the Affected Country Travelling to Other Countries

This factor is an index based on travel records between different countries. The affected country is China, as the SARS-CoV-2 outbreak started there, so we calculated the travel index with China for each of the countries under consideration [27,28,29]. Some of the data on travel to China from countries such as Spain, Italy, Germany and Iran were not available, so we added 0.5 million to each of them to keep them comparable with the countries whose data were available. The data on travel of Chinese people from China to other countries are readily available [28] (see Fig. 7).

Fig. 7 Number of people travelling in and out of China vs countries (arranged in decreasing order of SARS COV-2 cases)

The number of cases appears to be proportional to the volume of travel between the respective countries and the hotspot, China (see Fig. 7). Spain and South Korea show contrasting results, as they do not match the trend (see Fig. 7). In the case of South Korea, an immediate lockdown and sealing of the international borders helped to contain the virus to a great extent. The higher number of cases in Spain may be attributed to greater movement to and from other hotspot countries such as Germany and Italy [38], if not China.

This factor is one of the most critical elements in forming the pandemic penetration index as it has a considerable impact on the overall cases in the country.

Another sub-factor is the time of year when people travel the most. It can be observed that most of the European countries are severely affected [29] (see Fig. 8).

Fig. 8 Percentage of travel of a year vs month

This is because of two reasons:

  1. There is heavy travel among these nations.

  2. Their residents travel to China in large numbers in February, March and April (see Fig. 9), which unfortunately coincided with the peak of the pandemic.

Fig. 9 No. of cases in a country vs date

Lead-time

The preparedness of a country for a pandemic is largely influenced by the amount of time it has to prepare itself on various fronts. With more time, a country can gather and analyse a large amount of data about the characteristics of the virus, which in turn helps it to assess the severity of the virus and find ways to contain it. We carried out a detailed survey of the start of the pandemic by gathering the number of SARS-CoV-2 cases by date and by country [37]. As our analysis targets the growth of the pandemic in its initial stages, the analysis period was taken from 26 February to 25 March, and the number of cases was compared among the countries considered (see Fig. 9).

It can be observed that the countries which saw a spike in the number of cases early on tend to have a higher mortality rate than the other countries (see Figs. 4 and 9). India saw a spike in the number of cases at a later stage (20 April 2020), so it had significant time to assess the situation and take the correct measures to contain the virus. It also benefited from knowledge of the medical treatments with enough potential to cure patients. As a result, India kept its mortality rate quite low compared with the other countries, at 3.09%, despite its HAQ index being the lowest among the countries under consideration.

By the number of SARS-CoV-2 cases up to mid-July, India had moved up to second position among the countries considered for analysis, with 1 million cases, but its mortality rate was low (2.5%) compared with other affected countries. From June to mid-July, Brazil and Russia emerged as new SARS-CoV-2 hotspots, with 2.01 M and 0.759 M cases respectively [37], but their mortality rates also remained low, at 3.8% and 1.6% respectively [37].

Thus, we can conclude that lead time is a crucial factor in the pandemic penetration index. A country that gets more time to prepare is at an advantage.

Literacy Rate and Access to Media and Internet

Literacy rate and access to media and the internet play an essential role in containing the spread of viruses. Countries in which the majority of the population has access to media and the internet can quickly spread awareness of the virus, and literate people are in a better position to understand the gravity of the situation at a time of crisis and to take the precautionary measures needed to contain the spread of the virus [33].

Limited or insufficient health literacy was associated with reduced adoption of protective behaviours such as immunization, and an inadequate understanding of antibiotics, although the relationship was not consistent [33]. Significant gaps remain concerning infectious diseases with high clinical and societal impacts, such as tuberculosis and malaria.

Climatic Conditions

Understanding patterns of influenza spread is crucial for pandemic preparedness. The H1N1pdm09 virus caused the first influenza pandemic of the twenty-first century, which resulted in at least 18,500 deaths [32]. Based on laboratory-confirmed primary-care case reports, the cited study investigated the role of weather conditions and socio-demographic variables in the initial spread and subsequent presence of the virus in France. The findings suggest that low relative humidity and high population density were determinants in shaping the early spread of the virus at the national level. Those conditions also favoured the persistence of viral presence throughout the first 33 weeks of the H1N1pdm09 pandemic [32].

Additionally, this persistence was significantly favoured by low insolation. These results confirm the increasingly recognized role of humidity in influenza dynamics and underline the synergistic effect of insolation. In contrast, in the case of COVID-19, ongoing research suggests that climatic conditions have only a limited impact on the spread of the virus. Some reports also suggest that the survival rate of the virus decreases at high temperatures and high humidity levels created artificially in the laboratory, though others are sceptical about this [32]. Therefore, climatic factors should be taken into account when designing control and prevention measures, depending on the characteristics of the virus.

Cultural Aspects and Social Contact Index (SCI)

People tend to travel more during the festive season, as they get holidays. This factor therefore multiplies the amount of travel, which in turn increases the possibility of the virus spreading. In the case of the COVID-19 outbreak, the Chinese New Year (25 January 2020), one of the major Chinese festivals, fell during the early phase of the outbreak.

According to an analysis of travel during Diwali, one of the most important festivals in India, air-ticket bookings on MakeMyTrip (MMT) rose by 22% [34], and air-ticket reservations on Paytm rose by 50%. IRCTC also ran 58 pairs of special trains during the festival season of Holi [35].

These data make it clear that people tend to travel more during any festival season. Hence, we can conclude that this is an essential factor for the index, as it multiplies the other factors taken into consideration. If no cultural events take place during the pandemic, however, this factor has no effect.

The social contact index is the number of people a person meets on a typical day. The intensity of community transmission of a virus is directly proportional to the social contact index of a country. Community transmission is considered the third stage of a pandemic and is critical. We think that the social contact index can determine how strongly the virus spreads.

Virus Features

Viruses are the main cause of any pandemic. They are the smallest infectious agents, ranging from about 20 to 300 nm in diameter, and have only one kind of nucleic acid (RNA or DNA) as their genome. The nucleic acid is encased in a protein shell, which may be surrounded by a lipid-containing membrane. The spectrum of viruses is rich in diversity. Viruses differ mostly in structure, genome organization and expression, and strategies of replication and transmission. These varying properties bear directly on the impact a virus can have on humanity. The impact of the variations among viruses is reflected in features such as R Naught and the incubation period.

R Naught, also known as the Basic Reproduction Number, is a dynamic epidemiological factor that describes the contagiousness and transmissibility of the virus. For SARS-CoV-2, R Naught is estimated at around 2.4 [7]. As its value is dynamic, it can also rise to 3–4 [7]. Implementation of strict lockdowns is one way of reducing R Naught [7].

If a medicine or vaccine is already available for a disease, it becomes much easier to treat and control the disease. In the case of SARS-CoV-2, however, doctors have mostly relied on symptom-based treatment of the infected and have had to rely heavily on the human body to develop antibodies and kill the virus.

The incubation period is essential in the investigation and control of infectious diseases. It should be analysed in detail, as it informs quarantine policy-making and pandemic planning. According to calculations by the Centers for Disease Control and Prevention (CDC), the incubation period of SARS-CoV-2 is around 2–14 days [31]. According to their statistical analysis, 97% of the people who contract the disease show symptoms within 11.5 days [31]. Compared with other viruses such as adenovirus (5.6 days), human coronavirus (3.2 days) and SARS (2003) (4.0 days) [30], SARS-CoV-2 has a longer incubation period, and we therefore consider it more dangerous.

SARS-CoV-2 is deadly for elderly people, as their immunity is low, and for people with comorbid conditions such as diabetes, high blood pressure and asthma. Thus, if the average age of the population affected by the virus in a country is low, the mortality rate in that country will be considerably lower. In Germany, the mortality rate due to the SARS-CoV-2 pandemic is 4.607%, which is significantly lower than in Italy and France, with mortality rates of 14.8% and 15.5% respectively. The average age of infected people in Germany is 49, whereas in Italy and France it is 62 and 62.5 respectively [36]. Hence, we can conclude that the average age of infected people significantly determines the mortality rate.

Another virus-related aspect is the maturity level of the diagnostic tests. It is widely observed that the rate of false negatives in COVID-19 tests is quite substantial, which leads to further spread of the disease. On the other hand, an increased rate of false-positive results will curtail the spread of the disease.

Apart from the above-mentioned virus features, there are some lesser-known factors that are being studied by various researchers, such as the varying impact of the virus among different races and ethnic groups, continuous mutations of the virus and the resulting changes in its impact, blood group dependency and the impact of BCG vaccination.

Summary and Future Work

Identifying the main factors responsible for pandemic penetration is the starting point for measuring it. As detailed in the previous section, various aspects, especially socio-economic, cultural, environmental and virus-related factors, were explored. The impact of some of these factors was established using the present COVID-19 data, and it was found that the density of the population and the usage of public transport have a direct impact on the rise in SARS-CoV-2 cases. SARS-CoV-2 had a major impact on elderly people. It is also observed that countries with good HAQ indices have higher life expectancy; at the same time, because COVID-19 has a major impact on elderly people, their mortality rate due to the disease is high. The analysis also indicates that tourism between the affected country and other regions had a strong influence on the number of cases. In countries that saw the peak of SARS-CoV-2 cases much later, the mortality rate mostly remained low.

Further in-depth analysis of the factors, and of the ways and means of measuring the individual factors, is necessary. Devising a proper mathematical model using the entire data is a major step forward in the measurement of pandemic penetration. As discussed in the previous section, the model can best be devised as a function of static and dynamic factors. Obtaining an exhaustive list of relevant factors, measuring these individual factors properly and determining their relative importance are the main steps in this direction. The Pandemic Penetration Index (PPI), which will be the outcome of this mathematical model, can be employed by policy makers to devise the necessary plans to prevent a pandemic, and world health bodies such as the WHO may use these models in issuing country- or location-specific advisories.
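As an illustration of how static and dynamic factors might be combined once they have been measured and weighted, the sketch below uses a weighted average of normalised factor scores. This functional form is only an assumption made for illustration; the study does not define the PPI formula, and the function name, factor scores and weights shown are hypothetical placeholders.

```python
def weighted_index(factor_scores, weights):
    """Weighted average of factor scores, each normalised to [0, 1].

    Scores are assumed to be oriented so that a higher score means a
    higher penetration risk. Returns a value in [0, 1].
    """
    if set(factor_scores) != set(weights):
        raise ValueError("every factor needs exactly one weight")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    for name, score in factor_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must lie in [0, 1]")
    weighted_sum = sum(factor_scores[n] * weights[n] for n in factor_scores)
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Placeholder scores and weights for illustration only (NOT fitted values).
    static_scores = {"population_density": 0.8, "healthcare_gap": 0.4}
    dynamic_scores = {"short_lead_time": 0.5, "travel_with_hotspot": 0.9}
    weights = {
        "population_density": 2.0,
        "healthcare_gap": 1.0,
        "short_lead_time": 1.0,
        "travel_with_hotspot": 2.0,
    }
    ppi = weighted_index({**static_scores, **dynamic_scores}, weights)
    print(f"Illustrative PPI: {ppi:.3f}")
```

Keeping static and dynamic scores in separate dictionaries means the dynamic part can be recomputed as the pandemic evolves while the static part stays fixed. Determining real weights would require the statistical analysis described above.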