How difficult a time the COVID-19 pandemic presents can be gauged by its mortality and/or recovery rates. Predictive modeling can help us forecast how large the impact will be, especially on human lives. Prediction is a well-known area of study that relies on machine-learning-based data analytics tools and techniques. In this study, we revisit mathematical models to measure the severity of the COVID-19 pandemic.

A pandemic is an epidemic that spreads rapidly across the world. The diseases that the World Health Organization (WHO) considers epidemic include Chikungunya, Cholera, Crimean-Congo hemorrhagic fever, Ebola virus disease, Hendra virus infection, Influenza, Lassa fever, Plague, COVID-19, and SARS. In 2014, the United States Centers for Disease Control and Prevention (CDC) announced a framework equivalent to the WHO’s pandemic stages, titled the pandemic intervals framework [1,2,3], in which two pre-pandemic and four pandemic intervals were reported. Investigation and recognition are the pre-pandemic intervals. Initiation, acceleration, deceleration, and preparation are the pandemic intervals. In a similar fashion, instead of using pandemic intervals, we take the recovery time period into account. The recovery time period can be two days, a week, two weeks, a month, six months, a year, or any finite number of days. For COVID-19, we consider an average recovery time period of 14 days [4].

In March 2020, Baud et al. [4] proposed taking a 14-day delay into account in order to compute the correct mortality rate during the COVID-19 pandemic. Their study suggested that the classical mortality rate underestimates the probable threat of COVID-19 in symptomatic cases. To measure the severity of the COVID-19 pandemic, we consider the recovery rate in addition to the mortality rate. Both mortality and recovery rates are useful metrics only when the recovery time period is considered; in practice, one cannot compute them meaningfully without it. Inspired by previous work [4], we revisit mathematical models for both mortality and recovery rates, where not-recovered cases and the recovery time period are discussed thoroughly with both synthetic and real datasets.

In what follows, for any disease, the mortality rate (MR) expresses the probability of death as a percentage, \( \mathrm{MR}=\frac{\mathrm{D}}{\mathrm{N}}\times 100, \) where D and N refer to the total number of deaths and of infected people, respectively. When computing the MR, we use the total number of people infected to date, which unnecessarily inflates the denominator and hence lowers the MR. Therefore, in practice, the classical equation for MR may deviate from the true value, since it does not take the recovery time period into account. This is our primary motivation for revisiting the classical mathematical models.

Since recovery time periods vary a lot from person to person (of course, demography dependent, as reported in WHO [5]), it is important to define an average recovery time period (Pavg) as follows:

$$ {\mathrm{P}}_{avg}=\left\{\begin{array}{cl}0,& \mathrm{if}\ \mathrm{RP}\approx 0,\infty \\ \mathrm{Arithmetic}\ \mathrm{Mean},& \mathrm{otherwise},\end{array}\right. $$

where, in practice, we exclude the recovery time period (RP) of cases that recover on the same day they test positive and of cases that have not recovered for months or years.
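As a minimal sketch of this definition, the function below averages only the recovery periods that fall inside a usable window. The cut-offs `min_days` and `max_days` are hypothetical stand-ins for "RP ≈ 0" and "RP ≈ ∞"; the text does not specify numeric thresholds, and the sample values are illustrative only.

```python
from statistics import mean


def average_recovery_period(recovery_days, min_days=1, max_days=180):
    """Arithmetic mean of usable recovery periods, or 0 if none are usable.

    min_days and max_days are assumed thresholds (not from the study) that
    drop same-day recoveries and cases unresolved for months or years.
    """
    usable = [d for d in recovery_days if min_days <= d <= max_days]
    return mean(usable) if usable else 0


# Hypothetical per-case recovery periods in days.
sample = [0, 12, 14, 16, 400]
p_avg = average_recovery_period(sample)  # averages 12, 14 and 16 only
print(p_avg)
```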

Because the mortality rate changes over time during the COVID-19 pandemic, we call the revisited quantity the Progressive Mortality Rate (PMR). In general, for any disease or pandemic C with an average recovery time period of Pavg, PMR can be computed as,

$$ \mathrm{PMR}=\frac{{\mathrm{D}}_{\mathrm{c}}}{{\mathrm{N}}_{{\mathrm{P}}_{avg}}}\times 100, $$

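where \( {\mathrm{D}}_{\mathrm{c}} \) denotes the total number of deaths due to disease C and \( {\mathrm{N}}_{{\mathrm{P}}_{avg}} \) refers to the total number of infections recorded up to one average recovery time period (Pavg days) before the current day.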

Similarly, the classical Recovery Rate (RR) follows the equation \( \mathrm{RR}=\frac{\mathrm{R}}{\mathrm{N}}\times 100 \), where R refers to the total number of recovered cases. As before, when the recovery time period is taken into account, the Progressive Recovery Rate (PRR) is,

$$ \mathrm{PRR}=\frac{{\mathrm{R}}_{\mathrm{c}}}{{\mathrm{N}}_{{\mathrm{P}}_{avg}}}\times 100, $$

where \( {\mathrm{R}}_{\mathrm{c}} \) refers to the total number of recovered cases to date.
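The classical and progressive formulas can be compared side by side with a short sketch. It assumes cumulative daily series indexed from 0 and reads the denominator of the progressive rate as the cumulative infections recorded Pavg days earlier; the function names and the sample numbers are hypothetical, and a short Pavg of 2 days is used only to keep the example readable.

```python
def classical_rate(count, infected):
    """Classical rate (MR or RR): cumulative count over cumulative infections, in percent."""
    return 100.0 * count / infected


def progressive_rate(cum_count, cum_infected, day, p_avg=14):
    """Progressive rate (PMR or PRR) on a 0-based day index.

    Divides the cumulative deaths (or recoveries) on `day` by the cumulative
    infections recorded p_avg days earlier. Returns None when that earlier
    day does not exist or had no infections yet.
    """
    if day < p_avg:
        return None
    lagged_infected = cum_infected[day - p_avg]
    if lagged_infected == 0:
        return None
    return 100.0 * cum_count[day] / lagged_infected


# Hypothetical cumulative series (not data from the study).
cum_infected = [10, 20, 40, 80]
cum_deaths = [0, 0, 1, 2]

mr = classical_rate(cum_deaths[3], cum_infected[3])            # 2 deaths / 80 infected
pmr = progressive_rate(cum_deaths, cum_infected, 3, p_avg=2)   # 2 deaths / 20 infected
print(mr, pmr)  # 2.5 10.0
```

In this toy series the progressive rate is four times the classical one, because the denominator ignores infections that are too recent to have resolved.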

As a result, we observe notable differences between classical and revisited (progressive) rates. With \( {\mathrm{N}}_{{\mathrm{P}}_{avg}} \), progressive rates vary considerably more than classical rates during the pandemic period. The primary reason is that, for any ith day, progressive rates are calculated from the data of the (i-14)th day together with the recovered and death cases, where for COVID-19 we consider Pavg = 14. Further, classical and progressive rates meet at the end, i.e., on the (j + 14)th day, where no new case tests positive after the jth day. This means that mortality rates can only be determined one average recovery time period after the day from which no new cases test positive.
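The meeting point on the (j + 14)th day can be checked with a small sketch. Since both rates share the same numerator, they coincide exactly when their denominators do. The daily case counts below are hypothetical, and days are indexed from 0.

```python
from itertools import accumulate

P_AVG = 14

# Hypothetical daily new positive cases: 5 per day for 10 days, then none.
new_cases = [5] * 10 + [0] * P_AVG
cum_infected = list(accumulate(new_cases))


def rate_denominators(day, p_avg=P_AVG):
    """Return (classical N, progressive N) for a 0-based day index >= p_avg."""
    return cum_infected[day], cum_infected[day - p_avg]


last_case_day = max(i for i, n in enumerate(new_cases) if n > 0)  # index j
meet_day = last_case_day + P_AVG                                  # index j + 14

print(rate_denominators(meet_day - 1))  # (50, 45): rates still differ
print(rate_denominators(meet_day))      # (50, 50): rates coincide
```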

In previous correspondences [4, 6,7,8], the 14-day delay has been well discussed. However, since we do not yet have the complete COVID-19 pandemic scenario, it is not trivial to obtain real estimates from their models, even though they look conceptually correct. To avoid possible confusion, we build a synthetic dataset that runs from the first day with COVID-19 positive cases to the end of the COVID-19 pandemic (see Table 2 in Appendix). Using our synthetic data, Fig. 1 shows how differences arise when we take the recovery time period into account. PMR and PRR are computed on the 14th day. Classical and progressive rates meet each other when no new case tests positive. As a result, these rates can be determined 14 days after the day from which no new case tests positive. Interestingly, the classical and progressive models hold exactly the same value at the end.

Fig. 1 Relation between classical and progressive rates using synthetic data, where one can find a complete scenario from day 1 to day 30, where no new case is tested positive after the 30th day

Mathematically speaking, PMR and PRR are disjoint sets. In other words, the set U is the union of PMR, PRR, and the percentage of cases that are Not Recovered (NR) within Pavg. As a result, the cardinality of U is 100, i.e., PMR + PRR + NR = 100. Since the recovery time period varies from person to person and depends on the severity of infection [16], NR can be positive, negative, or zero:

$$ \mathrm{NR}=\left\{\begin{array}{ll}\le 0,& \mathrm{if}\ \mathrm{RP}\le {\mathrm{P}}_{avg}\\ >0,& \mathrm{otherwise}.\end{array}\right. $$

As before, PMR, PRR, and NR are disjoint subsets. As mentioned earlier, since the recovery time period can be less than, equal to, or more than Pavg, the Progressive Total (PT) can be computed as

$$ \mathrm{PT}=\left\{\begin{array}{ll}\le 100,& \mathrm{if}\ \mathrm{NR}\ge 0\\ >100,& \mathrm{otherwise},\end{array}\right. $$

where PT changes with NR. On the whole, PT = PMR + PRR.
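The link between PT and NR follows from the statement that PMR, PRR, and NR together make up 100. The worked step below is a sketch of that reasoning; the numeric value is the PT reported for India in this study.

$$ \mathrm{PMR}+\mathrm{PRR}+\mathrm{NR}=100\quad \Rightarrow \quad \mathrm{NR}=100-\left(\mathrm{PMR}+\mathrm{PRR}\right)=100-\mathrm{PT}. $$

For example, \( \mathrm{PT}=109.29 \) gives \( \mathrm{NR}=100-109.29=-9.29<0 \), which corresponds to the case \( \mathrm{RP}\le {\mathrm{P}}_{avg} \).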

The selection of Pavg is crucial because it affects the progressive total, which is not the case when we compute the classical MR and RR. In Fig. 2, using synthetic data (as in Fig. 1), we have a complete picture of PMR and PRR from day 1 to day 44, where the 30th day is the last day on which a new COVID-19 positive case is recorded. Specifically, a) from the 14th to the 18th day and from the 38th to the 42nd day, cases recover within Pavg and therefore NR is negative; b) from the 19th to the 37th day, cases require more than 14 days to recover and therefore NR is positive; and c) from the 43rd to the 45th day, cases recover in exactly 14 days and therefore NR is zero. In a nutshell, NR ≤ 0 is expected when the recovery rate is higher. In other words, NR depends on PMR and PRR. Accordingly, PT reaches 100% when NR is zero, and PT exceeds 100% if NR < 0. This means that when PT exceeds 100%, infected cases recover in less than Pavg.

Fig. 2 Different values of Not Recovered (NR) cases in accordance with RP and Pavg (data source: Table 2, appendix) in both PMR and PRR

For the exact same scenario (Pavg = 14 days), the real-world data (see Table 1) show clear differences between classical and progressive mortality/recovery rates. In Table 1, countries are ranked in increasing order of mortality rate. Let us summarize the observations to show how PMR, PRR, and PT rely on NR, where Pavg is crucial. Two cases can be distinguished:

  a) Pavg ≤ 14: For Canada and Brazil, since PT is close to 100% (100.26% and 100.458%, respectively), the average recovery time period is 14 days, i.e., Pavg ≈ 14. In other words, infected cases take 14 days on average to recover. For India, Mexico, and Japan, PT is 109.29%, 108.39%, and 103.9%, respectively, which is more than 100%, so the average recovery time period is less than 14 days, i.e., Pavg < 14.

  b) Pavg > 14: For the remaining countries (from Table 1), since PT is less than 100%, infected cases are taking more than 14 days to recover.
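The two cases above can be expressed as a small decision rule on PT. This is only a sketch: the function name and the `tolerance` of one percentage point for "close to 100%" are assumptions, not values given in the study. The PT figures are the ones quoted above.

```python
def classify_recovery_period(pt, tolerance=1.0):
    """Compare the observed recovery period with P_avg using PT alone.

    NR is taken as 100 - PT. `tolerance` (in percentage points) is an
    assumed band for treating PT as close to 100%.
    """
    nr = 100.0 - pt
    if abs(nr) <= tolerance:
        return "approximately P_avg"
    return "shorter than P_avg" if nr < 0 else "longer than P_avg"


# PT values quoted in the text for Pavg = 14 days.
reported_pt = {
    "Canada": 100.26,
    "Brazil": 100.458,
    "India": 109.29,
    "Mexico": 108.39,
    "Japan": 103.9,
}

for country, pt in reported_pt.items():
    print(country, classify_recovery_period(pt))
```

With a tolerance of one percentage point, Canada and Brazil fall in the "approximately P_avg" band and the other three countries in the "shorter than P_avg" band, which matches the grouping in case a). A different tolerance would move the boundary.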

Table 1 Computing MR, RR, PMR, PRR, and PT during the COVID-19 Pandemic

On the whole, by considering the recovery time period, this study not only provides mathematical models for progressive mortality and recovery rates but also analyzes the importance of the recovery time period during the pandemic. We have supported our analyses with both synthetic and real-world data.

At this point, it is important to note that even though recovery time periods vary from one disease or pandemic to another, the exact same mathematical models can be used to compute mortality and recovery rates.