Abstract

In this work, we introduce an improved form of the basic SEIRD model, simulated in Python, for people who remain unaware of contemporary pandemics because of diverse social impediments, especially the economically underprivileged. Existing epidemiological models have yet to consider some unconventional factors, such as poverty, illiteracy, and carelessness toward health issues, that significantly influence data modeling. We address these factors by adding two more branches, uncovered and apathetic people, which matter significantly in practice. For the data simulation, we used a Python-based algorithm that trains the system on a set of real-time data with the proposed model and produces predicted data with a certain level of accuracy. Comparative discussion, statistical error analysis, and correlation-regression analysis are used to validate the proposed epidemiological model. As numerical evidence, the investigation presents both real-time and predicted data in figures and tables. Finally, we offer some concluding remarks based on our findings.

1. Introduction

To investigate the future of a system, we need a proper representation of that system. However, it is often difficult to infer how the system functions as a whole. Mathematical modeling provides a framework for expressing our ideas through equations, which assists in developing new hypotheses for future testing [1]. It is used in various settings, including disease mechanism analysis, biomedical systems, and government policymaking. William Ogilvy Kermack and Anderson Gray McKendrick introduced mathematical modeling into the field of epidemiology [2]. It aids greatly in quantifying potential control and mitigation techniques for infectious diseases. It describes the general behavior of an epidemic through epidemic curves, allowing predictions about the epidemic's duration and its magnitude in the population, as well as evaluation of the components that influence transmission dynamics and thus the number of cases [3].

Mathematical modeling plays an increasingly important role in providing quantitative insight into multiple fields [4]. It has contributed to a better understanding of the mechanisms of various chronic and infectious diseases [5]. For instance, mathematical modeling is currently used extensively to understand the mechanisms of the ongoing, highly infectious disease COVID-19. It has also aided the understanding of various critical phenomena, such as wound healing, morphogenesis, and blood-cell production. It has received increasing attention because modeling and simulation allow rapid, cost-effective testing and formulation of novel hypotheses [6]. The investigation of natural phenomena and the various effects of climate change is another major field of mathematical modeling [7]. Due to globalization and the diversity of living organisms, climate change has been a pivotal research topic over the years [8-10]. Mathematical modeling plays a pivotal role in earth-science-related aspects of various branches of the environmental sciences. It may improve responses to catastrophic incidents and the adversities of unplanned global biodiversity and reduce the threats of the severe pollution caused over the decades. Artificial intelligence is vital for treating unwanted climate changes with long-term natural effects [11]. Mathematical modeling is the fundamental base of modern computing science, which enables the incorporation of biological models with artificial intelligence [12]. Thus, mathematical modeling can be used to improve prognosis, management, and control strategies for diseases. Also, by properly utilizing mathematical modeling in the environmental sciences, especially in remedying climate change, proper time management and cost optimization can be ensured [13].

We aim for computer-based simulation on the Python platform, where Age of Information (AoI) plays a prominent role [14]. For quantitative approaches to data prediction, real-time data is the pivotal aspect. In forecasting or data prediction methods, data from past periods support logical assumptions about upcoming phenomena. Most data prediction tools use primary data, which must be acquired from trustworthy sources in an impartial manner [15]. Data-driven prediction models are mostly applied to forecasting pandemic situations. Data collection, classification, model generation, and validity testing are closely connected to AoI [16]. Across this sequence of activities, AoI assesses data freshness against the requirements of the proposed model, which also ensures the usefulness and successive updating of the information over the required period [17]. At present, optimizing data prediction accuracy is another prime concern of AoI [18]. Data feasibility, adjustment to the proposed models, and analysis of physical attributes rely greatly on AoI-based strategies [19]. The Internet of Things (IoT) acts as the key supporting mechanism for improving data collection, transmission, allocation, and implementation [20, 21]. Sensor-based artificial intelligence (AI) with remote access is a distinguishing ingredient of IoT that makes the arrangement of remote sensing data more feasible than in the past [22]. Machine learning (ML) technologies provide a wide range of facilities for data structuring, similarity analysis, and prediction, yielding maximum efficiency with minimum effort [23]. Presently, IoT is used massively in healthcare, primarily for decision-making. In the current spread of COVID-19, information on community transmission, deaths, recoveries, and medication is handled through various modes of IoT [24, 25].

Epidemiological modeling integrates mathematical modeling into a system that works on a given population, which can be divided into nonintersecting classes, such as susceptible (S), infective (I), and removed (R) [26, 27]. Such models track the infective classes of the population and specify the behavior of causal agents in the different compartments over time [28, 29]. The simplest compartmental model used for epidemiological modeling is SIR [30]. Since each disease is distinct, models must be adjusted depending on the epidemic's parameters and components. SIS, SIR, SIRD, SEIR, SIQR, and SEIRD are some of the extended versions of SIR that researchers use for epidemiological modeling [31-33]. Using epidemiological models, we can show how different public health interventions may affect the outcome of an epidemic [34, 35].

Epidemiological modeling is a valuable tool for estimating the future of a pandemic, and researchers continually try to optimize these models. Herbert Hethcote contributed substantially to the understanding of these models by presenting overviews of different compartmental models and their theoretical characteristics [36]. Jesus Fernandez-Villaverde and Charles I. Jones used the SIRD model for standard epidemiological modeling of COVID-19 [37]. However, their models used only death data from many countries around the world. They made certain additions, including inverting the SIRD model and introducing an additional recovery state for those who seem to be infectious for a couple of days but take longer to recover. Through simulation, they tried to predict the possible outcomes of COVID-19. Saulo B. Bastos and Daniel O. Cajueiro also used compartmental models such as SIR, SIRD, and SIRED to forecast the growth of COVID-19 in Brazil [38]. They proposed two variations of the SIR model and added a parameter to analyze the effect of social distancing. An optimal control method for models like SIR, SIRD, and SEIRD can be found in the work of Morton and Wickwire [39].

The main challenge in epidemiological modeling is finding accurate data on cases, so researchers have to make some assumptions for the sake of their work. Since many parameters are ignored this way, the accuracy fluctuates significantly. Fernandez-Villaverde and Jones faced similar difficulties [37]. As they focused only on death data, they came to a conclusion based on a relatively simple model. Moreover, since no massive testing campaign took place, the exact number of deaths is still uncertain. Saulo B. Bastos, in his paper, urged testing of the population because an asymptomatic individual can also be a carrier of the virus [38].

To fight any disease, we need to know the exact number of people at high risk and provide them with proper healthcare. However, this is hard to do in densely populated countries such as Bangladesh, India, and Indonesia. As a result, a great number of people remain uncovered (those who are susceptible to the coronavirus but are not tested). Several factors account for this, for instance, insufficient funding for running a massive campaign, hurried data collection, and a lack of human resources to reach every corner of the country. Similarly, due to social and monetary issues, a significant number of apathetic people (those who are unwilling to take a test or follow proper instructions after being exposed) tend to refrain from healthcare activities. Most apathetic people are illiterate and base their decisions on rumors, so they fear going through the testing procedure. Another main reason is that they cannot gather enough money in time to go to the hospital and receive treatment.

The SEIRD model, one of the epidemiological models that researchers and epidemiologists use to predict the future of a pandemic, comprises five branches, namely, susceptible (S), exposed (E), infected (I), recovered (R), and dead (D) [40]. The susceptible are those groups of people who are not infected but are at high risk of getting COVID-19. Those who are prone to the infection can be categorized as exposed. The infected category is for those who have tested positive for COVID-19. Finally, after these three states, people either recover or die. The basic SEIRD model is shown in Figure 1.
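As a rough illustration of how the five branches interact, the following minimal sketch integrates the textbook form of the SEIRD equations with forward Euler steps. It is not the formulation behind Figure 1: the parameter names (`beta`, `sigma`, `gamma`, `mu`), their values, and the initial population are assumptions chosen only for demonstration.

```python
def simulate_seird(s0, e0, i0, r0, d0, beta, sigma, gamma, mu, days, dt=0.1):
    """Integrate a textbook SEIRD model with forward Euler steps.

    Rates are per day. Returns one (S, E, I, R, D) tuple per whole day,
    starting with the initial state.
    """
    s, e, i, r, d = float(s0), float(e0), float(i0), float(r0), float(d0)
    total = s + e + i + r + d  # constant: individuals only move between classes
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / total * dt  # S -> E
            new_infected = sigma * e * dt            # E -> I
            new_recovered = gamma * i * dt           # I -> R
            new_dead = mu * i * dt                   # I -> D
            s -= new_exposed
            e += new_exposed - new_infected
            i += new_infected - new_recovered - new_dead
            r += new_recovered
            d += new_dead
        history.append((s, e, i, r, d))
    return history


# Illustrative values only; none of them come from the paper's data.
trajectory = simulate_seird(
    s0=9990, e0=5, i0=5, r0=0, d0=0,
    beta=0.5, sigma=0.2, gamma=0.1, mu=0.01, days=60,
)
final_s, final_e, final_i, final_r, final_d = trajectory[-1]
print(f"day 60: infected={final_i:.0f}, recovered={final_r:.0f}, dead={final_d:.0f}")
```

Forward Euler is used here only to keep the sketch dependency-free; a production simulation would more likely rely on an adaptive ODE solver.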

In our work, we have added two more branches to the main SEIRD model, namely, uncovered () and apathetic (), intending to estimate the future of COVID-19. The uncovered stage is a subcategory of susceptible, meaning that some people remain uncovered even though they are at high risk, while the apathetic are those in the exposed state who do not receive proper treatment. In this work, we have utilized simulation techniques based on real-time input data, incorporating Python programming with the mathematical induction approach. Our system is designed and trained to take the input data and provide predicted data with significant accuracy. We have justified the proposed model by comparing the predicted data with the real-time data graphically. Error analysis of the data prediction is provided through the relative error expressed as a percentage, where the graphical interpretation is combined with statistical analysis. Finally, a numerical assessment was carried out through correlation and regression analysis. We expect this work to facilitate exploring the spreading characteristics of COVID-19, along with its future growth. It may also help in forming government policies.

2. Materials and Methods

Presently, COVID-19 is one of the most severe epidemics in the world. From a statistical point of view, the SEIRD model is being studied systematically. Many researchers have proposed various models based on it, as mentioned before. Similarly, we investigated extensions of the basic SEIRD model in our previous work and introduced two extended models by adding the branches uncovered and apathetic, respectively [41]. The extended SEIRD models are shown in the subfigures of Figure 2.

The parameters and their corresponding symbols are listed in Table 1.

Later on, it became apparent that showing all of the components in a single model rather than in separate models would make the updated SEIRD model more acceptable and adaptable than before. Consequently, we propose an improved version of the SEIRD model in which we have merged the previous extensions.

2.1. Proposed Model

In the proposed model, we combine the basic components of the SEIRD model with those of the underprivileged classes of the population, namely, the uncovered and the apathetic. The improved version of the extended SEIRD model should be able to illustrate the real-time scenario of the pandemic situation in Third World countries. The proposed extension of the SEIRD model is depicted in Figure 3.

2.2. Mathematical Formulation

A mathematical model combines the components tied to the physical phenomena with logical assumptions so that it can comprehensively represent the attributes of the target population. The following system of governing equations formulates the proposed improved epidemiological model.

Variations in the different components of the epidemiological model depend on the rate parameters of the governing equations. Equations (1) to (12) express the successive increments in the components of the proposed model. In each equation, the target component is taken as a node, whereas incoming and outgoing components are taken as positive and negative flows, respectively, for that component. Rate parameters serve as the weights of each incoming and outgoing component. The linear combination of all flows gives the increment of the target component.
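The node-and-flow bookkeeping described above can be sketched as follows. Because Equations (1) to (12) are not reproduced here, the compartments, edges, and rates in this example are hypothetical and stand in for the model's actual structure; only the sign convention (incoming positive, outgoing negative, rates as weights) follows the text.

```python
def compartment_increments(state, flows):
    """Compute one-step increments from weighted flows between compartments.

    state: dict mapping compartment name -> current count
    flows: list of (source, target, rate) edges; each edge moves
           rate * state[source] individuals from source to target.
    """
    delta = {name: 0.0 for name in state}
    for source, target, rate in flows:
        moved = rate * state[source]
        delta[source] -= moved  # outgoing flow enters with a negative sign
        delta[target] += moved  # incoming flow enters with a positive sign
    return delta


# Hypothetical compartments and rates, chosen only for illustration.
state = {"exposed": 1000.0, "infected": 200.0, "recovered": 50.0, "dead": 5.0}
flows = [
    ("exposed", "infected", 0.10),
    ("infected", "recovered", 0.08),
    ("infected", "dead", 0.01),
]
delta = compartment_increments(state, flows)
next_state = {name: state[name] + delta[name] for name in state}
print(next_state)
```

Since every flow is subtracted from one node and added to another, the increments sum to zero and the total population is conserved.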

To find the rate parameters of the proposed model, we have to solve Equations (1) to (12). The simplified forms of those rate parameters are given below.

3. Results and Discussions

This section investigates the validity of the proposed model with practical evidence. We have applied the improved SEIRD model to analyze and forecast the COVID-19 situation in Bangladesh. We have observed the numbers of cases for the components susceptible, exposed, uncovered, isolated, infected, and apathetic, along with the occurrences of deaths and recoveries. In this investigation, we have used experimental data collected from trustworthy sources and configured them according to the requirements of the proposed model.

3.1. Data Collection Scheme and Setting Arrangement

A well-grounded data set is a fundamental prerequisite for validating mathematical models. The present work examines a very complex contemporary issue, namely, COVID-19. Thus, the selection of a real-time data set is a crucial task, and obtaining authentic data is always cumbersome. In the data collection process, we consulted several types of public data sources, for instance, healthcare-related national and international news portals and open-source data directories of government and nongovernment agencies. We also collected COVID-19-related information from social media sites. Moreover, because some data were not available from well-established institutions, a few data were collected locally from newspapers and news bulletins.

The assembled data set comprises columns for the total number of susceptible, exposed, uncovered, isolated, infected, and apathetic cases with the corresponding deaths and recoveries. Reliable sources of practical data include DGHS Bangladesh, Corona Info, IEDCR, Worldometer, and WHO. Real-time data for 46 days, from 16 June 2021 to 31 July 2021, were collected and used in this research to predict the data for the desired upcoming days.

3.2. Methodology of the Data Prediction

The desired predictions are estimated by the proposed model through numerical simulations, with the simulation code written in the Python programming language. Previous works have shown that SEIRD models can be trained on at most 7 days of data, and this was extended to 15 days in [41]. Here, we aim to train the proposed model on more than 15 days of data so that it can predict further into the future. We employed the real-time data for 21 days from 11 July 2021 to 31 July 2021 and predicted the future data for 30 days from 1 August 2021 to 30 August 2021. In the successive prediction process, the proposed model learns the data every 30 days, predicts the outcome of the next day, and does the same for the following days, which is a recurrence prediction strategy.
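The recurrence strategy, in which each predicted day is fed back into the training window before the next day is predicted, can be sketched as below. The paper's trained SEIRD system is not reproduced here; a least-squares straight line through the window is swapped in as a placeholder predictor, and the function names, the window length, and the synthetic series are all assumptions for illustration.

```python
def fit_line(values):
    """Least-squares slope and intercept for values indexed 0..n-1."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def recursive_forecast(history, window, horizon):
    """Predict `horizon` days ahead, one day at a time.

    Each predicted value is appended to the series, so later predictions
    are trained partly on earlier predictions.
    """
    series = list(history)
    for _ in range(horizon):
        recent = series[-window:]
        slope, intercept = fit_line(recent)  # placeholder for the real model
        series.append(slope * len(recent) + intercept)  # extrapolate one day
    return series[len(history):]


# Hypothetical cumulative counts for 21 training days.
observed = [1000 + 35 * day for day in range(21)]
forecast = recursive_forecast(observed, window=21, horizon=30)
print(forecast[:3])
```

Because predictions are reused as inputs, any error made early in the horizon propagates into later days, which is one general reason multi-step forecasts of this kind tend to drift as the horizon grows.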

3.3. Figurative Comparison of the Predicted Data with the Real-Time Data

To justify the validity and accuracy of the proposed model, we compare the real-time data and the predicted data graphically for the target components. Predictions are reported as the total number of cases for each component, and the number of daily cases can be found by taking the difference between two consecutive days.
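Recovering daily counts from cumulative totals by successive differences can be illustrated with a short sketch; the series below is made up for demonstration, and the function name is hypothetical.

```python
def daily_from_cumulative(cumulative):
    """Return day-over-day increments of a cumulative series."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


cumulative_cases = [1200, 1260, 1335, 1335, 1410]  # hypothetical totals
daily_cases = daily_from_cumulative(cumulative_cases)
print(daily_cases)  # [60, 75, 0, 75]
```

The result has one fewer entry than the input, since the first day has no preceding total to subtract.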

Figures 4, 5, and 6 demonstrate the comparison of real-time data and predicted data of the total number of uncovered, infected, and apathetic cases along with the corresponding deaths and recoveries, respectively.

Figures 7 and 8 compare the real-time and predicted data for the total number of exposed and isolated cases. The dates of the target period are shown on the x-axis, whereas the number of respective cases is shown on the y-axis. The y-axis is scaled separately for each plot so that the difference between real-time and predicted data can be clearly seen.

The subfigures of Figures 4, 5, and 6 show acceptable agreement between the real-time and predicted data for uncovered, infected, and apathetic cases along with the corresponding deaths and recoveries, respectively. The predictions were very close for about 20 days, after which deviations appeared in a few cases. This is not surprising, as the real-time data pattern was nonlinear, whereas in most cases the predicted data lost this nonlinearity.

From Figures 7 and 8, it is clear that both the real-time and predicted data for the exposed and isolated cases are nonlinear. Still, the trends are not the same and show noticeable fluctuations. This happened due to the limited trustworthiness of the collected data and people's unwillingness to follow public health regulations. Moreover, many of the exposed and isolated cases were not reliably recorded because of the initial infrastructural shortcomings of the healthcare institutions.

Thus, the graphical comparison suggests that the proposed model can be competently used to predict future epidemic data for up to 20 days for most of the components and almost 30 days for some of the components.

If we reduce the number of prediction days, the deviations between the real-time data and the predicted data for the target cases can be reduced. Also, with more reliable open-source real-time data, better predictions can be obtained from the proposed model.

3.4. Error Analysis

Since the graphs were scaled to make the variations between the real-time and predicted data distinguishable, the scales were not uniform across the components. So, to assess the variations properly, a descriptive analysis is essential. The error analysis evaluates the relative errors of the target components, expressed as percentages. Figure 9 displays the relative errors in percentage for all the components for 30 days from 1 August 2021 to 30 August 2021.
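A minimal sketch of a relative error in percentage is given below. The exact formula used for Figure 9 is not stated here, so the common definition |predicted - real| / |real| × 100 is assumed, and the sample values are hypothetical.

```python
def relative_error_percent(real, predicted):
    """Relative error of each prediction, in percent of the real value."""
    if len(real) != len(predicted):
        raise ValueError("series must have the same length")
    return [abs(p - r) / abs(r) * 100 for r, p in zip(real, predicted)]


real = [2000, 2100, 2250]       # hypothetical real-time totals
predicted = [1990, 2142, 2205]  # hypothetical predicted totals
errors = relative_error_percent(real, predicted)
print([round(e, 2) for e in errors])  # [0.5, 2.0, 2.0]
```

Note that this definition is undefined when a real value is zero, so components with zero counts on some days would need separate handling.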

From Figure 9, it is clear that for the first 10 days, no significant errors appear in any component except isolation. The errors for all components other than isolation remain acceptable for the next 10 days. For the final 10 days, the errors for , , and are comparatively significant, whereas the error for is remarkable. The unbounded error for isolation cases is due to improper healthcare management and the general public's disregard for basic healthcare regulations in practice, which the prediction process does not account for. To give the complete picture of the error analysis, Table 2 provides the statistical summary of the relative errors, including the minimum value, maximum value, average, and standard deviation (SD).
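The four summary statistics reported for the relative errors can be computed as in the sketch below. The error values are hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text does not say which was used for Table 2.

```python
import statistics


def summarize_errors(errors):
    """Return min, max, mean, and sample SD of a list of relative errors."""
    return {
        "min": min(errors),
        "max": max(errors),
        "mean": statistics.mean(errors),
        "sd": statistics.stdev(errors),  # sample SD; an assumption
    }


errors_percent = [0.4, 0.9, 1.3, 2.1, 3.8]  # hypothetical relative errors (%)
summary = summarize_errors(errors_percent)
print({name: round(value, 3) for name, value in summary.items()})
```

Replacing `statistics.stdev` with `statistics.pstdev` gives the population form if that is the intended convention.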

So, the graphical and tabular evidence indicates that the proposed epidemiological model can be suitably applied to almost all of the target components.

3.5. Correlation and Regression Analysis

The numbers of deaths and recoveries are the most prominent components of an epidemic, and most epidemiological models are designed primarily to predict them. The previous sections have already shown that the components deaths and recoveries can be properly predicted with the proposed model. In this section, correlation and regression analysis of the real-time and predicted data is carried out for the deaths and recoveries of the uncovered, infected, and apathetic cases, respectively.

Table 3 displays the correlation coefficients between deaths and recoveries for the real-time and predicted data of the several cases. The table shows that the deaths and recoveries of those cases are very strongly correlated, with correlation coefficients of almost 1. According to the numerical evidence, the predicted data show a slightly better correlation than the real-time data.
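A correlation coefficient between a deaths series and a recoveries series can be computed as in the following sketch. It assumes the Pearson coefficient, which the text does not name explicitly, and the two series are hypothetical rather than values from Table 3.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


deaths = [10, 12, 15, 19, 24]            # hypothetical cumulative deaths
recoveries = [200, 236, 300, 372, 470]   # hypothetical cumulative recoveries
r = pearson_r(deaths, recoveries)
print(f"correlation coefficient: {r:.4f}")
```

Two cumulative totals that both grow steadily over the same period will generally yield a coefficient near 1, so values of this size are to be expected for such series.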

Table 4 displays the regression coefficients of deaths and recoveries for the real-time and predicted data of the several cases. The table shows that, for both the real-time and predicted data, the number of deaths is dominant over the number of recoveries. In all cases, this dominance of deaths is slightly stronger for the predicted data than for the real-time data.

It is crucial to predict the number of deaths and recoveries according to the cases arising from uncovered, infected, and apathetic people. The regression lines for predicting future deaths and recoveries from the current numbers of uncovered, infected, and apathetic cases, for both real-time and predicted data, are given below.

So, by substituting the number of uncovered, infected, and apathetic cases into the above regression lines, the number of future deaths and recoveries corresponding to these cases can be suitably predicted.
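Fitting and using a regression line of this kind can be sketched as follows. The example assumes a simple least-squares line with one predictor, and the case and death counts are hypothetical; they are not the coefficients of Table 4 or the paper's regression lines.

```python
def fit_regression_line(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


infected = [1000, 2000, 3000, 4000]  # hypothetical cumulative infected cases
deaths = [26, 44, 66, 84]            # hypothetical cumulative deaths
slope, intercept = fit_regression_line(infected, deaths)

# Substitute a new case count into the fitted line to estimate deaths.
estimated_deaths = slope * 5000 + intercept
print(f"deaths = {slope:.4f} * infected + {intercept:.2f}")
print(f"estimated deaths at 5000 infected: {estimated_deaths:.1f}")
```

Estimates taken far outside the range of the fitted data should be treated with caution, since a straight line need not hold there.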

4. Conclusion

In this work, an improved SEIRD model, a combination of two extensions of the basic SEIRD model derived in previous work, was developed and applied to underprivileged people in the contemporary pandemic. The uncovered and apathetic cases and the associated deaths and recoveries are included in this improved version of the SEIRD model. As a data-driven validation of the proposed model, we explored the spread of COVID-19 from the Bangladesh perspective. With the improved SEIRD model, we obtained satisfactory outcomes in all components except the isolated cases, and the reasons behind the weak prediction of isolated cases were discussed. The graphical comparison and error analysis indicate that the improved SEIRD model can be practically applied to predict future data based on real-time data. From the statistical overview presented in this work, deaths and recoveries are very strongly correlated for all the real-time data cases, and the predicted data follow the same trend. According to the regression analysis, deaths dominate recoveries in the regression coefficients for both real-time and predicted data. The regression lines for predicting future deaths and recoveries according to the uncovered, infected, and apathetic cases were also formed. Both the real-time and predicted data show that, although the numbers of uncovered and apathetic cases are not negligible, they are comparatively small relative to the infected cases. Also, the number of recoveries appreciably exceeded the number of deaths on every occasion.

The number of uncovered cases increased due to the lack of skilled healthcare workers, misguided authorities, and inadequate infrastructure. Because the percentage of deaths is significantly lower than the percentage of recoveries, apathetic cases rose among uneducated and underprivileged people. Socio-economic conditions and a lack of epidemiological knowledge are also responsible for the growing number of apathetic cases.

Recently, a novel epidemic disease, monkeypox, has emerged and spread in many countries. Future research will attempt to apply the improved SEIRD model to monkeypox. A machine learning algorithm will be developed to support the improved SEIRD model, and the machine learning approach is expected to resolve the issue encountered with isolated cases.

Data Availability

Data can be found at https://github.com/mu2mahmud/Improved-epidemiological-model/.

Conflicts of Interest

The authors declare that they have no conflicts of interest.