1 Introduction

Over the last century, pathogens of different types have killed more people than any armed conflict (Adda 2016). COVID-19 is a disease caused by SARS-CoV-2, and it can infect people via direct and indirect human-to-human contact. The differences we observe in infections and fatalities are due to built-in characteristics of society that generate different exposure levels and different responses to the virus. Not all regions are equally responsive to the diffusion and impact of the virus, since regions differ in their social, demographic, behavioral, and economic characteristics and in the related responses. Therefore, although the attack of COVID-19 on human society was exogenous, the response to the virus was completely endogenous to society’s established characteristics.

COVID-19 appears to affect people disproportionately across socioeconomic groups, with impacts varying by ethnicity, income, age, gender, and social group or community. This is reflected in the relative concentration of these communities in different regions. However, it is not possible to simply and reliably put together the threads of evidence that clarify these unequal COVID-19 impacts. The evidence linking regional differences to differences in impact needs to be better understood. What are the aggregate regional characteristics that are fundamental to these disparities in infections and fatalities? This paper explores these disproportionate impacts of COVID-19.

The poverty rate is an economic variable, and we evaluate how it influences the rate of transmission and the consequences of a viral disease, here COVID-19. Earlier literature has investigated several economic variables and their role in health conditions related to economic activity in the context of infectious diseases (Adda 2016). These variables include income, GDP per capita, and unemployment rates (Adda et al. 2009; Ettner 1996; Ruhm 2000, 2003, 2005). Others have also looked at the impact of trade (imports and exports) and international exposure in the context of viruses and bacteria as they relate to human infections, including smallpox, whooping cough, scarlet fever, mumps, measles, polio, tuberculosis, and HIV, among many others (Oster 2005).

In this paper, we test two main hypotheses fundamental to understanding the relationship between poverty and COVID-19. In theory, it is not clear whether poverty should worsen the transmission of infectious diseases or help contain them. The question is even more complex and obscure in the developed-world context. If poverty-prone areas have limited economic activity, and that in turn limits human interaction, poverty may help to control infectious diseases like COVID-19. On the other hand, if poverty is associated with more physical interpersonal contact, limited health care capacity, and certain interaction behaviors, these channels may dominate the effect of limited economic activity, and we may observe higher infections in poverty-prone areas. For example, financially poorer people are often essential workers whose jobs demand physical presence and who are not permitted to work from home. The increase in physical interpersonal contact and the decline in compliance with social distancing in poverty-prone areas would exacerbate the spread of COVID-19 (Kim and Kwan 2021; Liu et al. 2020). Furthermore, economically disadvantaged neighborhoods may be characterized by housing in poor condition and less safe outdoor spaces. Overcrowding in poverty-prone areas puts people at higher risk of spreading the contagious disease (Rollston and Galea 2020). Moreover, poverty and public shame may shape how people experience being observed and judged by others during the pandemic. Strict disciplinary measures and shaming in any form (rather than a harm-reduction strategy) can cause stigmatization and eventually undermine public health efforts by discouraging people, particularly poor people, from getting tested, disclosing their health status to contact tracers, or cooperating with other mitigation programs (Collins 2020; Gold 2020).
Saxe (2020) showed how stigmatization and criminalization deterred testing and exacerbated problems of abusive policing of members of marginalized communities during the HIV epidemic. Similarly, a higher poverty rate can be responsible for higher death and fatality rates (the fatality rate is the ratio of COVID-19 deaths to infections, expressed as a percentage) because poor regions may have limited capacity to provide medical access and treatment to infected COVID-19 patients. Additionally, economically disadvantaged people may not be able to afford health insurance and may hesitate to get a COVID-19 test, which in turn limits their access to health resources and treatment in poverty-prone areas (Cordes and Castro 2020).

The majority of the literature covering the characteristics and consequences of the pandemic (Footnote 1) can be grouped as follows: (a) the measurement of the spread of COVID-19 and the role of mitigation such as masks, crowding limitation, and social distancing, (b) the degrees of disease transmission, plus the effectiveness of and compliance with social distancing, (c) the economic impacts of COVID-19 such as the impact on employment, (d) the socioeconomic consequences of extreme measures such as shelter-in-place or lockdowns, and (e) the governmental response to the pandemic (Brodeur et al. 2020).

Some literature attempts to relate COVID-19 to key socioeconomic variables as causes of the difference in impacts. Most of these studies have yet to be peer-reviewed (Chin et al. 2020; Li et al. 2020). Other studies focus on spatial analysis (Sun et al. 2020), ethnic disparities (Li et al. 2020), development of online dashboards for tracking COVID-19 (Wissel et al. 2020), use of vulnerability indices (Mukherji 2020), and issues in equitable COVID-19 response (Chin et al. 2020). The major concern with these studies is that they do not adequately address confounding factors. For example, counties in each state were subject to various COVID-19 mitigation policies. Therefore, simple county-level cross-section analysis often suffers from selection bias because it does not control for state mitigation policies. This is the case for most unpublished pre-print articles. We address these problems using multi-level data with state fixed effects, which control for all COVID-19 policies and other factors that vary across states. We observe a significant difference in results with and without state fixed effect controls. We also estimate a multi-level mixed model as a robustness check, which is an appropriate strategy given our research design and data set. In this paper, we hypothesize that poverty worsens the COVID-19 impact, and we test this hypothesis using US county-level data. However, this expectation is neither obvious nor generally agreed upon.

The next section states the hypotheses about the relationship between COVID-19 and poverty. Section 3 reports methods, data, and descriptive statistics. Section 4 presents the results for cumulative COVID-19 data until the end of July 2020. Section 5 presents robustness checks, first by disaggregating the results by monthly cumulative cases and deaths and then by estimating a multi-level mixed model. Section 6 concludes the paper.

2 Poverty and COVID-19

Earlier studies on similar infectious diseases demonstrated that poverty, poor health, and poor sanitary conditions make such health crises worse. This is reflected in studies of developing countries, including most recently Campos et al. (2018) for the Zika virus and Redding et al. (2019) for the Ebola virus. The Infectious Disease Vulnerability Index for countries in Africa developed by Moore et al. (2017) using Ebola to inform actions for preparedness and response to infectious disease outbreaks was dependent on national health systems worldwide. Such studies are rare for developed economies but remain relevant there, and they are particularly important when it comes to COVID-19 in the USA. An exception is Adda (2016), who offers an extensive analysis of the transmission of three viruses (influenza, gastroenteritis, and chickenpox) using data across a century of national health systems in France; otherwise, such national data systems have not been used as widely as they should have been. One of the questions that Adda (2016) asked was whether viruses spread more rapidly during periods of economic growth and whether the spread followed a “gradient determined by economic factors.” He found that the viruses studied propagated faster during economic booms because of increased economic activity and contact between people. Qiu et al. (2020) conducted a similar analysis for Wuhan, China, and also found a positive relationship between the spread of the virus and economic activity. These studies imply that economic expansion, not poverty, helps spread disease. Hence, a decrease in poverty rather than an increase may be related to growth in the disease.

When it comes to relating infectious diseases such as COVID-19 to poverty, the relationship is not straightforward. Whether poverty will precipitate the spread of the virus or will limit its spread may depend more on the micro-level characteristics that result from poverty. For example, if we assume that poorer regions have limited economic activities, they should have lower diffusion rates of infectious disease and lower aggregated deaths. Adda (2016) noted that higher economic activities cause the spread of infectious diseases while poverty-prone areas may have limited economic activities that contain the virus because of less mobility and traveling, which in turn reduces interpersonal contacts and reduces the diffusion (spread) of the diseases.

On the other hand, alternative evidence shows that economic downturns precipitate the transmission of infectious diseases by limiting the capacity to control the disease (Suhrcke et al. 2011). If poverty-prone areas have more work that requires physical presence and person-to-person contact, this can increase the spread of the virus. Moreover, poverty makes people reluctant to take sick leave from work for fear of unemployment, which increases the risk of disease transmission at the workplace (Barmby and Larguem 2009). During the 1990s, countries of the former Soviet Union (FSU) and Eastern Europe experienced a devastating economic crisis, as GDP fell by one-third on average, which markedly increased the incidence, prevalence, and mortality of tuberculosis and worsened treatment outcomes. This, in turn, led to the emergence of drug-resistant strains (Shilova and Dye 2001). Similar experiences have also been recorded for HIV prevalence, outbreaks of diphtheria (Markina et al. 2000), tick-borne encephalitis (Randolph 2008), and leptospirosis (Stoilova and Popivanova 1999).

Related literature suggests that infectious diseases disproportionately affect vulnerable groups. In a review of the European literature, this effect could be found in every single EU Member State (Semenza and Giesecke 2008). A separate study comparing wealth distribution and tuberculosis (TB) rates across the EU Member States demonstrated a strong correlation between income equality and lower TB rates (Suk et al. 2009).

Thus, it is unclear which channel dominates the overall relationship between higher poverty and the spread of the virus. If the limited economic activity that results from poverty dominates, we will see a lower spread of COVID-19 in poverty-concentrated areas. However, if the latter channel dominates, in which poverty drives limited health facility capacity, hypermobility, interactive economic behavior, and job characteristics, then we should see a higher level of infections in poverty-concentrated areas. We test this hypothesis (H1) in the US context at the county level, and we expect to see higher infections in poorer counties because we believe the latter scenario dominates in the USA. This is especially plausible given that poverty is widely viewed as related to hypermobility (Schafft 2006).

H1

Higher poverty concentrated regions will have higher COVID-19 infections.

Regarding deaths, the association is even more predictable. Poverty-prone areas have limited health sector capacity and tighter budget constraints. A pandemic like COVID-19 can easily disrupt limited health and treatment systems, and infected people may be more likely to die in regions with higher poverty than in regions with less poverty. Moreover, poverty is expected to be associated with limited or no health insurance coverage, and if people cannot afford to cover the cost of treatment (out-of-pocket payment), the consequences can be dire (Footnote 2). Therefore, people will seek less treatment and have less access to health services, leading to higher deaths in poverty-concentrated areas. We test the second hypothesis (H2) to understand the relationship between poverty and COVID-19-related deaths and fatalities.

H2

Higher poverty concentrated regions will have higher COVID-19 deaths and fatalities.

3 Methods and data

3.1 Methods

The unexpected incidence of the pandemic created an opportunity for a quasi-natural experiment over time. We use the pre-pandemic poverty rate to examine the relationship between poverty and COVID-19, which rules out the possibility of reverse causality and selection bias. We estimate Eq. (1) to identify the consequence of poverty for the COVID-19 pandemic.

$$ Y_{{{\text{cs}}}} = \beta_{0} + \beta_{1} {\text{Poverty}}_{{{\text{cs}}}} + \beta_{2} {\text{X}}_{{{\text{cs}}}} + \beta_{3} \mu_{s} + \varepsilon_{{{\text{cs}}}} $$
(1)

where \({Y}_{\mathrm{cs}}\) is the cumulative COVID-19 cases, deaths, or fatality rate in county c in state s, and \({\mathrm{Poverty}}_{\mathrm{cs}}\) is the poverty rate of county c in state s. \({X}_{\mathrm{cs}}\) is a vector of county-level covariates and \({\varepsilon }_{\mathrm{cs}}\) is the error term. \({\mu }_{s}\) is the state fixed effect, which captures the observable and unobservable factors that are specific to a state. Since the pandemic started, many states have initiated various COVID-19 mitigation policies, such as stay-at-home orders, mask mandates, and closures of businesses and schools, that are expected to play an important role in both infection and death rates. Moreover, besides direct mitigation policies, other spatial factors differ across states, including summer school holidays, regional time zones that characterize regions differently, and variation in public and private regional transport systems. These are all potential confounders affecting the spread of infectious disease (Adda 2016; Siddique 2021). The state fixed effect should control for all of these variations that affect COVID-19 diffusion and are not captured elsewhere. In this fixed effect approach, we use only the heterogeneity within states and set aside the time-invariant heterogeneity across states, but we consider heterogeneity both within and across counties. Our primary interest is \({\beta }_{1}\), the coefficient on the poverty rate, which measures its average effect on COVID-19 outcomes.
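To make the role of the state fixed effect in Eq. (1) concrete, the following is a minimal sketch of the within-state (demeaning) estimator, which is numerically equivalent to OLS with one dummy per state. It is not the estimation code used for the paper: the function name `within_state_ols`, the synthetic data, and the "true" coefficients of 200 and -50 are all assumptions chosen for illustration.

```python
import numpy as np

def within_state_ols(y, X, state_ids):
    """OLS with state fixed effects via the within transformation.

    Subtracting state means from y and from every column of X removes
    anything that is constant within a state (the mu_s term in Eq. (1)).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    # Map arbitrary state labels to integer codes 0..S-1
    _, codes = np.unique(np.asarray(state_ids), return_inverse=True)
    counts = np.bincount(codes)
    y_mean = np.bincount(codes, weights=y) / counts
    X_mean = np.column_stack(
        [np.bincount(codes, weights=X[:, j]) / counts for j in range(X.shape[1])]
    )
    y_dm = y - y_mean[codes]
    X_dm = X - X_mean[codes]
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta

# Synthetic illustration (hypothetical numbers, not the paper's data)
rng = np.random.default_rng(0)
n_states, per_state = 20, 50
state = np.repeat(np.arange(n_states), per_state)
state_effect = rng.normal(0, 500, n_states)[state]
# Poverty is correlated with the state effect, so pooled OLS would be biased
poverty = 15 + 0.01 * state_effect + rng.normal(0, 5, state.size)
control = rng.normal(0, 1, state.size)
y = 200.0 * poverty - 50.0 * control + state_effect + rng.normal(0, 100, state.size)

beta = within_state_ols(y, np.column_stack([poverty, control]), state)
print(beta)  # first entry estimates the poverty coefficient (beta_1)
```

Because the state means are removed, only within-state variation in poverty identifies the first coefficient, which is the sense in which the text says cross-state heterogeneity is set aside.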

3.2 Data

We conducted this analysis using county-level data. Our COVID-19 data are the cumulative confirmed cases and deaths per million population. We also estimated the impact of poverty on the COVID-19 fatality rate. The COVID-19 data cover the period from late January to July 28, 2020 and are collected from www.usafacts.org, which provides a data set identical to The Johns Hopkins University COVID-19 dashboard data set. We acknowledge that the COVID-19 measures may suffer from undercounting due to limited testing, particularly early in the pandemic (Footnote 3). However, we believe this is not a serious problem for testing our models, for two reasons. First, undercounting is likely to be more prevalent in poorer counties, which have limited capacity and where poorer people face stigmatization. Since we expect a positive association between COVID-19 outcomes and poverty, any such undercounting will bias the estimated coefficient on poverty downward relative to its true size, not upward. Second, we believe undercounting was more likely early in the pandemic than later, so we disaggregate the cumulative data by month and check whether our results are robust to potential errors in COVID-19 case and death counts.
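The first argument above (undercounting that is worse in poorer counties pushes the estimated poverty coefficient toward zero) can be illustrated with a small simulation. This is only a sketch under assumed numbers: the linear "true" relationship, the reporting share that falls from 100% to 70% as poverty rises, and the helper `ols_slope` are all hypothetical and are not taken from the paper's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3000  # hypothetical number of counties

poverty = rng.uniform(5, 35, n)  # poverty rate in percent
true_cases = 5000 + 400 * poverty + rng.normal(0, 1000, n)

# Assumption: the share of cases that gets reported falls linearly
# from 1.0 (poverty = 5%) to 0.7 (poverty = 35%).
share_reported = 1.0 - 0.3 * (poverty - 5) / 30
reported_cases = share_reported * true_cases

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))

slope_true = ols_slope(poverty, true_cases)
slope_reported = ols_slope(poverty, reported_cases)
print(slope_true, slope_reported)
```

Under these assumptions the slope estimated from reported cases is smaller than the slope estimated from true cases while remaining positive, which matches the direction of bias described in the text.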

Data for all independent variables are for the year 2019. We do this to avoid reverse causality, i.e., the impact of COVID-19 on independent variables such as the poverty rate (Siddique 2016). For example, Lima et al. (2020) recently reported that Brazil, which has suffered one of the world’s worst pandemic tolls, responded to the crisis by distributing so much cash directly to citizens that poverty and inequality approached national historic lows. Therefore, we applied a one-year lag to all independent variables so that there is no reverse impact on the right-hand-side variables in Eq. (1). It should also be noted that, at this writing, 2020 data are not yet available for most variables. Poverty data are collected from the U.S. Census Bureau’s Small Area Income and Poverty Estimates (SAIPE) for the year 2018. These are the latest data available at the county level. Income inequality data are also collected from the SAIPE for the year 2018.

Most other data are collected from the County Health Rankings & Roadmaps (CHR&R) program, a collaboration between the Robert Wood Johnson Foundation and the University of Wisconsin Population Health Institute. To avoid multicollinearity and an excessive number of control variables in the models, we created three composite health indexes: health behavior, clinical care, and the physical environment. We then calculated z-scores to rank the counties. The health behavior ranking is composed of tobacco use, diet and exercise, alcohol and drug use, and sexual activity. Clinical care is composed of access to care and quality of care. The ranking of the physical environment includes air and water quality and housing and transit. To rank the counties, we followed guidance from the CHR&R (Footnote 4). Most of these data have been compiled from the American Community Survey (ACS), USDA Food Environment Atlas, National Center for Health Statistics, Behavioral Risk Factor Surveillance System, Small Area Health Insurance Estimates, Centers for Medicare and Medicaid Services (CMS), National Provider Identification, and the Environmental Justice Screening and Mapping Tool (EJSCREEN).
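The composite indexes described above can be sketched as a weighted average of standardized measures. The snippet below is an illustration only: the function name `composite_z_index`, the two example measures, and the equal weights are assumptions, and the actual CHR&R weights and standardization details should be taken from the program's own guidance. Measures are assumed to be oriented so that a higher value is worse, which makes a lower composite score a better rank.

```python
import numpy as np

def composite_z_index(measures, weights):
    """Weighted average of column-wise z-scores.

    measures: array of shape (counties, measures), higher = worse
    weights:  one non-negative weight per measure (hypothetical values)
    """
    measures = np.asarray(measures, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Standardize each measure across counties (population std, ddof=0)
    z = (measures - measures.mean(axis=0)) / measures.std(axis=0)
    # Normalize weights so they sum to one before combining
    return z @ (weights / weights.sum())

# Three hypothetical counties, two hypothetical measures
# (e.g. % adult smokers and % physically inactive)
example = np.array([
    [20.0, 30.0],
    [15.0, 25.0],
    [25.0, 35.0],
])
scores = composite_z_index(example, weights=[0.5, 0.5])
best_county = int(np.argmin(scores))  # lowest score = best rank
print(scores, best_county)
```

Collapsing several correlated measures into one score per domain is what keeps the number of regressors small and limits multicollinearity among the health controls.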

We also controlled for many economic and demographic variables that can potentially impact both the poverty rate and the COVID-19 pandemic. While economic variables are collected from ACS, the Bureau of Labor Statistics, and SAIPE, the demographic variables are from the Census Bureau’s Population Estimates.

3.3 Descriptive statistics

Table 1 presents the descriptive statistics of all variables used in the models. The mean infections and deaths per million population are 9077.99 and 224.43, respectively, while the mean fatality rate is 2.22. The mean poverty rate is 15.13. Figures 1, 2, and 3 present maps of the poverty rate together with COVID-19 infections, deaths, and fatality rates. In the maps, the polygon layer shows the poverty rate. Color codes range from dark blue to yellow, with yellow representing the highest poverty rate category. The 3D cones show infections per million population, deaths per million population, and the fatality rate in the three figures, respectively. The maps show that taller cones are concentrated in the higher-poverty counties. A simple correlation test also shows that the correlation between the poverty rate and infections per million population is 0.31, and that between the poverty rate and deaths per million population is 0.20. The next section, following the figures and descriptive statistics, provides the output from the multivariate regression with state fixed effects and tests our hypotheses.

Table 1 Descriptive statistics

Fig. 1 Poverty rate and COVID-19 infections (poverty in color and infections in cone height). 3D cones represent infection data (the taller the cone, the higher the infections) and color codes represent the poverty rate at the county level

Fig. 2 Poverty rate and COVID-19 deaths. 3D cones represent deaths (the taller the cone, the higher the deaths) and color codes represent the poverty rate at the county level

Fig. 3 Poverty rate and COVID-19 fatalities. 3D cones represent the fatality rate (the taller the cone, the higher the fatality rate) and color codes represent the poverty rate at the county level

4 Results analysis

Table 2 presents the results from ordinary least squares (OLS) estimates with state fixed effects using cumulative COVID-19 data from January 22 to the end of July 2020. Panels A to C in Fig. 4 present the marginal effect of poverty on COVID-19 infections, deaths, and fatalities, respectively. The results confirm our hypothesis that poverty worsened the COVID-19 impact in terms of both infections and deaths. This suggests that, in the USA, poverty-related elements linked to health facility access and capacity, hypermobility, and direct, face-to-face, interactive jobs dominate the effect of limited economic activity in shaping the COVID-19 pandemic. We also estimate the impact of poverty on the fatality rate and find that the higher the county poverty level, the greater the overall COVID-19 fatality rate. This also confirms our other hypothesis that poverty-driven limits on treating COVID-19 patients are associated with higher deaths in the USA. Note that we did not test the direct effect of limited health facility access and capacity, hypermobility, and job characteristics (interactive behavior) on infections and deaths, and we do not claim these channels as a causal outcome. However, this may be a fair interpretation of the positive association between poverty and COVID-19 infections and deaths given the possible alternative scenarios.

Table 2 Regression results on COVID-19 confirmed cases, deaths, and fatality rate until the end of July

Fig. 4 Marginal effect of poverty rate on COVID-19. Panels A to C present the marginal effect of poverty on COVID-19 infections, deaths, and fatalities based on models 2.4–2.6, respectively. Panels D to F present the marginal effect of poverty on COVID-19 infections, deaths, and fatalities by share of Black population based on models 2.4–2.6, respectively. All estimates have been adjusted for all covariates included in Table 2. They all show a positive impact of poverty on COVID-19, and the impact is worse for counties with a higher share of Black population. Figure 4 shows the impact of poverty only by Black population; however, the same holds for other minorities such as the Asian and Hispanic populations

Models 2.1 to 2.3 include some basic controls such as income inequality, the ranking of health behavior, clinical care, and physical environment, percentage of the rural population, the log of median household income, gender, population percentage over sixty-five, and education. Models 2.4 to 2.6 add more controls for ethnicity, unemployment, and the local population’s commuting habits (Footnote 5).

Model 2.1 reports the regression results for confirmed COVID-19 cases. It shows that a 1% increase in the poverty rate is associated with an increase of 403.75 confirmed COVID-19 cases per million population for a US county. This is a considerable number linked to the poverty rate, and the impact is statistically significant at the 1% level. Poverty is also associated with a higher number of deaths per million population. Model 2.2 reports that a 1% increase in the poverty rate is related to 17.81 more deaths per million population for a county. We also estimate the impact of poverty on the fatality rate ((deaths per million/cases per million) * 100) and find a statistically significant impact. Results in model 2.3 show that a 1% increase in poverty is associated with an increase in the fatality rate of 0.07, which is again statistically significant. This provides another piece of evidence suggesting that poverty is a key link to the consequences of this infectious disease.

After controlling for ethnicity, unemployment rate, and commuting behavior, the impact of poverty remains statistically significant, although its magnitude drops by almost 50% for infections and deaths and by a smaller margin for the fatality rate. This is understandable since it is well documented that the poverty rate is higher among minority groups (Massey, Gross, and Shibuya 1994). It is particularly high for the Black population. Panels D to F in Fig. 4 present the marginal effect of poverty on COVID-19 infections, deaths, and fatalities by share of Black population. The Black population is more likely to be infected, and then to die from the virus, relative to the population at large. Overall fatality rates are also high among the Black population. The coefficients on the Black population in models 2.4–2.6 indicate that a 1% increase in the Black population is associated with an additional 201 infections and 8.96 deaths per million and a 0.02 increase in the fatality rate.

The Hispanic population is more likely to be infected and to die, although the association with the fatality rate is not statistically significant. A 1% increase in the Hispanic population is associated with 227 more infections and 3.2 more deaths per million, as shown in models 2.4 and 2.5. The insignificant fatality rate coefficient suggests that the Hispanic population was better able to get life-saving medical treatment once infected. Another possible reason is health: Hispanics may die less frequently when infected because fewer of them have underlying health conditions and their population is younger than that of other groups. However, their higher infections and deaths per million indicate that they live in an economic environment that requires more human-to-human interaction.

Asian minorities are also more likely to be infected, and the fatality rate is also positively associated with the Asian population share. The coefficients for the Asian population indicate that a 1% increase in the Asian population in a county is associated with 195 more infections and 10.49 more deaths per million population, resulting in a 0.03 higher fatality rate. Interestingly, the marginal effects on deaths and the fatality rate for the Asian population are higher than those for the Black and Hispanic populations. This is surprising because average income and wealth for Asians are higher than for other minority groups, and Asians are believed to have access to better health care services. On the other hand, it is important to note that the Asian population is quite diverse, and the “average” is driven by some outliers of very rich Asian groups. In contrast, some Asian groups, such as recently immigrated Myanmar or Burmese Asians, have the lowest income and wealth of all minorities. The coefficient for the Asian population is less statistically significant than that for the Black population, which means the effect is noisier, reflecting this diversity. However, it is hard to draw any decisive conclusion here without further evidence.

If we look back at the coefficient of poverty after controlling for ethnic minority populations in models 2.4–2.6 and compare it with models 2.1–2.3, we see that it drops by almost 50% for infections and deaths (from 403 to 207 infections and from 17.81 to 8.46 deaths per million) and by less for the fatality rate (from 0.07 to 0.05). This suggests that the minority population shares have absorbed part of the impact of the poverty rate on the COVID-19 catastrophe because they are correlated with poverty.
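As a rough check on the size of these drops, the relative changes can be worked out from the coefficients quoted in the text (403.75 and 207 for infections, 17.81 and 8.46 for deaths, 0.07 and 0.05 for the fatality rate). The fatality-rate coefficients are reported to only two decimals, so the last ratio is approximate:

$$ \frac{403.75 - 207}{403.75} \approx 0.487, \qquad \frac{17.81 - 8.46}{17.81} \approx 0.525, \qquad \frac{0.07 - 0.05}{0.07} \approx 0.286 $$

On these numbers, the poverty coefficient falls by about half for infections and deaths and by somewhat less than a third for the fatality rate.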

Other control variables also reveal interesting results. The population share that is sixty-five or older is associated with a lower infection rate but higher deaths per million and hence a higher fatality rate. This is expected because the retired and elderly have less engagement with outside employment, which is one of the main sources of infection. However, due to their weaker immunity, infection more often leads to death. Counties with a higher share of the population living in rural areas have lower infections and lower deaths per million, and therefore a lower fatality rate. On the other hand, the log of median income is positive and significant, which seems not to support our argument that poverty worsens infectious disease. Note, however, that a higher median income does not imply lower poverty in some counties. This suggests that something else, such as the industrialization of a county and its related urbanization, is likely driving median income higher than in poor non-industrialized counties. For example, the national average median household income was $51,090, and the national average poverty rate was 15.13% (Table 1). In Suffolk County (which includes the city of Boston), the median household income was $65,999, but the poverty rate was 17.5%. Hence, median household income is not necessarily negatively correlated with the poverty rate. Similarly, the share of the population with some college education is negatively associated with infections and deaths per million but not with the fatality rate. Among the health ranking variables, clinical care, which consists of access to care and quality of care, is more consistently significant than the other health rankings. The interpretation is that a worse rank in clinical care is associated with higher infections and deaths per million population.
Since a lower score represents a better rank in clinical care and the value ranges from -0.48 to 0.41, a one-unit increase in the score roughly corresponds to moving from the best to the worst in the ranking, and is associated with 6,000 to 15,000 additional infections and 200 to 400 additional deaths. On the other hand, a worse ranking in health behavior is associated with fewer deaths, although its associations with infections per million and the fatality rate are not statistically significant in the full models in columns (2.4) and (2.6). The coefficient of the health behavior ranking is difficult to interpret since it is likely confounded by other behavioral factors: healthier behavior may be correlated with more social engagement, which is not favorable in the case of infectious diseases, with older age groups who have weaker immune systems, and with lower motivation to seek medical care. All of these can lead to more deaths. However, confirming such a conclusion is beyond the scope of this paper and would require more relevant measures of these behavioral factors.

5 Robustness checks

5.1 Data disaggregation to check impact over time

Throughout the pandemic, we observed a large variation in coronavirus surges across regions. Early in the pandemic, the northeastern region of the USA experienced high impact levels, while later, southern regions experienced high impact levels (Haynes et al. 2020; Haynes and Kulkarni 2021). These differences in temporal disease diffusion raise questions about the role of poverty as a significant determinant of COVID-19 infections and deaths. The concern is whether our results hold for both earlier and later periods in the pandemic. To account for the variations in COVID-19 cases and deaths across regions and over time, we conducted a robustness check by disaggregating the analysis and dividing the cumulative COVID-19 data into four monthly intervals: April, May, June, and July. Note that the results for cumulative data until the end of July are presented in Table 2. Table 3 presents three separate analyses of cumulative data for April, May, and June for both infections and deaths. We find that our results hold both early and later in the pandemic. The impact of poverty on COVID-19 infections and deaths was always significant. Figure 5 plots the coefficient of the poverty rate for COVID-19 infections and deaths per million population. It reveals that an increase in the poverty rate by 1% was associated with an increase of 121.30 infections and 7.99 deaths per million population in April. This coefficient increased to 403.75 infections and 17.81 deaths per million population in July. The coefficients of poverty for May and June lie between the April and July values, meaning that the cumulative infections and deaths per million linked to a higher level of poverty keep rising over time. These results provide important evidence that poverty is a strong and continuing correlate of an infectious disease such as COVID-19.
Note that, in these disaggregated results, the unemployment rate shows a significant negative relationship with both infections and deaths, except for infections in July. This effect of the unemployment rate supports the argument posed by Adda (2016): unemployment is a marker of economic cycles, and high unemployment reduces human-to-human interaction. Alternatively, unemployment may also capture additional effects, such as different socialization patterns for those out of the labor force. If unemployment encouraged people to socialize more without any precaution or mitigation measures, it could increase infections and thereby deaths. We do not see evidence for this alternative interpretation, but the possibility of fewer infections and deaths when people are not in the workplace is real. This is in line with our original argument that hypermobility, or physical person-to-person interaction, is the key source of infections.
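
The month-by-month check can be sketched in a few lines of Python. This is only a minimal illustration, not the estimation code behind Table 3: it assumes a single regressor (the poverty rate), removes state means to mimic state fixed effects, and omits all control variables. The records, the field names (`state`, `poverty`, `cases_apr`, `cases_jul`), and the helper `within_slope` are hypothetical.

```python
from collections import defaultdict


def within_slope(rows, outcome):
    """Slope of `outcome` on poverty after removing state means.

    With one regressor, demeaning by state and running a bivariate
    regression is the state fixed-effects (within) estimator.
    """
    by_state = defaultdict(list)
    for r in rows:
        by_state[r["state"]].append(r)

    sxy = sxx = 0.0
    for group in by_state.values():
        mean_x = sum(r["poverty"] for r in group) / len(group)
        mean_y = sum(r[outcome] for r in group) / len(group)
        for r in group:
            sxy += (r["poverty"] - mean_x) * (r[outcome] - mean_y)
            sxx += (r["poverty"] - mean_x) ** 2
    return sxy / sxx


# Made-up county records: cumulative cases per million at each month end.
counties = [
    {"state": "A", "poverty": 10.0, "cases_apr": 1000.0, "cases_jul": 4000.0},
    {"state": "A", "poverty": 14.0, "cases_apr": 1400.0, "cases_jul": 5600.0},
    {"state": "B", "poverty": 12.0, "cases_apr": 3000.0, "cases_jul": 9000.0},
    {"state": "B", "poverty": 18.0, "cases_apr": 3600.0, "cases_jul": 11400.0},
]

# One coefficient per cumulative month-end outcome, as in the disaggregated check.
slopes = {m: within_slope(counties, f"cases_{m}") for m in ("apr", "jul")}
print(slopes)
```

In this toy data the July slope is larger than the April slope, which mirrors the pattern of a poverty coefficient that grows as cumulative counts accumulate.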

Table 3 Disaggregated results by monthly COVID-19 confirmed cases and deaths per million population
Fig. 5 Plotting the coefficients of poverty rate for infections and deaths

5.2 Applying multi-level mixed model estimation

Now we apply a more complicated model to the same hypothesis tests as part of our robustness checks. Our models incorporate two levels of factors: level 1, the county-level factors, and level 2, the state-level factors, since states practice a spectrum of policies and operate under unique political and economic systems. We take advantage of this structure to examine the impact of factors at various levels on the COVID-19 pandemic. This is a hierarchical modeling environment where the combining process occurs at level 1; however, both level 1 and level 2 influence this combining process (Charnes et al. 1975; Otani et al. 2019). Hierarchical modeling is useful here because what happens at the county level may not be completely independent: counties are clustered within states, and a multi-level model can account for the lower and higher levels as distinct levels simultaneously (see Footnote 6). We replicate the estimates presented in Table 2 by estimating Eq. (2) as a multi-level mixed effect (MLME) model. Here, \({Poverty}_{\mathrm{cs}}\) is the county-level poverty within a state and \({X}_{\mathrm{cs}}\) is the vector of county-level characteristics. In \({\mu }_{s}\), we gather state-level factors. \({U}_{\mathrm{cs}}\) is a random effect that accounts for specific variations in COVID-19 within a state, and \({U}_{s}\) operates at the state level to account for variations specific to each state. \({\epsilon }_{\mathrm{cs}}\) is the idiosyncratic residual that captures anything that is not in the model.

$$ Y_{\text{cs}} = \beta_{0} + \beta_{1} \text{Poverty}_{\text{cs}} + \beta_{2} X_{\text{cs}} + \beta_{3} \mu_{s} + U_{\text{cs}} + U_{s} + \epsilon_{\text{cs}} $$
(2)

We present the MLME results in Table 4. We see that the impact of poverty on both COVID-19 infections and deaths is larger than in the state fixed effect estimation (e.g., the coefficient of the poverty rate is 403.75 in model 2.1 versus 421.88 in model 4.1 for the same specification). So, the main results presented in Table 2 are more conservative than this alternative MLME. The MLME also provides additional information because it has two intercept components: an overall constant, which is identical to the constant from a non-MLME model, and a state-level random intercept. The variance due to heterogeneity across states is computed from the column 4.6 estimates as \({U}_{s}/({U}_{s}+{\epsilon }_{\mathrm{cs}})\), using the values 2.13 and −25.57, and is reported as 49.93; that is, about 50% of the variance is due to variation at the state level.
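
To make the idea of a state-level share of variance concrete, the sketch below computes an intraclass correlation, assuming the reported share follows the standard form of between-state variance divided by the sum of between-state and residual variance. It is not the MLME estimator used for Table 4, which is fitted with covariates; it is a simple one-way ANOVA (method of moments) analogue for a balanced random-intercept model without covariates. The function name `anova_icc` and the toy values are hypothetical.

```python
def anova_icc(groups):
    """One-way ANOVA estimate of the intraclass correlation (balanced groups)."""
    k = len(groups)      # number of states
    n = len(groups[0])   # counties per state (balanced design assumed)
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equally sized groups")

    grand = sum(sum(g) for g in groups) / (k * n)
    means = [sum(g) / n for g in groups]

    # Mean squares between states and within states.
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    msw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (k * (n - 1))

    var_state = max((msb - msw) / n, 0.0)  # between-state variance component
    var_resid = msw                        # within-state (residual) variance
    return var_state / (var_state + var_resid)


# Made-up outcomes for three states with two counties each.
toy = [[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]]
icc = anova_icc(toy)
print(round(icc, 3))
```

An intraclass correlation near 0.5 would correspond to the statement that about half of the variance sits at the state level; in the toy data the states differ much more than the counties within them, so the share is higher.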

Table 4 Multi-level mixed model estimates on COVID-19 confirmed case, death, and fatality rate until the end of July

5.3 Applying spatial autoregressive model estimation

Public health scholars have called for attention to COVID-19 infection patterns between spatially contiguous entities and to interconnections between counties. As a result, most state governments imposed lockdown policies to reduce the frequency of human interactions and curb the spread of COVID-19. Some local governments even implemented stricter restrictions to constrain human mobility (Goolsbee et al. 2020). That is, spatial dependence in COVID-19 infections across counties may affect our proposed models in Table 2. To assess the spatial dependence effect of adjacent counties, we apply the spatial lag model, the spatial error model, and the spatial autoregressive combined (SAC) model to examine whether the effect of the poverty rate on COVID-19 infections remains robust across all models (Guliyev 2020; Sun et al. 2020; Narayanan et al. 2020). The spatial lag model explores how the COVID-19 infections in a county are influenced by the COVID-19 infections in neighboring counties. The spatial lag parameter reflects the association between the average COVID-19 infections of neighboring counties and the COVID-19 infections of a given county. The spatial error model examines how the model residual of a given county is associated with the average model residual of its neighboring counties. The SAC model simultaneously estimates the spatial lag and spatial error components. Table 5 presents the OLS and the three spatial models without ethnicity, and Table 6 presents those models with ethnicity. The results in Table 5 reveal that the three spatial models fit better (i.e., in terms of AIC and BIC) than the OLS model. The OLS model also overestimates the coefficient of the poverty rate in comparison with the three spatial models. When we control for ethnicity in Table 6, the three spatial models still perform better than the OLS model.
However, the OLS model shows a more conservative estimate of the coefficient of the poverty rate than the spatial error model and the SAC model. In short, the spatial models may fit better than the OLS model. Nevertheless, the significant positive effect of the poverty rate on COVID-19 is confirmed in all cases, and the results are robust across all models.
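
The neighbor average that enters the spatial lag model can be illustrated as follows. This is a minimal sketch under stated assumptions: contiguity is given as hypothetical adjacency lists, weights are row-standardized so each neighbor counts equally, and only the lagged variable is computed. Estimating the spatial lag parameter itself requires maximum likelihood or instrumental-variable methods, which are not shown. The function name `spatial_lag` and the toy values are illustrative only.

```python
def spatial_lag(values, neighbors):
    """Return the neighbor average W*y for a row-standardized contiguity matrix W.

    `neighbors[i]` lists the indices of counties adjacent to county i.
    Counties without neighbors get a lag of 0.0 in this sketch.
    """
    lag = []
    for i in range(len(values)):
        nbrs = neighbors[i]
        # Row standardization: each neighbor receives weight 1 / len(nbrs).
        lag.append(sum(values[j] for j in nbrs) / len(nbrs) if nbrs else 0.0)
    return lag


# Four made-up counties arranged in a line: 0 - 1 - 2 - 3.
infections = [10.0, 20.0, 30.0, 40.0]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

lagged = spatial_lag(infections, neighbors)
print(lagged)  # [20.0, 20.0, 30.0, 30.0]
```

In a spatial lag specification this lagged variable appears on the right-hand side alongside the poverty rate and the controls; the same averaging applied to residuals instead of outcomes gives the term used in a spatial error specification.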

Table 5 OLS, spatial lag, spatial error, and SAC models estimate on COVID-19 confirmed cases until the end of July (without ethnicity)
Table 6 OLS, spatial lag, spatial error, and SAC models estimate on COVID-19 confirmed cases until the end of July (with ethnicity)

6 Conclusion

This paper studies the impact of the poverty rate on the spread of COVID-19 and related deaths across space and time using the best available data as of July 2020. Poverty that leads to more physical interaction, an incapacity to access health care treatment, and job-related public behavior is highly correlated with more infections and deaths in the U.S. The alternative channel, in which poverty limits economic activity and thereby helps contain the disease, is not the dominant one. Although we did not directly measure these poverty-related factors (hypermobility versus limited economic activity), we consider them the two main possibilities that can be driven by poverty in the context of infectious disease. This is an important contribution to the literature because it provides systematic evidence from the developed world that is analogous to the evidence found in the developing world. Therefore, local governments and the Federal Government must be prepared to minimize the disproportionate costs to poor people and minorities. Solutions lie not only in creating infrastructure that provides easy access to health care services for all but also in better coordination among housing, social welfare programs, food access, and other social protections, so that public health responses are immediately effective and there are enough incentives for low-income people to maintain physical distancing and adjust job-related public behavior.

We must acknowledge some limitations that should be kept in mind when interpreting these results. To detect a causal relationship between poverty and COVID-19, it would be better to have panel data, with time fixed effects to absorb unusual time trends (e.g., the spike in the unemployment rate due to business closures) and county fixed effects to control for both observable and unobservable time-invariant factors that can cause differences in the impacts of COVID-19. Our multi-level data allowed us to control for state fixed effects, and, to minimize the suspicion of a “wave effect” related to the timing of the pandemic, we also conducted robustness analyses by disaggregating the COVID-19 data by month and by deploying multi-level and spatial autoregressive model estimation; however, this is not equivalent to a panel data analysis. We also did not have county-level COVID-19 hospitalization data, which is an alternative measure of infectious disease. It would also be better if we had infection and death data by age, gender, and ethnic group. Future research should integrate hospitalization and more finely disaggregated COVID-19 data, which would improve causal control and could reinforce our interpretations. Future analysis should also focus on more micro-level information that reveals more about the incapacity to access adequate health care and about job-related exposure behavior, which together are correlated with poverty. This could be done by linking employment information, economic sectors by county (such as food processing, meat slaughter, and packaging plants), and the location of homes for the aged, or by separating age effects from poverty, since they are likely highly correlated. Our analysis also suggests that public mobility data between counties, applied to high-poverty regions, might be of value.