Main

COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), presents a continuing threat to both global health and the global economy. By early March 2021, more than 119 million cases worldwide had been reported, with more than 2.6 million deaths1. Despite the implementation of unprecedented public health interventions, including social distancing, contact tracing and large-scale lockdowns of the population2, the burden of the disease has continued to rise but with substantial variation among countries and regions and with countries in many regions around the world experiencing multiple waves1. As of 14 March 2021, the WHO African Region had experienced two waves of infection and had reported a total of over 2.9 million cases of infection and more than 74,000 deaths1,3. A third wave is currently in progress.

Gaining an understanding of variation in the progression of the pandemic in different countries will aid the response to future pandemics. Current evidence from high- and middle-income countries suggests that demographics (for example, percentage of the population aged 65 years or older), comorbidities, healthcare resources and stringency of response are important risk factors for COVID-19-related infections2,4,5,6. It was suggested that Africa would be more susceptible to SARS-CoV-2-related cases and deaths given the higher prevalence of pre-existing conditions, including tuberculosis, malaria, AIDS, diabetes, undernourishment and other communicable and non-communicable comorbidities, as well as lower accessibility to healthcare7,8. Recent work suggests that spatial connectivity might also have an important influence on the course of the pandemic in Africa9. Using the data for COVID-19 cases and deaths from the WHO COVID-19 Dashboard, this study aimed to identify predictors of the timing of the first case and the per capita mortality rate in the first and second COVID-19 pandemic waves in the WHO African Region and to test for any effect of intervention measures on COVID-19-related deaths. We included, as predictors, existing indices of epidemic preparedness—COVID-19 readiness status and the more generic infectious diseases resilience index (Supplementary Table 1)—to test the expectation that countries rated as better prepared would suffer less severe outcomes. The main findings and limitations of the study are summarized in Table 1.

Table 1 Policy summary

Results

COVID-19 epidemics in countries of the WHO African Region

On 25 February 2020, Algeria was the first country in the WHO African Region to report COVID-19 cases (Fig. 1a). Thirty-one countries reported their first cases in the 2 weeks from 12 March to 26 March 2020. Lesotho was the last of the 47 countries to report its first case, on 14 May 2020. There was no apparent relationship between the timing of the first COVID-19 case and the first death (Fig. 1a).

Fig. 1: COVID-19 pandemic in the WHO African Region.

a, Timeline of the first case and first death. b, Pandemic curve for daily new deaths. Map of per capita mortality rates in the first wave (c) and in the second wave (d). Tanzania, Burundi, Eritrea and Seychelles were excluded (Methods) and are shown in gray in c and d.

The 47 Member States reported a total of 29,635 COVID-19 deaths in the first wave and 44,850 deaths in the second wave. However, Tanzania discontinued reporting of COVID-19-related deaths from 8 May 2020, and Burundi, Eritrea and Seychelles were outliers (first wave mortality rates of 0.009, 0 and 0 per 100,000 population, respectively). São Tomé and Príncipe, as well as Seychelles, had missing data on the prevalence of HIV. These five countries were, therefore, excluded from the mortality rate analyses, giving a sample size of 42. Daily new deaths in the whole WHO African Region peaked on 5 August 2020 in the first wave and on 18 January 2021 in the second wave (Fig. 1b), lagging 16 and 7 d behind the peak of daily new cases in the first and second waves, respectively. The WHO African Region as a whole experienced a higher peak in the second wave (675 deaths on 18 January 2021) than in the first wave (323 deaths on 5 August 2020). In the first wave, the highest mortality per 100,000 population was reported from South Africa (33.3), followed by Cape Verde (17.5) and Eswatini (8.6) (Fig. 1c). In the second wave, the highest mortality per 100,000 population was also reported from South Africa (55.4), followed by Eswatini (39.8) and Botswana (17.7) (Fig. 1d). Twenty countries had mortality rates in the second wave that were higher than or similar to those in the first wave, whereas 23 countries had lower mortality rates in the second wave than in the first wave (Fig. 2).

Fig. 2: Scatter plot of per capita mortality in the first and second waves.

Axes are on a log10 scale, with points falling on the axes denoting zero deaths. The dashed line indicates identical mortality rates in the two waves. Tanzania, Burundi, Eritrea and Seychelles are not shown because of incomplete data or outlier status. Note that São Tomé and Príncipe was not included in the mortality rate analyses because of missing predictor data. DRC, Democratic Republic of the Congo.

Predictors of the timing of the first case

We included 47 countries and 15 predictors (Supplementary Fig. 1a–k,p–s) in the Cox regression model for timing of the first case. Spearman’s correlation identified five pairs of predictors with correlation coefficients greater than 0.6 (Extended Data Fig. 1). The univariable Cox regression model identified total population size, number of international airports, volume of international air travel, COVID-19 test capacity and COVID-19 readiness status as risk factors for earlier detection of the first case and current health expenditure (percent of GDP) as a protective factor (Fig. 3 and Supplementary Table 2). In the multivariable model, the percentage of urban population (hazard ratio (HR) = 1.40, 95% confidence interval (CI) 1.01–1.95), number of international airports (HR = 1.48, 95% CI 1.02–2.14), volume of international air travel (HR = 1.52, 95% CI 1.10–2.11), COVID-19 test capacity (HR = 3.86, 95% CI 1.83–8.15) and number of borders (HR = 2.87, 95% CI 1.12–7.32) were identified as risk factors for earlier detection of the first case (Fig. 3 and Supplementary Table 2).

Fig. 3: HRs and 95% CIs for predictors of timing of the first case in univariable and multivariable Cox regression model.

n = 47 countries. Error bars are shown. Statistically significant risk factors are in red; protective factors are in blue. Exact two-sided P values for the Wald test are shown for each predictor, and two-sided P values < 0.05 were considered statistically significant.

Predictors of per capita mortality during the first wave

We included 42 countries and 18 predictors (Supplementary Fig. 1b–s) in the generalized linear mixed models (GLMMs) for per capita mortality in the first wave. In the univariable analyses, the percentage of urban population, GDP per capita, human development index, volume of international air travel, infectious disease resilience index, prevalence of HIV and latitude were risk factors (Fig. 4 and Supplementary Table 3). The correlation between the time to first case and per capita mortality was not significant (P = 0.22). In the multivariable GLMM, the percentage of urban population (risk ratio (RR) = 1.61, 95% CI 1.25–2.06), volume of international air travel (RR = 1.31, 95% CI 1.04–1.66) and prevalence of HIV (RR = 1.40, 95% CI 1.10–1.78) were risk factors for mortality rate in the first wave (Fig. 4 and Supplementary Table 3). Percentage of urban population was included in all models within 2 corrected Akaike information criterion (AICc) units of the best model (Methods); volume of international air travel and HIV prevalence were included in most but not all.

Fig. 4: RRs and 95% CIs of predictors of per capita mortality in the first wave in univariable and multivariable Poisson GLMM.

n = 42 countries. Error bars are shown. Statistically significant risk factors are in red. Exact two-sided P values for the Wald test are shown for each predictor, and two-sided P values < 0.05 were considered statistically significant.

None of the predictors in the best multivariable model was correlated with any of the COVID-19 testing variables (correlation coefficients < 0.6) (Extended Data Fig. 2). We then re-ran the best multivariable GLMM, adding each testing variable in turn (Supplementary Fig. 1u,w–x). No testing variable was associated with the per capita mortality rate or reduced the AICc, and there were no changes in the RRs estimated by the best multivariable model (Extended Data Fig. 3 and Supplementary Table 3).

There was good consistency between the stringency index and the percent change in residential mobility indicated by the Google mobility data. After controlling for temporal and random effects, the stringency index was non-linearly associated with residential mobility (P < 0.0001), with 8.66 effective degrees of freedom. The R² of the model was 0.77, and the explained deviance was 77.5%.

None of the predictors in the best multivariable model was correlated with the two stringency scores (correlation coefficients < 0.6) (Extended Data Fig. 4). Again, we then re-ran the best multivariable GLMMs, once with each stringency score (Supplementary Fig. 1y,z). No stringency score was associated with the per capita mortality rate, and none reduced the AICc (Extended Data Fig. 5 and Supplementary Table 3). We explored other thresholds of cumulative per capita mortality, and all produced consistent results.

There were 11, 10, 10 and 11 countries in the categories of high (area under the curve (AUC) of stringency index)/high (per capita mortality), high/low, low/high and low/low, respectively (Fig. 5a). In the univariable multinomial logistic model, the percentage of urban population, infectious disease resilience index and human development index were risk factors for one or more categories relative to low/low (Extended Data Fig. 6 and Supplementary Table 4). In the multivariable multinomial logistic model, the percentage of urban population and infectious disease resilience index were risk factors for high/high, low/high and/or high/low relative to low/low (Fig. 5b). As above, we also added the three COVID-19 testing predictors into the best multivariable multinomial logistic model, and the results remained consistent (Supplementary Table 4).

Fig. 5: Associations with stringency index.

a, Scatter plot for AUC of stringency index and per capita mortality rate in the first wave. Vertical axis has log10 scale. Dashed lines indicate median values, separating countries into four categories: high/high, high/low, low/high and low/low. b, Odds ratios (ORs) and 95% CIs in multivariable multinomial logistic regression model. n = 42 countries. Error bars are shown. Statistically significant risk factors are in red. Exact two-sided P values for the Wald test are shown for each predictor, and two-sided P values < 0.05 were considered statistically significant.
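The four categories in Fig. 5a come from splitting countries at the median of each axis. The following is a minimal Python sketch of that split, not the authors' code. The function name, the dictionary inputs and the rule that a value equal to the median counts as 'low' are all assumptions made for illustration.

```python
import statistics


def classify_by_median(stringency_auc, mortality):
    """Label each country as '<AUC level>/<mortality level>' using median splits.

    Both arguments are dicts keyed by country name (hypothetical inputs).
    Assumption: a value equal to the median is labelled 'low'.
    """
    auc_median = statistics.median(stringency_auc.values())
    mortality_median = statistics.median(mortality.values())
    labels = {}
    for country in stringency_auc:
        auc_level = "high" if stringency_auc[country] > auc_median else "low"
        mort_level = "high" if mortality[country] > mortality_median else "low"
        labels[country] = f"{auc_level}/{mort_level}"
    return labels
```

The resulting labels would then serve as the categorical outcome of the multinomial logistic model, with 'low/low' as the reference level.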

Predictors of per capita mortality during the second wave

We included 42 countries and 19 predictors (Supplementary Fig. 1b–g,j–o,r,s,v–y and Fig. 1c) in the univariable GLMM for per capita mortality in the second wave. Consistent with the results for the univariable analysis of the first wave, human development index, infectious disease resilience index, prevalence of HIV and latitude were risk factors for per capita mortality in the second wave (Extended Data Fig. 7 and Supplementary Table 5). Per capita mortality rate in the first wave was also a risk factor. Disability-adjusted life years (DALYs) per 100,000 individuals from communicable, neonatal, maternal and nutritional diseases was identified as a protective factor.

Discussion

In this study, we identified statistical predictors of the timing of the first case and the per capita mortality rates during the first and second COVID-19 pandemic waves for countries in the WHO African Region. The percentage of urban population, number of international airports, volume of pre-pandemic international air travel, COVID-19 test capacity and number of borders were predictors of the earlier detection of the first case. The percentage of urban population, volume of pre-pandemic international air travel and prevalence of HIV were risk factors for per capita mortality rate in the first pandemic wave. Stringency and timing of government restrictions were not associated with the mortality rate, but countries with higher proportions of urban population and higher infectious disease resilience scores were at increased risk of an adverse outcome, defined as either high AUC of stringency index and/or high per capita mortality. Predictors of per capita mortality rates in the two waves were broadly consistent, and per capita mortality rate in the first wave was predictive of per capita mortality rate in the second wave.

The association between laboratory capacity to test for COVID-19 cases (evaluated before the detection of COVID-19 in the WHO African Region) and earlier detection of first COVID-19 cases was expected. This result highlights the importance and urgency of ensuring adequate preparedness, especially in the earliest stages of a pandemic, noting that COVID-19 was first detected in Africa over 7 weeks after it was first detected in China10.

We found that countries with more international airports and a greater volume of pre-pandemic international air travel detected their first COVID-19 cases earlier, and island nations detected their first COVID-19 cases later. Flight connectivity to China was found to be a risk factor for earlier detection of COVID-19, irrespective of countries’ preparedness status as measured by Global Health Security and Joint External Evaluation scores11, but genome sequencing data suggest that early cases in Africa were mainly imported from Europe and not China12,13.

Pre-pandemic volume of international air travel also predicts per capita mortality during the first wave. We interpret this as indicating that wider seeding of an epidemic before travel restrictions were imposed (as they were in all countries in our study) resulted in a larger epidemic.

A more urban population predicts both earlier detection of COVID-19 and a higher first wave mortality rate. Urban environments are recognized as risk factors for the transmission of respiratory pathogens in general14. Other studies found an association between a more urban population and the number of COVID-19 cases15, and that countries with higher socio-economic development, such as Belgium, United Kingdom and Italy, have higher COVID-19 mortality rates16,17. Countries with a more urban population and greater socio-economic development might have lower COVID-19 case fatality rates (CFRs)15,18. However, our study focused on per capita mortality, as CFR is heavily influenced by COVID-19 testing capability, which is highly heterogeneous across countries3,9,19.

We also found that a higher prevalence of HIV was associated with a higher mortality rate in the first pandemic wave. HIV has been associated with severe COVID-19 during the pandemic; a large population-based study in South Africa found that HIV doubled (HR = 2.14) the risk of COVID-19 mortality20. A meta-analysis of 22 studies worldwide also found that HIV-positive status was associated with an increased risk of COVID-19 mortality21. The underlying reasons might include a high prevalence of comorbidities in patients with HIV and severe COVID-19 and persistent immune suppression in severe COVID-19 (ref. 20). In our study, statistical models replacing HIV with other common comorbidities—tuberculosis (which is strongly correlated with HIV), chronic obstructive pulmonary disease, hypertensive heart disease and obesity—fitted the data less well, although it is possible that HIV status acts as a marker for a basket of these and other comorbidities. Alternatively, any link could be wholly or partially indirect if HIV prevalence is correlated with behavioral, lifestyle or socioeconomic variables not included in our analysis.

We found that stringency and timing of government restrictions were not associated with the mortality rate in the first pandemic wave. Some studies found that measures including internal ‘lockdown’ and rapid border closures were not associated with COVID-19 mortality17,22, whereas others found that rapid implementation of restrictions reduced COVID-19 mortality23. There is a complex cause-and-effect relationship between restrictions and mortality rate, and our results should not be interpreted as demonstrating that restrictions are ineffective, only that any effect is difficult to detect by a retrospective statistical analysis3,24. This is expected if countries that imposed more stringent restrictions more quickly did so in response to the observed or anticipated severity of their epidemic, and if differences in stringency, at best, only partially mitigated the outcome.

As the response to the pandemic is likely to be damaging in its own right (for example, through negative effects on human well-being, the economy, education and work), an alternative approach is to consider stringency as an outcome variable. The preferred outcome is a low per capita mortality rate and fewer restrictions as measured by the stringency index. Taking this approach, we found that countries were more likely to achieve a good outcome if they had a less urban population and low infectious disease resilience. Infectious disease resilience is a composite index that considers multiple factors ranging across multiple domains, including political, economic, public health, medical, demographic and disease dynamics (Supplementary Table 1). It is positively correlated with GDP per capita, the human development index, volume of international travel and prevalence of HIV, and negatively correlated with DALY rates from both communicable diseases and non-communicable diseases (Extended Data Fig. 1). This result contradicts speculation that poor countries with a low resilience would be most affected by COVID-19 (see also ref. 11). In Africa, more urbanized countries and those considered more resilient to infectious diseases suffered more from both the direct and indirect effects of the pandemic.

Similar results for the first and second waves suggest that there were no major shifts in the epidemiology of COVID-19 over the study period, implying no systematic differences in vulnerabilities to the two waves. There was no relationship between stringency of measures taken during the first wave and the severity of the second wave. This indicates that, regardless of the stringency and effectiveness of the government response, intrinsic differences among countries have a substantial effect on the course of national epidemics.

This study has some limitations. It is an observational study of country-level data and cannot demonstrate a direct, causal link between predictors and outcome. Effects due to unmeasured confounders might influence the results and interpretation. Statistical power is limited by sample size, so the final multivariable models include only those predictors with the strongest effects; others might have effect sizes too small to be retained in the models. Given the enormous number of combinations of predictors that could be considered, it is possible that the best fitting models were not identified. Data quality has also been raised as an issue9. Some, possibly substantial, under-ascertainment of COVID-19 deaths is likely in Africa, as elsewhere25, and could affect our findings if the degree of under-ascertainment was correlated with predictors included in our analysis. We directly addressed this issue by including in our analyses independent estimates of under-reporting of COVID-19 deaths generated by the Institute for Health Metrics and Evaluation25. These estimates range up to approximately 75% of COVID-19 deaths unreported (in Burkina Faso, Nigeria and the Democratic Republic of the Congo). The WHO definition of a COVID-19 death does not require a positive test result, but it is possible that ascertainment is influenced by testing capacity. However, our main results are robust to inclusion of indicators of testing effort in our statistical models, although we note that test volume data were not collected over exactly the same time period.

The stringency variable is a composite index of government policies, reflecting that many countries implemented measures as a package. Not all policies are expected to have equal effect, and a wide range of combinations of measures was implemented across the region. We validated the stringency index by comparison with Google mobility data. We found a strong association, indicating that the index is related to real-world behavior by at least a subset of the population. However, the association weakened over time, as has been reported elsewhere26.

Our study had several strengths. We considered countries from a single WHO region; these should be more comparable in terms both of data on predictors and of COVID-19 epidemiology. We restricted our analysis to the outcome variables judged to be most reliably estimated (date of first case and mortality) while correcting for under-reporting/under-ascertainment. The evident plausibility of the results of our date of first case analysis improves confidence that the predictor and outcome data are fit for purpose.

In conclusion, we identified risk factors associated with poor direct and indirect outcomes of the first two waves of the COVID-19 pandemic in the WHO African Region countries. Our key finding is that countries that were assumed to be better prepared and better equipped to respond to the pandemic were also the most vulnerable to it. These data should be taken into account in future pandemic preparedness planning for WHO African Region countries.

Methods

Ethics statement

Ethics approval was not required for this study as the data used in the study were at the country level, and the study is observational.

Study design and study area

We performed a region-wide, country-based observational study (Extended Data Fig. 8) that included all 47 Member States of the WHO African Region. The WHO African Region has a total population of 1,019,922,000, with the median age varying from 15.0 years in Niger to 34.6 years in Mauritius27. About 50% of the population in the WHO African Region lack access to essential medicines28. Globally, 22 of the 25 countries regarded as most vulnerable to infectious diseases are in sub-Saharan Africa29.

We extracted data for daily cases and deaths for each country in the region and calculated the following three outcomes: timing of the first case and per capita mortality rates in the first and second waves. Predictors relating to demographics, socioeconomics, travel, healthcare, comorbidities, readiness and geography were extracted from public data sources. The ratio of total COVID-19 mortality to reported COVID-19 mortality was obtained from the Institute for Health Metrics and Evaluation25. The COVID-19 test data quality and the government response data were collected by the Tackling Infections to Benefit Africa (TIBA) Pandemic Response Unit. COVID-19 testing policy data were taken from the Oxford COVID-19 Government Response Tracker (OxCGRT). Total numbers of tests per capita were collected by the Africa Centres for Disease Control and Prevention (CDC)3. Statistical models were fitted to evaluate the relationships among the three outcomes and predictors. We also ran a secondary analysis for the outcomes per capita mortality in the first wave and stringency index.

The start date of the analysis was set as 25 February 2020 when the first case was reported from the WHO African Region (in Algeria). We collated values of predictor variables as close to this date as possible.

Outcomes

Our first outcome—the timing of the first case—refers to the day on which the first official laboratory-confirmed COVID-19 case/cases was/were reported to the WHO (Fig. 1a), largely based on case definitions defined by the WHO30.

Our other outcomes are the total deaths per 100,000 population (per capita mortality rate) during the first and second waves, adjusted for under-reporting where appropriate (see below). According to international guidelines for certification and coding of COVID-19 as cause of death31, a death due to COVID-19 is defined for surveillance purposes as a death resulting from a clinically compatible illness, in a probable or confirmed COVID-19 case, unless there is a clear alternative cause of death that cannot be related to COVID disease (for example, trauma). There should be no period of complete recovery from COVID-19 between illness and death. A death due to COVID-19 should not be attributed to another disease (for example, cancer) and should be counted independently of pre-existing conditions that are suspected of triggering a severe course of COVID-19.

The pandemic curve for daily new deaths for the whole WHO African Region was plotted using 21-d kernel smoothing with the Nadaraya–Watson estimator (Fig. 1b). Kernel smoothing is a common non-parametric method for revealing trends in curves. The Nadaraya–Watson estimator can be seen as a weighted average that uses a kernel as the weighting function, so that a higher weight is assigned to daily new deaths closer to the target date32. We chose the date with the first minimum daily new deaths (31 October 2020) as the end of the first wave and the date of the second minimum daily new deaths (14 March 2021) as the end of the second wave, and we calculated per capita mortality rate in each wave for each country.
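The weighted-average idea behind the Nadaraya–Watson estimator can be sketched in a few lines of Python. This is an illustration only and not the code used in the study. The Gaussian kernel, the meaning of the `bandwidth` argument and the helper `first_local_minimum` are assumptions, because the text does not state the kernel or how the 21-d window maps to a bandwidth.

```python
import math


def nadaraya_watson(days, values, bandwidth):
    """Smooth `values` observed on `days` with a kernel-weighted average.

    Assumption: a Gaussian kernel with standard deviation `bandwidth` (in days).
    """
    smoothed = []
    for target in days:
        # Observations closer to the target date receive a higher weight.
        weights = [math.exp(-0.5 * ((d - target) / bandwidth) ** 2) for d in days]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smoothed


def first_local_minimum(series):
    """Index of the first interior point lower than both neighbours, or None."""
    for i in range(1, len(series) - 1):
        if series[i] < series[i - 1] and series[i] < series[i + 1]:
            return i
    return None


# Synthetic two-wave series, for illustration only (not study data).
demo_days = list(range(12))
demo_deaths = [0, 2, 5, 9, 6, 3, 2, 4, 8, 12, 7, 3]
demo_smoothed = nadaraya_watson(demo_days, demo_deaths, bandwidth=1.0)
demo_trough = first_local_minimum(demo_smoothed)
```

In this sketch, the trough between two peaks of the smoothed curve plays the role of the end of the first wave.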

Data on COVID-19 cases and deaths for all 47 Member States in the WHO African Region were taken from the WHO COVID-19 Dashboard33. The data include daily new cases, cumulative cases, daily new deaths and cumulative deaths.

Predictors

A set of predictors considered likely to affect the timing of the first case and the per capita mortality rate was collected and included as explanatory variables. The definition, reasons for including the predictor, time range, details of missing data and data sources are reported in Supplementary Table 1. Predictors were classified in nine categories: demographics, socioeconomics, travel, healthcare, comorbidities, readiness, geography, COVID-19 testing and interventions. Demographic and socioeconomic variables might predict both vulnerability to severe disease (for example, by age) and transmission potential (for example, urban versus rural populations)19,34. Healthcare, readiness and COVID-19 testing variables might predict the capability to detect and/or treat cases17,35. Travel and the number of shared borders are likely to affect the number of cases imported from neighboring countries36. Comorbidities are related to vulnerability to dying from infection16. Latitude is related to climate, which might affect transmission rates37.

Data on COVID-19 testing were obtained from four sources. Testing effort was extracted from a recent report of the COVID-19 pandemic in Africa up to the end of December 2020 (ref. 3). The predictor variable was the total number of tests per 100,000 population. Testing policy index data were collected by OxCGRT, which records government policy on access to testing. The ordinal scores are shown in Supplementary Table 1, and we calculated days with testing policy index ≥2 during the first wave (25 February to 31 October 2020). Testing policy index at the start of the second wave on 1 November 2020 was used as a baseline predictor for per capita mortality in the second wave. A test data quality index up to 31 October was generated by the TIBA Pandemic Response Unit and was placed into four categories (no data, basic, satisfactory and good; Supplementary Tables 1 and 6). Details of data collection are given in the TIBA COVID-19 testing report38. Estimated ratios of total COVID-19 mortality to reported COVID-19 mortality were obtained from the Institute for Health Metrics and Evaluation (Supplementary Table 1)25.

Government response data were collected by the TIBA Pandemic Response Unit. Details of data collection are given in the TIBA COVID-19 mitigation policies report39. All mitigation responses fall into five categories and 14 subcategories (Supplementary Table 7). Normalized strictness scores were devised for each of the 14 subcategories. Based on these normalized strictness scores, the stringency index representing policies on containment and closure was calculated using a method developed by OxCGRT40: the normalized strictness values of 12 subcategories of measures were averaged, excluding all governance and socio-economic measures, and excluding surveillance and testing from the public health measures.
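The averaging step can be illustrated with a short Python sketch. It is not the TIBA or OxCGRT implementation. The subcategory names, the 0–100 scale of the example scores and the function name are hypothetical, and only four subcategories are shown rather than the 14 described above.

```python
def stringency_index(scores, excluded):
    """Mean of the normalized strictness scores over non-excluded subcategories.

    `scores` maps a subcategory name to its normalized strictness score;
    `excluded` is the set of subcategory names left out of the index.
    """
    included = [value for name, value in scores.items() if name not in excluded]
    if not included:
        raise ValueError("no subcategories left to average")
    return sum(included) / len(included)


# Hypothetical one-day scores for a single country (illustration only).
demo_scores = {
    "school_closure": 100.0,
    "public_gatherings": 50.0,
    "internal_movement": 0.0,
    "surveillance_and_testing": 75.0,
}
demo_index = stringency_index(demo_scores, excluded={"surveillance_and_testing"})
```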

Two variables related to the stringency index were generated: AUC of stringency index scores from 25 February to 31 October 2020 and stringency index score when cumulative mortality reached 0.1 per 100,000 population during the first wave. Alternative thresholds ranging from 0.001 to 0.2 were also explored for validation.
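Both derived variables can be computed from a daily series, as in the Python sketch below. It is an illustration under stated assumptions and not the study code. The trapezoidal rule for the AUC is assumed (the integration method is not stated), and the function and argument names are hypothetical.

```python
def auc_trapezoid(days, index_values):
    """Area under the stringency curve (assumption: trapezoidal rule)."""
    area = 0.0
    for i in range(1, len(days)):
        width = days[i] - days[i - 1]
        area += width * (index_values[i] + index_values[i - 1]) / 2.0
    return area


def index_at_threshold(index_values, cumulative_deaths, population, threshold=0.1):
    """Stringency index on the first day that cumulative deaths per
    100,000 population reach `threshold`; None if it is never reached."""
    for value, deaths in zip(index_values, cumulative_deaths):
        if deaths * 100_000 / population >= threshold:
            return value
    return None
```

Re-running `index_at_threshold` with other `threshold` values corresponds to the validation over alternative thresholds.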

Google mobility data (https://www.google.com/covid19/mobility/) available for 25 WHO African Region Member States were used to validate the stringency index. Details of Google mobility data are included in the TIBA COVID-19 mitigation policies report39. The residential percent change of mobility was used to validate the stringency index for the following reasons: (1) the residential category has a high correlation coefficient with the other five categories of mobility; (2) the location accuracy and the understanding vary less across regions than for other categories, so the comparison among countries will cause less bias; and (3) the intention of many mitigation response measures is to encourage people to stay in their residence. As of 15 November 2020, 24 of 47 WHO African countries had mobility data for the residential category. Time series plots of stringency index against residential mobility are shown in Extended Data Fig. 9. We used a generalized additive mixed model to estimate the relationships between the stringency index and residential mobility over time. We fitted the residential mobility as a spline function of stringency index s(Stringency index) and a spline function of day of the year s(doy), which was used to control for the temporal trend. The temporal relationship between residential mobility and stringency index can differ among countries, so we also introduced a random-effect term for country, s(country, bs = ‘re’), as random intercepts and a term for country and day of the year, s(country, doy, bs = ‘re’), as random slopes. The model was expressed as follows.

$$\begin{array}{rcl}g(Y_{ij}) & =& s(Stringency\,index) + s(doy) + s(country,bs = {\,}^{\prime} re^{\prime} )\\ && + s(country,doy,bs = {\,}^{\prime}re^{\prime} ) + \varepsilon _{ij}\end{array}$$

where Yij denotes the residential mobility for the ith day in the jth country, and εij is the random noise. s() indicates a penalized spline function. bs = ‘re’ indicates that the basis function is a random effect structure (basis coefficients are penalized by a ridge penalty to control the degree of smoothness). We used the default parameter settings of the R package mgcv for the penalized spline functions.

Statistical methods

All 47 Member States were included in the model for the timing of the first case, but the number of Member States included in the models for per capita mortality in the two waves depended on the completeness of the data. The epidemic curves for both daily cases and deaths in each country within the WHO African Region were plotted to evaluate the completeness of the data. The government of the United Republic of Tanzania stopped reporting COVID-19 cases/deaths from 8 May 2020, and the country was therefore excluded.

For predictors, the most recent available data were used, and none dated earlier than 2010. If a predictor had missing values, a binary indicator column was added to flag the countries with missing data, and both the raw data and the indicator were included in the model. All predictors used had data available for at least 90% of countries.
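The missing-indicator approach can be sketched in Python as follows. This is a minimal illustration and not the authors' code. The text does not say what value stands in for a missing entry in the raw column, so filling with the observed mean is an assumption, as is the function name.

```python
def add_missing_indicator(values):
    """Return (filled_values, indicator) for one predictor.

    `values` uses None for a missing entry. The indicator is 1 where the
    value was missing and 0 otherwise.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("predictor has no observed values")
    fill = sum(observed) / len(observed)  # assumption: fill with the observed mean
    filled = [fill if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator
```

Both returned columns would then enter the regression model together.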

Spearman’s rank correlation was used to test for correlations among predictors. Predictors with a correlation coefficient greater than 0.6 were not included in the same multivariable model.
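As a hedged illustration of this screening step, the sketch below computes Spearman's rank correlation (Pearson correlation of average ranks) for every pair of predictors and lists the pairs that exceed the 0.6 threshold. Comparing the absolute value of the coefficient with the threshold is an assumption, and the predictor names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def conflicting_pairs(predictors, threshold=0.6):
    """Pairs of predictors that should not share a multivariable model."""
    return [
        (a, b)
        for a, b in combinations(predictors, 2)
        if abs(spearman(predictors[a], predictors[b])) > threshold
    ]


# Hypothetical predictor values for five countries.
toy_predictors = {
    "gdp": [1, 2, 3, 4, 5],
    "urban": [2, 1, 4, 3, 5],
    "rain": [4, 2, 1, 5, 3],
}
flagged = conflicting_pairs(toy_predictors)
```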

Cox proportional hazards regression models were used to determine HRs and 95% CIs for individual predictors of timing of the first case. Univariable models were fitted first. Only predictors quantified on or before the start date were included in this analysis. Comorbidity data were excluded, as there is no a priori expectation that these would be predictors. COVID-19 test capacity, COVID-19 readiness status and the number of borders entered the model as binary variables where ‘no’, ‘limited and moderate’ and ‘no border’ were set as the reference levels, respectively. For COVID-19 readiness status, we combined ‘limited’ and ‘moderate’ into a single level, ‘limited and moderate’, because only two countries were at the ‘limited’ level (Supplementary Fig. 1q). Three countries (Cape Verde, Mauritius and Seychelles) with unknown COVID-19 readiness status were also included in the ‘limited and moderate’ level. Other variables entered the model as continuous variables, and all continuous variables were standardized before entering the model by subtracting the mean and dividing by the standard deviation. Variables with P values less than 0.2 were considered for inclusion in a multivariable model. If multiple variables with P values less than 0.2 were highly correlated (correlation coefficient greater than 0.6), only one of them was entered into the multivariable model at a time. The multivariable model with the lowest AICc was taken as the best model41, but models with AICc scores within 2 of the lowest were also retained.
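The standardization and AICc-based selection steps can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the sample standard deviation is used, the sample size passed to the small-sample correction is left to the caller (for Cox models this choice is not specified in the text), and the model names and scores are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Subtract the mean and divide by the (sample) standard deviation."""
    centre, spread = mean(values), stdev(values)
    return [(v - centre) / spread for v in values]


def aicc(log_likelihood, n_params, n_obs):
    """AIC with the small-sample correction term 2k(k + 1) / (n - k - 1)."""
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def retained_models(scores, window=2.0):
    """Names of models whose AICc lies within `window` of the lowest, best first."""
    best = min(scores.values())
    kept = [name for name, score in scores.items() if score - best <= window]
    return sorted(kept, key=scores.get)


# Hypothetical AICc scores for three candidate multivariable models.
candidate_scores = {"model_a": 100.0, "model_b": 101.5, "model_c": 103.0}
kept = retained_models(candidate_scores)
```

Here `model_a` would be taken as the best model and `model_b` retained alongside it, while `model_c` falls outside the 2-unit window.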

We used a GLMM with a Poisson error distribution to identify predictors of per capita mortality rate in the first wave. The outcome was the number of reported deaths multiplied by the ratio of total COVID-19 mortality to reported COVID-19 mortality (Supplementary Fig. 1t), with population size as an offset and country as a random effect. The RRs and 95% CIs were calculated. Five countries were excluded: the United Republic of Tanzania, which had incomplete data; Burundi, Eritrea and Seychelles, which were clear outliers; and Seychelles and São Tomé and Príncipe, which had missing data for HIV prevalence. The same countries were excluded from the multinomial logistic model described below, in which the outcome combines per capita mortality in the first wave and stringency. Days with testing policy index ≥2 entered the model as a binary variable (using the median as the cutoff), where ‘below median’ was set as the reference level. Three countries (Guinea Bissau, Equatorial Guinea and Comoros) with missing data for days with testing policy index ≥2 were included in the ‘below median’ level. We treated test data quality as binary, combining ‘no data’ and ‘basic data’ into the lower level (reference level) and ‘satisfactory data’ and ‘good data’ into the higher level. Univariable models and the best multivariable model were fitted using the same approach as for the timing of the first case. We then added the two stringency scores (AUC of stringency index in Supplementary Fig. 1y and stringency index when cumulative deaths reached 0.1 per 100,000 population in Supplementary Fig. 1z) to the best multivariable model and checked whether this significantly improved model fit (lower AICc). To do so, we first estimated the correlations between the two stringency scores and the set of selected predictors in the best multivariable model using Spearman’s rank correlation. We then re-ran the best multivariable model, adding each stringency score in turn. Again, only stringency scores with correlation coefficients less than 0.6 with the set of selected predictors were included in the multivariable model. We repeated this exercise for the three testing variables: we added days with testing policy index ≥2 (Supplementary Fig. 1u), test data quality (Supplementary Fig. 1w) and tests per capita (Supplementary Fig. 1x) to the best multivariable model for per capita mortality in the first wave and asked whether the result was consistent after adjusting for COVID-19 testing.
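Three of the quantities used in this analysis are simple to compute by hand, and the sketch below shows one way to do so. Several details are assumptions not stated in the text: the adjusted death count is left unrounded, the stringency AUC uses the trapezoidal rule, and the 95% CI for an RR is a Wald interval obtained by exponentiating the log-scale coefficient plus or minus 1.96 standard errors. All function names and inputs are hypothetical.

```python
from math import exp


def adjusted_deaths(reported_deaths, total_to_reported_ratio):
    """Reported deaths scaled by the ratio of total to reported mortality."""
    return reported_deaths * total_to_reported_ratio


def stringency_auc(days, index_values):
    """Area under the stringency index curve by the trapezoidal rule (assumed)."""
    area = 0.0
    for k in range(1, len(days)):
        width = days[k] - days[k - 1]
        area += width * (index_values[k] + index_values[k - 1]) / 2
    return area


def rate_ratio_with_ci(log_coefficient, standard_error, z=1.96):
    """RR and Wald-type 95% CI from a Poisson model coefficient on the log scale."""
    return (
        exp(log_coefficient),
        exp(log_coefficient - z * standard_error),
        exp(log_coefficient + z * standard_error),
    )


# Hypothetical inputs, not values from the study.
outcome = adjusted_deaths(100, 1.5)
auc = stringency_auc([0, 1, 3], [0.0, 10.0, 10.0])
rr, lower, upper = rate_ratio_with_ci(0.0, 0.1)
```

In the model itself, the log of population size would enter as the offset so that the exponentiated coefficients are interpretable as ratios of per capita mortality rates.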

We carried out a secondary analysis using the original set of predictors of COVID-19 mortality in the first wave to predict an outcome combining per capita mortality in the first wave and stringency index. In this analysis, countries were placed into four groups based on the medians of total per capita mortality in the first wave and of the AUC of stringency index (high stringency/high mortality, high stringency/low mortality, low stringency/high mortality and low stringency/low mortality). Multinomial logistic regression was used to estimate the relationship between these outcomes and the set of predictors, and the ORs and 95% CIs were calculated. Univariable models and the best multivariable model were fitted using the same approach as for the first wave mortality rate. Low stringency/low mortality was set as the reference level. COVID-19 readiness status and number of borders were excluded from the model because no country in the low/low level had adequate COVID-19 readiness status, and there was no island nation in the high/high level.
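The four-group outcome can be illustrated with a short Python sketch. How countries lying exactly on a median were assigned is not stated, so placing them in the ‘low’ category is an assumption; the country labels and values are hypothetical.

```python
from statistics import median


def classify_countries(mortality, stringency):
    """Assign each country to one of four stringency/mortality groups.

    Values strictly above the median are 'high'; values at or below it are
    'low' (the handling of ties is an assumption).
    """
    mortality_cut = median(mortality.values())
    stringency_cut = median(stringency.values())
    groups = {}
    for country in mortality:
        s_level = "high" if stringency[country] > stringency_cut else "low"
        m_level = "high" if mortality[country] > mortality_cut else "low"
        groups[country] = f"{s_level} stringency/{m_level} mortality"
    return groups


# Hypothetical per capita mortality and stringency AUC values.
toy_mortality = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
toy_stringency = {"A": 40.0, "B": 10.0, "C": 30.0, "D": 20.0}
groups = classify_countries(toy_mortality, toy_stringency)
```

The resulting group label would serve as the categorical outcome of the multinomial logistic regression, with ‘low stringency/low mortality’ as the reference level.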

For the second wave mortality rate analysis, we fitted only univariable models, using the same approach as for the first wave mortality rate. We dropped predictors related to travel and readiness, given that these pre-pandemic predictors cannot represent the baseline level at the start of the second wave. We added per capita mortality in the first wave (Fig. 1c) and testing policy index on 1 November 2020 (Supplementary Fig. 1v) as two new predictors. Testing policy index on 1 November 2020 entered the model as a binary predictor where ‘below 2’ was set as the reference level. AUC of stringency in the first wave (Supplementary Fig. 1y), test data quality in the first wave (Supplementary Fig. 1w) and tests per capita as of 31 December 2020 (Supplementary Fig. 1x) were each considered in turn as predictors of second wave mortality rate.

R version 3.6.3 (R Foundation for Statistical Computing) was used in all statistical analyses. R packages used for model fitting included survival, lme4, nnet and mgcv. A two-sided P value < 0.05 was regarded as statistically significant. The raw African shapefile used in the study was obtained from Data and Maps for ArcGIS (formerly Esri Data & Maps, https://www.arcgis.com/home/group.html?id=24838c2d95e14dd18c25e9bad55a7f82#overview) (see the permission for use in Supplementary Table 8). Further information on predictors42,43,44,45,46,47,48,49,50,51,52,53 is given as Supplementary Information.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.