Introduction

In April 2020, the European Commission (EC) asked European Mobile Network Operators (MNOs) to share fully anonymised and aggregated mobility data in order to support the fight against COVID-19 (European 2020, 2020) with data driven evidence.

The value of mobile positioning personal data to describe human mobility has been already explored in literature (Csáji et al. 2013) and its potential in epidemiology demonstrated (Wesolowski et al. 2012; Jia et al. 2020; Wu et al. 2020; Kraemer et al. 2020). The new initiative between the Commission and the European MNOs relies on the effectiveness of using fully anonymised and aggregated mobile positioning data in compliance with ‘Guidelines on the use of location data and contact tracing tools in the context of the COVID-19 outbreak’ by the European Data Protection Board (EDPB 2020).

This work makes use of mobile data to introduce a new concept of ‘functional areas’ called Mobility Functional Areas or MFAs. The concept of functional areas has a long tradition in settlement geography, urban planning and policy making (Ball 1980; Van der Laan 1998; Casado-Díaz 2000; OECD 2002; Andersen 2002; Eurostat 2016; Gabrielli et al. 2018; Dijkstra et al. 2019). The idea behind functional areas is the identification of a network of aggregated inbound and outbound movements across spatial structures for a given time scale (for example, daily, intra-weekly, seasonally, etc) according to the scopes of their use.

Thanks to the above mentioned unprecedented collaboration between the European Commission and European Mobile Network Operators during the COVID-19 ongoing pandemic, it has been possible to define the MFAs for 15 European countries (14 member states of the European Union: Austria, Belgium, Bulgaria, Czechia, Denmark, Estonia, Spain, Finland, France, Greece, Croatia, Italy, Sweden, Slovenia, plus Norway), although in this study we will focus on the case of Austria.

Despite the fact that mobility data alone cannot predict future needs (Bwambale et al. 2020), they can show already compelling citizens needs, like transportation or health care facility allocation needs. Moreover, thanks to the capability of collecting mobile data at very high time frequency and space granularity and backward in time, the time evolution of the MFA can indeed show changes or ongoing trends or help to design policies or measure the effectiveness of policies.

This work is organised as follows. “Functional areas and policy making” section is a brief summary of the different approach to the definition of functional areas and explains the importance of this concept in the field of policy making. “Mobile positioning data” section describes the essential characteristics of the mobile data used in the analysis. “Mobility functional areas” section explains in details the method of construction of the MFAs starting from mobility data. As the MFAs are constructed on a daily basis and slightly change daily due to, e.g., intra-weekly human mobility patterns, “Detecting the persistent MFAs” section describes a denoising method to extract the subset of areas which are consistently connected stable through time. In this respect, the pre-lockdown (or persistent) and lockdown MFAs generated by COVID-19 mobility restriction measures are also discussed. “Potential applications of mobility functional areas” section presents several potential applications of the concept of Mobility Functional Areas, including transport system, socio-economic analysis, public health, urban planning and environmental risk analysis. To show the effectiveness of the MFAs concept to policy making, “MFAs and COVID-19 cases during the first wave” section focuses on a public health application. It analyses the impact of the MFAs on the spread of the coronavirus during the first wave of the pandemic in Austria showing that the number of COVID-19 cases increases more within the MFAs than outside. “Conclusions” section discusses limits and further potential usage of MFAs. The MFAs dataset is openly available to other scholars.

Functional areas and policy making

As mentioned, the concept of functional areas has a long tradition in settlement geography, urban planning and policy making. These areas have and can be defined in several ways and for different purposes e.g. transportation planning, urban governance and regional development among others. Our proposed concept of highly interconnected geographic zones, the MFAs, are purely data driven were originally developed to fight the COVID-19 pandemic in Europe. Since MFAs are data-driven products generated by natural human mobility, they can be used for several other applications (see “Potential applications of mobility functional areas” section). A few pre-existing variants of the same concept are the following:

  • ‘Commuting regions’ the identification of relatively closed regions of daily moves of residing population based on commuting data from censuses (Casado-Díaz 2000; Van der Laan 1998). The aim of the commuting regions is to define geographical areas within which the majority of the work to home travels occur. The definition of commuting regions is particularly relevant for housing and transportation planning purposes.

  • ‘Functional regions’ a tool used to target areas of specific national and European policies (OECD 2002). There are several natural areas of application of functional regions including employment and transportation policies, environmentally sustainable spatial forms, reforms of administrative regions, strategic level of urban and regional planning and a wide range of geographical analyses (migration, regionalisation, settlement system hierarchisation) (Andersen 2002; Ball 1980; Casado-Díaz 2000; Van der Laan 1998).

  • ‘Functional urban areas’ cities with their commuting zone (Eurostat 2016; Dijkstra et al. 2019). They are generally identified by a densely inhabited city, together with a less densely populated commuting zone whose labour market is highly integrated with that of the city. They reflect the functional and economic extent of cities and not just their administrative extent. The Functional Urban areas are thus the appropriate spatial level for a more comprehensive urban governance and planning.

  • ‘Overlapping functional regions’ by Killer and Axhausen (2010). The overlapping functional regions reflect the complexity of commuting flows. One area can be linked to several others and thus could belong to multiple functional areas.

More in general, there exists a long tradition in defining zones related to human mobility. In their extensive review Patterson and Farber (2015) summarize more than 61 different ways of defining Potential Path Areas (PPAs) and Activity Spaces (ASs). The PPAs refer to ’the spatial extent of where individuals can participate in activities subject to constraints’ (e.g., time, modal availability, etc); ASs refer to locations ‘with which individuals have direct contact as the result of day-to-day activities’. Our MFAs concept, by design, belongs to the ASs strand of research, though MFAs can be potentially used also in the context of PPAs using a probabilistic setup, but this goes beyond the scope of the present study. A further strand of research is that of Transportation Analysis Zones (TAZs), which is generally (though not exclusively) aimed at the optimization of the transport network at suburban level using very high resolution data and geographical/administrative constraints. Unfortunately, the types of aggregated data of this project do not allow this specific application, though high resolution MNO data (XDR or CDR data) may fit the purpose using the same technique applied in this study. As highlighted in Patterson and Farber (2015), most of these methods have in common some types of calculation like, e.g., ellipse methods (Standard Deviation Ellipses, Travel Probability Fields, etc.), Kernel methods (Kernel Density estimation), network analysis, location methods (specific locations at which activities are performed, for example, shopping malls or trip destinations, Points of interests), Minimum Convex-hull Polygons (MCP), etc. Our MFAs rely on the network analysis approach.

Back to the concept of functional areas, the most common data sources for the above-mentioned studies are by far the population censuses and ad hoc pilot surveys. In these approaches travel behaviour is captured collecting costly household travel survey data, which are usually very small in terms of sample size, rarely updated and sometimes prone to reporting errors (Bwambale et al. 2020).

Mobile data have been used in the past in some pilot-studies like in Estonia (Novak et al. 2013) or in some regions of Italy in Gabrielli et al. (2018). An attempt to mix survey, census and mobile data has been proposed as a proof-of-concept by Bwambale et al. (2020) for the case of Bangladesh.

Very recently Facebook released the concept of Commuting ZonesFootnote 1 (CZs) within their data For Good project. This CZs make use of data coming from the portion of Facebook’s user base who opt-in to share their location in the mobile app. The construction of the CZs is also based on a network analysis approach.

The approach used to build the MFAs presented in the next section is special in several ways. Indeed, the approach has been designed to be robust to the diversified data sources available for the 15 countries and also in that it tries to take into account persistent patterns of mobility rather than focusing on a single period as we know that mobility patterns may change due to seasonalities or because there is a strong intra-weekly periodicity.Footnote 2 This was possible thanks to the long time series made available through the unprecedented agreement.

As for the other functional areas that make use of mobility data, the MFAs are particularly interesting as the underlying data can be collected at high frequency compared to any survey data, and also covering the whole territory of a country rather than focusing on special cities or routes. Remark also that, since mobile phone services unique subscribersFootnote 3 represent about 65% of the population across Europe (GSMA 2020), mobile data can reliably be used to capture the aggregate mobility patterns.

In a policy making framework especially related, but not limited, to the COVID-19 pandemic, the insights resulting from the identification of these new mobility areas may help governments and authorities at various levels:

  1. (a)

    to control virus spread by limiting non-essential movements outwards an MFA with a high virus infection rate compared to its neighbour MFAs especially in the initial phase of a virus outbreak;

  2. (b)

    to apply targeted and local physical distancing policies in different MFAs, according to their specific epidemiological situation, thus minimizing virus spread and at the same time limiting the economic and social impact of such measures by avoiding not adequate national and universal physical distancing policies;

In the absence of any other information, most of the governments are forced to use administrative areas, such as regions, provinces and municipalities to impose physical distancing measures and mobility restrictions. Nevertheless, administrative boundaries are static and do not reflect actual mobility. On the other hand, both the potential spreading of the virus and the territorial economy strongly depend on local mobility (Iacus et al. 2020).

Although not all of these aspects can be taken into account in this work, the hypothesis is that the implementation of different physical distancing strategies (such as school closures or other human mobility limitations) based on MFA instead of administrative borders might lead to a better balance between the expected positive effect on public health and the negative socio-economic fallout for the country. To this aim, ”MFAs and COVID-19 cases during the first wave” section will present a the case study of Austria focusing on the impact of MFAs on the spread of the coronavirus.

Mobile positioning data

The agreement between the European Commission and the MNOs defines the basic characteristics of fully anonymised and aggregate data to be shared with the Commission’s Joint Research Centre (JRC.Footnote 4) This section briefly describes the original mobile positioning data from the MNOs; the following section introduces the mobility indicator derived by JRC and used in this research.

Data from MNOs are provided to JRC in the form of Origin-Destination-Matrices (ODMs) (Mamei et al. 2019; Fekih et al. 2020). Each cell [ij] of the ODM shows the overall number of ‘movements’ (also referred to as ‘trips’ or ‘visits’) that have been recorded from the origin geographical reference area i to the destination geographical reference area j over the reference period. In general, an ODM is structured as a table showing:

  • reference period (date and, if available, time);

  • area of origin;

  • area of destination;

  • count of movements.

Despite the fact that the ODMs provided by different MNOs have similar structure, they are often very heterogeneous. Their differences can be due to the methodology applied to count the movements, to the spatial granularity or to the time coverage. Nevertheless, each ODM is consistent over time and relative changes are possible to be estimated.

Mobility functional areas

The construction of the MFAs starts from the ODM at the highest spatial granularity available. This project handled data with different resolution: municipalities, postal code areas, census areas, and different type of grid-level areas. Table 5 in the “Appendix” reports in details the characteristics of the different sets of data used for to calculate the MFAs. The methodology to derive the MFAs is the same for all the 15 countries for which the data are available, though in this study we will focus on Austria only. As it will be explained below, the proposed method is simple with the aim of being robust to the different data specification. Indeed, even for those countries for which the data are made at disposal by more than one MNO, the final MFAs are quite similar. The analysis in this study considers only the mobility from Monday to Friday (with the exclusion of festivities) and this is because weekdays are more relevant for, e.g., transportation system and socio-economic analysis, though in practice the construction of the MFAs can be done for holidays and weekend days with exactly the same steps. The construction of the MFAs starts from origin-destination matrix. Let \(ODM_{d,i,j}\) an element of the ODM matrix for date d representing the number of movements from area i to area j, \(i,j=1, \ldots , n\)

$$\begin{aligned} ODM^*_{d,i,j} = \frac{ODM_{d,i,j}}{\sum \nolimits _{j=1}^n ODM_{d,i,j}}, \quad i,j, = 1,\ldots , n. \end{aligned}$$

where n is the total number of rows and columns of the ODM (which is a \(n\times n\) matrix) and let \(ODM^*_{d,i,j}\) the corresponding element of the ODM normalised by row. Now we transform the \(ODM^*_d\) matrix into a 0/1 proximity matrix \(P_d\) as follows

$$\begin{aligned} P_{d,i,j} = {\left\{ \begin{array}{ll} 1,&{} \quad ODM^*_{d,i,j} > \text {threshold},\\ 0,&{} \quad \text {otherwise}, \end{array}\right. } \end{aligned}$$

where the threshold has been set to 15% and 30% to derive two types of MFAs, that we will call regular or strict respectively. The threshold of 15% was suggested by Novak et al. (2013) but also used in the functional urban areas concept developed by Eurostat (2016) and Dijkstra et al. (2019). This threshold is a tuning parameter that characterizes the final shapes of the the MFAs. We tested different thresholds above and below 15% including a uniform distribution threshold (i.e., the threshold is set dynamically as the value corresponding to the frequency of a uniform distribution of movements from a given origin to all destinations from this origin that we see in a row of the ODM). Empirically, we can confirm that the 15% seems to be the most effective in isolating stable MFAs for all the countries analysed. It is important to remark that our MFAs are not guided by a theoretical approach as they are completely data driven. Other approaches could have imposed an objective function, like, a minimal number of movements, the maximal length, etc and run an optimization algorithm on top of all the steps presented below to find the optimal threshold. So in this study 15% and 30% are informed a priori choices. But the main argument on fixing a threshold is that we want to isolate directional movements rather ellipse-like areas. Indeed, the higher the threshold the more the MFAs is concentrated along specific directions. One further remark is that by using relative number of movements, we do not discriminate MFAs based on their absolute volume of movements/population. This is an information that can be checked ex-post.

As the ODM matrix is not symmetric, so is the proximity matrix, which is transformed into an adjacency matrix A through the following expression:

$$\begin{aligned} A_d= \frac{1}{2} \cdot \left( P_d + P_d^T\right) \end{aligned}$$

so that each element of \(A_d\) can take only three values:

  • \(A_{d,i,j}=0\) if there are no movements from i to j and viceversa (i.e. the two areas are not connected);

  • \(A_{d,i,j}=0.5\) if there are movements only in one direction, either i to j or j to i.

  • \(A_{d,i,j}=1\) if there are movements in both directions, from i to j and from j to i;

From the adjacency matrix we construct a directedFootnote 5 graph where the vertex represent the areas \(i=1, \ldots , n\) and the edges are weighted according to the matrix A. The MFAs are calculated using a community detection technique called walktrap algorithm (Pons and Latapy 2006), which finds communities through a series of short random walks. This approach is different from the intramax algorithm used in Killer and Axhausen (2010); Novak et al. (2013). The idea is that these random walks tend to stay within the same community. The goal of the walktrap algorithm is indeed to identify the partition of a graph that maximises its modularity. The modularity of a graph is the index designed to measure the strength of division of a network into modules (also called groups, clusters or communities). This is why the walktrap algorithm is well suited for finding clusters of fully interconnected cells where most of the movements are internal.

All the communities with only one member, i.e. those without inbound and outbound movements over the 15% (or 30%) threshold, are collapsed into a single big fictitious area representing the territory that either cannot be identified as a pure MFA, or it is just a collection of atomic (mobility-wise) areas. As this algorithm is data-driven, the number of MFAs is not prescribed in advance but is the outcome of the algorithm itself. For example, for Austria we have more than 200 MFAs per day, some of them are very small, others are quite large. In all events, their shapes depend only on the mobility itself.

Mobility patterns and mobility functional areas

It is well known and expected that mobility changes between weekdays, weekends and holidays, but there might be also an internal variability within the working week as well as across weeks (e.g., not all Mondays are exactly the same in terms of mobility). Therefore, the MFAs might be slightly different from day to day. In order to denoise the MFAs we will first measure how much the MFAs are stable and consistent through time through a similarity index.

We make use of the following similarity index (Gravilov et al. 2000) between two sets of groups of labelled \(G=\{G_1, \ldots , G_K\}\) and \(G' = \{G_1', \ldots , G_{K'}'\}\), where K and \(K'\) are not necessarily equal. For example, when two cluster algorithms are applied to the same set of n observations, it might happen that algorithm 1 produces K groups and algorithm 2 produces \(K'\) groups. In this cluster analysis notation, each \(G_i\) (or \(G_j'\)) is a subset of indexes corresponding to observations that fall in group i (respectively j). In our case, \(G_i\) is the subset of areas id’s corresponding to the ith MFA. The similarity index is defined as

$$\begin{aligned} \text {Sim}(G,G') = \frac{1}{K} \sum _{i=1}^k \left\{ \max \limits _{j=1, \ldots ,K'} \text {sim}(G_i,G_j')\right\} \end{aligned}$$
(1)

where

$$\begin{aligned} \text {sim}(G_i,G_j') = 2 \frac{|G_i \cap G_j'|}{|G_i| + |G_j'|} \in [0,1],\quad i=1,\ldots ,K, \ j=1, \ldots , K'. \end{aligned}$$

with |B| the number of elements in set B.

The similarity index is such that \(\text {Sim}(G,G') \in [0,1]\) but it is not symmetric, therefore in order to have a symmetric measure we consider

$$\begin{aligned} \overline{\text {Sim}}(G,G') = \frac{1}{2} \left( \text {Sim}(G,G') +\text {Sim}(G',G') \right) . \end{aligned}$$

Figure 1 is a heatmap representation of the matrix of similarity index among all MFAs for the period 1 February 2020 - 29 June 2020. Darkest-bluish zones are most different, whereas lighter-reddish are the most similar. It is interesting to observe the difference between before and after the lockdown (14 March 2020), then a slow recovery to normality. It is also worth noticing that holidays and weekends have clearly different mobility patterns than weekdays and that these MFAs are different from the administrative borders (Austrian districts), especially during weekdays.

Fig. 1
figure 1

Similarity index matrix of all MFAs in Austria. Period: 1 February 2020–29 June 2020. It is clear that the persistent (pre-lockdown) MFAs are different from the lockdown MFAs and that slowly, the mobility is going back to normality by the end of June 2020. Lockdown was enforced in Tyrol on 13 March 2020 and nationwide on the 16th of March 2020

To put in evidence the intra weekly patterns in Fig. 2 we calculate the similarity of the MFAs for the same day of the week. We start from the first Monday of the data and we calculate the similarity of it with all the subsequent Mondays. Similarly for the Tuesdays, etc. We also calculate similarity of the daily MFAs with the administrative borders. What clearly emerges from Fig. 2 is that each day of the week has an almost stable pattern before and after the lockdown and in this sense the MFAs are similar among them for the same day of the week. Further, we can notice that the similarity of the MFAs with the administrative borders is quite low, meaning that indeed MFAs do not coincide with the latter. The effect of lockdown is also seen in this curve.

Fig. 2
figure 2

Intra weekly similarity (Mondays vs. first Mondays, etc) of daily MFAs and with respect to the Austrian districts (daily MFAs with respect to administrative borders). Lockdown was enforced in Tyrol on 13 March 2020 and nationwide on the 16th of March 2020

Detecting the persistent MFAs

As seen in the previous section, the MFAs have daily patterns, they change between before and after the lockdown is in force and tend to go back to their original shapes after the ease of containment measures. Moreover, MFAs shows time-variability also for the same weekday; thus, in order to fully exploit their potential, a stable version of the MFAs needs to be identified. Since the number of MFAs changes day by day and the same area may move from an MFA to another (changing the MFA label associated to it), we apply a CO-association method. The CO-association method (CO) avoids the label correspondence problem. It does so by mapping the ensemble members onto a new representation where the similarity matrix is calculated between a pair of objects in terms of how many times a particular pair is clustered together in all ensemble members (Fred and Jain 2005). In other words, CO calculates the percentage of agreement between ensemble members in which a given pair of objects is placed in the same MFA.

As the first lockdown in Austria was enforced on 16 March 2020, we focus on the weekdays from 1 February 2020 to 15 March 2020.

Let d be the data of one of these D weekdays and \(\text {MFA}_d\) the set of mobility functional areas obtained on day d. We then evaluate the co-association matrix

$$\begin{aligned} CO(x_i,x_j) = \frac{1}{D}\sum _{d=1}^D \delta \left( MFA_d(x_i), MFA_d(x_j)\right) \end{aligned}$$

where \(X_i\) and \(x_j\) are the areas and \(MFA_d\) is the set of MFAs for day \(d=1, \ldots , D\) and \(\delta (\cdot ,\cdot )\) is defined as follows:

$$\begin{aligned} \delta (u,v)={\left\{ \begin{array}{ll} 1,&{} \text {if } u \text { and } v \text { belong to the same MFA}, \\ 0,&{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Then, as our scope is to obtain a persistent version of the MFAs, we further threshold the CO matrix so that all entries below 50% are set to 0 and those higher or equal 50% are set to 1 (it means that only cells falling in the same MFA at least 50% of the times are associated with that MFA) leading to a new matrix \(\overline{CO}\).

Then again a directed graph is built with this matrix using the entries of the \(\overline{CO}\) matrix to weight the edges and applying the walktrap algorithm to obtain the final persistent MFA. The same procedure is replicated for the lockdown dates, ending up with a different set of stable MFAs that we denote by lockdown-MFA. With these two sets of MFAs at hand, we further test if they are meaningful to the analysis. It turns out that these stable MFA are in fact reasonably well defined.

We then apply the symmetric similarity index \(\overline{\text {Sim}}(\cdot ,\cdot )\) for all the daily MFAs against the persistent MFAs, the lockdown MFA and the districts (NUTS3). Figure 3.

Fig. 3
figure 3

Persistent and lockdown MFAs for Austria compared to NUTS3 districts (blue lines). Whereas before the lockdown the MFAs extend across provinces, after the lockdown their area generally reduces and they mostly lay within districts’ borders (with some exceptions) or they completely disappear as all mobility remains within individual areas. Areas where connectivity is below 15% (no apparent stable direction is observed in the mobility flows) are transparent. The color scale is only meant to distinguish the MFAs. Colors of the left and right maps are different as the number of MFAs is different. (Color figure online)

In Fig. 4 we calculate the similarity index between the daily MFAs against the lockdown MFAs and, against the districts. It quite evident the abrupt change in the shape of the MFAs on March 16h which corresponds to the stay-at-home nation wide order imposed by the Austrian authorities.

Fig. 4
figure 4

Similarity of daily MFAs (orange line), lockdown MFAs (green line) and Districts (blue line) with respect to persistent MFAs. Daily MFAs are very similar to the persistent MFAs before lockdown which was enforced on 16 March 2020, whereas they are more similar to lockdown MFAs after. A return to normality is slowly appearing. Once again, the MFAs are quite different from administrative borders. (Color figure online)

Potential applications of mobility functional areas

The MFAs, originally developed in the framework of the non-medical interventions to the fight the spread of COVID-19 in Europe, have many other fields of application. In this section, we briefly take an overview of some of these fields.

Transport system

As said, the MFAs are data-driven products generated by natural human mobility, which is in fact strictly related to available local transport system; this is indeed a bi-directional relationship. At the same time, the MFAs can inform about possible new directions in which the transport system can be enhanced. Another interesting aspect of the MFAs is that, especially around big cities, they are not necessarily made of contiguous zones; although this is an artifact, or better a feature of the MFAs due to how they have been designed, starting form Origin-Destination Matrices (ODMs). This does not allow showing that when going from \(\mathbf{a}\) to \(\mathbf{c}\) one can stop by a bar to take a coffee in \(\mathbf{b}\). For example, in Fig. 5 we can see some typical stylized facts of MFAs:

  • MFAs in some cases follow already existing infrastructures, like highways;

  • MFAs are not contiguous;

  • MFAs are not all of the same size and this is clearly related to the relative importance of the geographical zones involved.

Fig. 5
figure 5

The persistent MFAs around Vienna before (top) and during the lockdown (bottom). Colors are randomly assigned to the MFAs, without any relation between top and bottom maps. The largest persistent MFA around Vienna (top map), marked with a bold gray border, is made of non contiguous zones and follows the main highways passing through the capital. This MFA is highly shrunk during lockdown (bottom map). On the other hand, some of the MFAs merge during lockdown and there are no MFAs made by non contiguous zones. This seems indicating that movements are likely to be more local during the lockdown. Background map Open Street Map: Road network in orange; Country boundaries in grey; water surfaces in light blue. (Color figure online)

The largest MFA in the top map of Fig. 5 is clearly made of contiguous and not-contiguous areas and its development follows the the most important highway crossing the capital. These zones, which are highly interconnected by definition of MFA, represent important highlights to eventually improve the transport system accordingly. The bottom map in Fig. 5 shows how the MFAs have shrunk during the lockdown.

Following the same approach used for the Vienna example, but outside the pandemic context, MFAs calculated over two different time periods and compared with each other can also help monitoring the actual usage of transport infrastructures in order to support their management. Moreover, by analysing daily MFAs, local administrations can monitor how mobility changes in occasion of major mass-gathering events, learning how to plan local mobility restrictions and alternative paths.

Socio-economic impact of the pandemic

By comparing the two maps in Fig. 5, it is evident how MFAs can show major changes in human mobility patterns. Eventually, with MFAs built starting from higher spatial-resolution data, it would have been also possible to asses the socio-economic impact of the lockdown measures at urban level. As an example, one would have been be able to identify neighborhoods where the residential population kept moving for working-related reasons (because they could not telework or they were essential workers), despite the mobility restrictions and the risk of being infected. Another example is the ability to monitor whether or not key commercial areas (such as malls, shopping hubs, etc.) had remained connected in terms of mobility with the high-population-density neighborhoods.

Public health

The MFAs were originally designed to support the fight to the spread of COVID-19 in Europe. Their role in the public health sector, specifically during a pandemic, is therefore evident and is further developed in “MFAs and COVID-19 cases during the first wave” section. Beside their employment in an emergency context, MFAs could be also exploited to plan and monitor the accessibility of local health units, main hospitals and any other type of health facility.

Urban planning

The persistent MFAs can help authorities and spatial planning agencies to take more evidence-based and informed decisions. Sustainable spatial planning and spatial management requires the identification of the optimal location for both private and public facilities and services that minimize transportation needs while maximizing service availability. For example, MFAs can assist spatial planners and developers on identifying the optimal location of new housing developments, schools and sports facilities.

Environmental risk and pollution

Since the MFAs show human mobility, when it comes to environmental monitoring and risk management, they can be profitably used to design the displacement of monitoring stations, in order to measure those pollutant associated to road-mobility. Analogously, MFAs could support the planning of new waste-to-energy plants, wastewater treatment plants biomass plants, etc, outside the main human-mobility areas.

Demography and migration

MFAs can be used to identify disadvantaged/isolated areas which are not well connected to economically developed MFAs. These areas are more likely to experience an aging population as well as a depopulation. Besides, without any administrative intervention, these areas are likely to be less attractive to international migrants (Aurambout et al. 2021).

MFAs and COVID-19 cases during the first wave

In this section, we analyse the role of MFAs in the contest of the spreading of COVID-19. The objective is testing the hypothesis that MFAs are correlated to the evolution of the number of cases until strict measures to limit mobility are enforced. For this reason, we focus on the period between 15 March 2020 and 30 April 2020, monitoring the number of cases in the areas laying within or outside the persistent MFAs. It is worth noticing that the actual timeline of the measures taken is the following:Footnote 6 (a) On 13 March 2020 the entire valley surrounding Ischgl was in lockdown; (b) On 15 March the state of Tyrol was in lockdown; (c) From 16 March a nationwide lockdown was implemented in Austria. Thus given the limited period of time that local measures were in place we do not expect any endogeneity issues in the statistical analysis that follows. Figure 6 shows the relative change in mobility for Austria during the period 1 February 2020–30 June 2020.

Fig. 6
figure 6

Relative mobility change in Austria during the period 1 February 2020–30 June 2020

It is known in the literature that it takes about 14 days (Iacus et al. 2020) to see the decrease in the number of cases due to a lockdown, so we should expect a peak around the 30th of March and then a decrease of cases as indeed Fig. 11 shows.

In Austria we have 222 persistent MFAs (plus the residual territory) built on 4789 areas identified by the MNO, while the number of political districts (GKZ) is 94. The GKZ do not match with the NUTS3 districts used in “Mobility functional areas” and “Detecting the persistent MFAs” sections and these are the geographical units for which epidemiological data is available. Of the 4789 zones, 3409 belong to a proper MFA and 1380 are in the residual territory of Austria. Not all the MFAs are big enough to be relevant either because their population is small or because their territorial extent is very limited. Figure 7 shows the distribution of MFAs by population size and Table 1 shows the corresponding intersecting districts. As it can be seen, in all but one case, these MFAs contain the territory of Wien. Table 2 reports the extension of the MFAs in km2. Using high resolution Facebook population data,Footnote 7 we reconstruct the population of each of 4789 the zones. Figure 8 shows the shapes of the MFAs compared to those of the political districts for which health data are available and also the estimated population for each zone of the MFAs.

Table 1 The MFAs with more than 20000 inhabitants
Table 2 Distribution of the areas of the MFAs in km2 before (persistent) and during the lockdown
Fig. 7
figure 7

Distribution of population by MFA. Only 5 MFAs are large, i.e. more than 20,000 inhabitants, in population size and all but one overlaps with Wien district (see also Table 1)

Fig. 8
figure 8

MFAs compared to GKZs borders (left) and the estimated population on the MFAs (right). The MFAs also cross the administrative borders on which the health data are available

It is also worth to mention that the daily density of movementsFootnote 8 in the residual MFA is equal to 0.00012, while on average the other MFAs have a density of 15.016 (Q1 = 4.065, median = 8.100, Q3 = 13.680, max = 294.111), meaning that most of the mobility is observed in the identified MFAs. Notice further that the residual MFA is made of 1553 cells of the grid, while on average an MFA is made of 14 cells with a maximum of 279 (Q1=3, Q3=15).

As said, the COVID-19 health data for Austria are available only at political district level from the open data repositoryFootnote 9 of the Federal Ministry of social Affairs, Health, Care and Consumer Protection.Footnote 10 As it is know that the number of infection cases is proportional to the population size, we redistribute the number of COVID-19 cases on the 4789 zones proportionally to the population size of the zones within each district. Figure 10 shows the number of cases in the last seven days for four equally spaced dates from 15 March to 30 April 2020. These maps seem to confirm qualitatively the role of the MFAs in the spread of the virus. To assess this qualitative evidence with a quantitative index, prior to the statistical analysis that will follow, we can use the index of determination (Pearson’s correlation ratio) \(\eta ^2_{Y|G}\) (Fisher 1926) which measures the dependency of a continuous variable Y in terms of a categorical variable, usually a grouping variable. The index is based on the decomposition of the total variance of Y into the sum of the variance within the groups (\(\sigma ^2_W\)) and the variance between the groups (\(\sigma ^2_B\)):

$$\begin{aligned} \sigma ^2 = \sigma ^2_W + \sigma ^2_B =\frac{1}{N} \sum _{j=1}^K n_j\sigma ^2_j + \frac{1}{N} \sum _{j=1}^K (\bar{x}_j - \bar{x}_N)^2 n_k \end{aligned}$$

where \(\bar{x}_j\) and \(\sigma ^2_j\) are respectively the mean and the variance of group j, \(\bar{x}_N\) is the global mean, and \(N=n_1+n_2+\cdots n_k\), is the total number of observations. The index of determination is given by the formula

$$\begin{aligned} \eta ^2_{Y|G} = \frac{\sigma ^2_B}{\sigma ^2} \quad \in [0,1]. \end{aligned}$$
(2)

When the within variance is small it means that the grouping variable creates homogeneous groups in terms of the variability of Y, in which case \(\eta ^2_{Y|G}\) reaches an high value. To compare the effectiveness of the MFAs to capture the dynamics of the pandemic, we calculate the \(\eta ^2_{Y|G}\) index using the MFAs or GKZs as grouping variables, and we take the sum of number of cases in the past seven days in each MFA or GKZ as response variable Y. The results are given in Table 3 and plotted in Fig. 9. The results show that most of the time the value of \(\eta ^2_{Y|MFA}\ge \eta ^2_{Y|GKZ}\) from 15 March 2020 till 20 April 2020, then the reverse is true. This is expected and confirmed by later analysis, because the impact of the MFA can be present only up to few weeks after national lockdown are put in place.

Fig. 9
figure 9

The index \(\eta ^2_{Y|G}\) by calendar week by group G = MFA or G = GKZ. The higher the value of \(\eta ^2_{Y|G}\), the more Y is determined by G. Notice that \(\eta ^2_{Y|MFA}\ge \eta ^2_{Y|GKZ}\) till 20 April 2020, then the reverse is true. This is expected because few weeks after the national lockdown, the impact of the MFA does no longer appear as the mobility has been reduced or stopped completely

Table 3 The index \(\eta ^2_{Y|G}\) by calendar week by group G = MFA or G = GKZ

We now test our hypothesis in two ways. First of all we consider a simple linear model for the log number of cases in the last 7 days and the indicator function \(\mathtt mfaInd\) which takes value equal to 1 if the zone belongs to a MFA and zero otherwise. We also control for the \(\mathtt population\) size of the zone. The model is as follows

$$\begin{aligned} y_{i,t} = \alpha + \beta \cdot \mathtt{mfaInd}_i + \gamma \cdot \mathtt{population}_i, \end{aligned}$$
(3)

for t = 15 March 2020, \(\ldots \), 30 April 2020, \(y_{i,t}\) is the logarithm of the number of COVID-19 cases in the last 7 days for the zone i, \(i=1, \ldots , 4789\).

Fig. 10
figure 10

The number of cases in past 7 days for the dates 15 and 31 March, 15 and 30 April 2020 on the MFAs (left) and on the GKZs (right). It looks like the COVID-19 spread seems to follow more closely the MFAs borders than the GKZ borders

Figure 11 shows the sign of the standardizedFootnote 11 coefficient of the \(\mathtt mfaInd\) as well as those of \(\mathtt population\) and the constant of the model, along with the corresponding adjusted \(R^2\) goodness of fit index for the model. Thanks to the standardization, we can see that, although the population size has the main effect, the belonging to an MFAs is also always significant (the plots shows only coefficients whose p-values are less than 0.01). We use the number of total cases in the past 7 days in order to take into account lagged effects of mobility on the evolution of the pandemic.

It turns out that the MFA indicator function has always a positive coefficient \(\beta \) with p-value less that 0.001, i.e., always statistically significant. Moreover, the overall fit of the model is acceptable being in the range 0.4-0.5 most of the times during the peak of the epidemic. The fact that towards the end of the period the MFA indicator variable has a coefficient decreasing to zero is due to the fact that the mobility pattern changed due to national lockdown measures (see also Fig. 6) and thus the MFA are quite different.

Fig. 11
figure 11

Left to right: the number of cases in past 7 days, the value of the coefficients \(\alpha \), \(\beta \) and \(\gamma \) in model (3) and the adjusted \(R^2\) of the estimated model (3) for each date. Coefficients are always significant at level 0.01

Model (3) is capturing an average effect of MFA impact on the number of COVID-19 cases. To disentangle the impact of each MFA on the pandemic, we use a random effect linear model, also known as mixed effect model (Bates and DebRoy 2004) or hierarchical model (Gelman and Hill 2006). In the form we use it is just a regression with clustered errors that depend on a grouping variable. In our case, the dependent variable is again \(y_{i,t}\) but we replace in (3) the indicator function \(\mathtt mfaInd\) with the MFA categorical variable to obtain Eq. (4) below. The MFA variable has 223 different values, so it implies 222 new dummy variables in the model, most of them correlated, and for which there are not much observations per groups as some MFA are made of two or few more zones. To solve this problem we apply the random effect model as implemented in Bates et al. (2015). In practice, we cluster the observations in groups identified by the political districts (\(\mathtt GKZ\)). Exploiting the correlation between the GKZs and the MFAs it is possible to estimate all the variances of the MFA dummy variables and hence test the significance of each coefficient. The random effect model can be written in this form, where “\(\, | \,\mathtt{GKZ}\)” means conditionally to in the sense of Bates et al. (2015):

$$\begin{aligned} y_{i,t} = \alpha + \beta \cdot \mathtt{mfa}_i + \gamma \cdot \mathtt{population}_i \, | \,\mathtt{GKZ}, \end{aligned}$$
(4)
Fig. 12
figure 12

Left to right: coefficients of the MFA categorical variable for different dates. Only the coefficients which are statistically different from zero are shown. The only notable MFA is the MFA number 181, for which the sign of the coefficient is negative. The colors of the points are not relevant. (Color figure online)

Figure 12 shows the estimated coefficients, with their respective confidence intervals, for the same four dates of Fig. 10. Only the coefficients which are significant at 0.01 are shown. It can be noticed a couple of things: (1) the number of MFAs retained changes through time though those MFAs that remain have a positive coefficient, meaning that being in those particular MFAs increases the number of cases. (2) it is possible to see that the only MFA which has a negative coefficient if the number 181 but this happens at the end of the period of analysis.

A dynamical view which involves all significant MFAs during the whole period of the analysis is presented in Fig. 13 which shows the heatmap of the corresponding MFAs coefficients which are statistically significant at 0.01 level in all the dates analyzed. It is again clear that the only special MFA is the number 181 that becomes significant and negative starting from 17 April 2020 when all the other MFAs are no longer significant. But around 15 April, the number of cases is also decreasing very quickly towards zero, meaning that there is essentially no signal in the data, i.e., there are no longer cases to analyze.

This MFA number 181 is made of zones in the Wien Stad and Sankt Pölten Land, a small district on the west border to the city of Vienna. The conclusion here seems to be that when the number of cases goes down due to countermeasures (lockdowns, curfews, etc) the impact of the persistent (pre-lockdown) MFAs is clearly vanishing, while until the lockdown, the MFAs are largely driving the pandemic. The negative impact related to the MFA number 181 is mainly due to the lack of cases for all the other MFAs and a few left cases in that MFAs, which naturally leads to a negative sign of the regression coefficient. The other MFA which shows a negative coefficient in a single date, is the MFA number 219 that corresponds to the small village of Weiz. As this is one date only, we do not think that this is a statistical evidence so we do not comment any further.

Fig. 13
figure 13

The impact of the MFA on the log number of cases. Blue dots mean that the corresponding MFAs is impacting (increasing) the number of cases; red dots means the contrary. transparent dots means that the corresponding regression coefficient is not significant. (Color figure online)

Table 4 shows that in terms of population size and therefore number of COVID-19 cases, the MFAs which are not statistically significant are in general smaller. This type of results in quite natural as it means that the areas to be contained or monitored are correctly identified in our approach. This implies also that the MFA with higher population can be isolated if (1) they have too many cases, in which case there is high probability of exporting cases; or if (2) there are no infections, in which case it is possible to avoid the import of new cases (Iacus et al. 2020).

Table 4 Elementary population size statistics (top) and number of COVID-19 cases (bottom) of the MFAs

To summarize, this analysis clearly shows the impact of MFAs on COVID-19 spread (along with other causes) and can inform policies in future waves. Indeed, a potential implication of this analysis is that, if new cases arises in a zone belonging to a specific MFA, that MFA should be isolated (or tested extensively) to efficiently contain further spread of the virus, rather than focusing on the whole or single district (when MFAs crosses districts) or instead of blocking the whole country, in the spirit of the zero-covid strategy.Footnote 12

Conclusions

The present work, in line with the literature on functional regions, puts in evidence how administrative borders typically differ from actual commuting patterns, which are shaped by human mobility. An innovative way to reflect human mobility patterns is given by the ‘Mobility Functional Areas’ (MFAs). The MFAs are data-driven geographic entities derived from fully anonymised and aggregated data originally provided to the European Commission by several European Mobile Network Operators (MNOs) in the framework of the non-medical interventions to contrast the spreading of COVID-19 pandemic in Europe.

Though slightly changing day by day, MFAs are essentially persistent in time and identify clear intra-weekly patterns. We have shown how this patterns have changed in Austria following mobility-restrictions imposed by the national authorities to limit the spreading of the virus. Analogous changes in mobility patterns have been measured, with different extents, in all the European countries that we have analysed. In fact, national lockdowns and similar mobility-restriction measures at the local level have not only reduced the volume of mobility but have also drastically changed the mobility patterns and this “shrinking effect” is clearly reflected in the shape and distribution of the MFAs.

The paper shows how, at a very local scale, MFAs can support a more efficient planning of the transport system, as well as its continuous monitoring. It also introduces several other potential fields of application for the MFAs (i.e. socio-economic, public health, urban planning, environmental risk and pollution, and demography and migration), which should be further investigated in future researches. Finally, it provides statistical evidences that during the initial phase of the pandemic, the MFAs played an important role in the spread of the virus. On this bases, an enforcement of mobility restrictions based on MFAs rather than administrative areas would have been more effective in limiting both the spreading of the virus and the socio-economic impact of the restrictions. That said, the authors acknowledge that from an operative point of view, the enforcement of mobility restrictions based on administrative borders (which are well recognised by citizens and reflect administrative and law-enforcement areas) would be much easier to implement than one based on MFAs, which very often cross these administrative borders. For example, in case of a local outbreak in a municipality, instead of applying restrictive measures in a whole region or country health authorities can only apply them in the municipalities that belong to the same MFA. For this reason, coordination efforts between local and regional authorities are required.

While this study focuses on Austria, the full study comprises the analysis of “Mobility functional areas” to “Potential applications of mobility functional areas” sections, for 15 European countries as shown in Table 5 and the definition of the MFAs and the corresponding shapefiles are available through the KCMD Dynamic Data HubFootnote 13 for further analysis by scholars.

The authors would also like to acknowledge the GSMA,Footnote 14 colleagues from DG CONNECTFootnote 15 for their support and colleagues from EurostatFootnote 16 and ECDCFootnote 17 for their input in drafting the data request.

Finally, the authors would also like to acknowledge the support from JRC colleagues, and in particular the E3 Unit, for setting up a secure environment, a dedicated Secure Platform for Epidemiological Analysis and Research (SPEAR) enabling the transfer, host and process of the data provided by the MNOs; as well as the E6 Unit (the Dynamic Data Hub team) for their valuable support in setting up the data lake.