Analysing Human Mobility in Thailand to Improve Epidemiological Models: Analysis Summary
Overview
This study uses human mobility networks and dengue case data (2003-2024) from Thailand to test whether mobility-based importation risk helps explain dengue transmission. The analysis covers three spatial scales:
- Province level: 77 provinces, 20,328 observations (monthly, 2003-2024)
- Bangkok District level: 50 districts, 8,600 observations (monthly, 2008-2022)
- Bangkok Subdistrict level: 161 subdistricts, 27,692 observations (monthly); 47,700 observations (weekly, 2020-2024)
We compared several mobility data sources: mobile phone call detail records (CDR), classical gravity models, Deep Gravity neural networks and gravity flow models (constructed from the transportation network), and the raw transport network itself.
Importation Risk Index
The importation risk index estimates the likelihood that infected travellers will arrive at a target location from other connected locations, based on two factors: the number of cases in connected locations, and the volume of travel from those locations. Following standard metapopulation models, we calculate the importation risk index for area i at time t as:

$$I_{i,t} = \sum_{j \neq i} W_{ji}\, C_{j,t}$$

where $I_{i,t}$ represents the importation risk index at location i at time t, $W_{ji}$ is the probability of travel from location j to location i (derived from the mobility matrix), and $C_{j,t}$ is the number of dengue cases in origin location j at time t.
A location receives high importation risk when it has strong travel connections to places currently experiencing high dengue burden.

Example importation risk calculation for Saraburi province, July 2023. (a) Conceptual diagram showing importation risk accumulation. (b) Geographic sources of importation risk.
Each source contributes its case burden weighted by the share of travel it sends to Saraburi. Chiang Rai (2,550 cases, 0.04% travel share) contributes only 1.0, while Tak (551 cases, 0.59% share) contributes 3.3, illustrating that network connectivity matters as much as outbreak intensity. Provinces are coloured by July 2023 case burden; arrow colour and thickness indicate the magnitude of contribution. In the following sections, we examine whether this importation risk measure predicts dengue incidence in subsequent months at the receiving location.
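The weighted sum behind this example can be sketched in a few lines of Python. The two source provinces and their figures are the ones quoted above; the function name `importation_risk` and the dictionary layout are illustrative assumptions, not the study's code.

```python
def importation_risk(travel_share, cases):
    """Sum of source-area cases, each weighted by its travel share to the target.

    travel_share: {source: probability of travel from source to target}
    cases:        {source: dengue cases at the source in the same period}
    """
    return sum(travel_share[src] * cases[src] for src in travel_share)


# Two of Saraburi's sources in July 2023 (values quoted in the text).
travel_share = {"Chiang Rai": 0.0004, "Tak": 0.0059}  # 0.04% and 0.59%
cases = {"Chiang Rai": 2550, "Tak": 551}

# Per-source contributions and their combined (partial) risk.
contributions = {src: travel_share[src] * cases[src] for src in cases}
partial_risk = importation_risk(travel_share, cases)
```

Chiang Rai contributes about 1.0 and Tak about 3.3, matching the figures above; the full index adds such terms over every connected province.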
Mobility Data Sources
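We tested several approaches to construct the mobility weight matrix. The weight matrix W is row-stochastic, obtained by dividing each flow by the total outflow of its origin:

$$W_{ij} = \frac{T_{ij}}{\sum_{k} T_{ik}}$$

This ensures that $\sum_{j} W_{ij} = 1$ for all areas i.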
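Row normalisation of a flow matrix can be sketched as below. The 3-area matrix is hypothetical, and leaving an area with no outgoing flow as a row of zeros is an assumption; the study may handle isolated areas differently.

```python
def row_normalise(flows):
    """Divide each row of a square flow matrix by its row sum.

    Rows with no outgoing flow are returned as all zeros (an assumption).
    """
    weights = []
    for row in flows:
        total = sum(row)
        weights.append([x / total if total > 0 else 0.0 for x in row])
    return weights


# Hypothetical flow matrix T: trips from the row area to the column area.
T = [
    [80.0, 15.0, 5.0],
    [10.0, 60.0, 30.0],
    [0.0, 20.0, 20.0],
]
W = row_normalise(T)
row_sums = [sum(row) for row in W]  # each should be 1 up to rounding
```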
The following flow sources (T in the formula) were used to construct the different mobility weight matrices:
Raw Transport
We directly aggregated observed transport flows from bus (8,036 routes), ferry (177 routes), train (386 routes), and flight (434 routes) data. Traffic volumes between provinces were computed by summing all district-to-district flows.
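The aggregation step might look like the sketch below. The district and province identifiers (`D1`, `P1`, and so on), the volumes, and the function name are hypothetical; only the idea of summing district-to-district flows into province pairs comes from the text.

```python
from collections import defaultdict


def aggregate_to_province(district_flows, district_to_province):
    """Sum district-to-district volumes into province-to-province volumes."""
    province_flows = defaultdict(float)
    for (origin, destination), volume in district_flows.items():
        pair = (district_to_province[origin], district_to_province[destination])
        province_flows[pair] += volume
    return dict(province_flows)


# Hypothetical districts D1-D3 in two hypothetical provinces P1 and P2.
district_to_province = {"D1": "P1", "D2": "P1", "D3": "P2"}
district_flows = {
    ("D1", "D3"): 120.0,  # e.g. a bus route
    ("D2", "D3"): 30.0,   # e.g. a train route
    ("D3", "D1"): 90.0,
    ("D1", "D2"): 400.0,  # stays inside P1
}
province_flows = aggregate_to_province(district_flows, district_to_province)
```

In this sketch, flows between districts of the same province end up on the diagonal of the province matrix.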
Call Detail Records (CDR)
Mobile phone call detail records from Kiang et al. (2021) capture actual human movement patterns from roughly 11 million subscribers across Thailand (August-October 2017). The CDR-derived mobility matrix provides a 76 by 76 province-level row-stochastic matrix with strong home bias (diagonal mean of 0.866).
Classical Gravity Model
This model predicts flows based solely on population and distance:

$$T_{ij} = \frac{P_i^{\alpha}\, P_j^{\beta}}{d_{ij}^{\gamma}}$$

Here, $P_i$ and $P_j$ are the origin and destination populations, $d_{ij}$ is the geographic distance, and we use standard parameters ($\alpha = \beta = 1$, $\gamma = 2$ for the inverse square law). This provides a baseline that requires no transport data.
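A minimal sketch of the classical gravity calculation follows, with flows proportional to the product of the two populations divided by squared distance. The populations and distances are hypothetical, the parameter names `alpha`, `beta`, and `gamma` are illustrative, and setting the diagonal to zero is an assumption made because the formula is undefined at zero distance.

```python
def gravity_flows(pop, dist, alpha=1.0, beta=1.0, gamma=2.0):
    """Gravity flows: pop_i^alpha * pop_j^beta / dist_ij^gamma for i != j.

    The diagonal is set to zero (an assumption), since distance is zero there.
    """
    n = len(pop)
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                flows[i][j] = pop[i] ** alpha * pop[j] ** beta / dist[i][j] ** gamma
    return flows


# Hypothetical populations and pairwise distances (km) for three areas.
pop = [1_000_000, 500_000, 200_000]
dist = [
    [0.0, 100.0, 200.0],
    [100.0, 0.0, 150.0],
    [200.0, 150.0, 0.0],
]
T = gravity_flows(pop, dist)
```

With the inverse square law, doubling the distance between two areas cuts the predicted flow to a quarter.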
Gravity Flow Model (Mobility-Weighted)
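This approach incorporates observed transport connectivity as weights in the gravity formulation:

$$G_{ij} = w_{ij}\, \frac{P_i^{\alpha}\, P_j^{\beta}}{d_{ij}^{\gamma}}$$

where $w_{ij}$ is the transport weight derived from observed flows, and the populations $P_i$, $P_j$ and distance $d_{ij}$ are as in the classical gravity model. Destination choice probabilities are row-normalised, then home bias is incorporated:

$$W_{ij} = (1 - \lambda)\,\delta_{ij} + \lambda\,(1 - \delta_{ij})\, \frac{G_{ij}}{\sum_{k \neq i} G_{ik}}$$

where $\lambda$ is the mean fraction of time residents spend away from home, and $\delta_{ij}$ is the Kronecker delta.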
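One way to combine transport weighting, row normalisation, and home bias is sketched below. This is an illustration under stated assumptions, not the study's implementation: the function name, the transport weights, and the populations and distances are hypothetical; the population exponents are fixed at 1; and the away-from-home fraction of 0.134 is chosen only to mirror the CDR diagonal mean of 0.866.

```python
def gravity_flow_weights(pop, dist, transport, away_fraction, gamma=2.0):
    """Mobility matrix from transport-weighted gravity flows plus home bias.

    Off-diagonal entries share `away_fraction` in proportion to
    transport_ij * pop_i * pop_j / dist_ij^gamma; the diagonal gets the rest.
    """
    n = len(pop)
    weights = []
    for i in range(n):
        raw = [
            transport[i][j] * pop[i] * pop[j] / dist[i][j] ** gamma if j != i else 0.0
            for j in range(n)
        ]
        total = sum(raw)
        row = [away_fraction * x / total if total > 0 else 0.0 for x in raw]
        # Stay-at-home mass; an origin with no connections keeps all of it.
        row[i] = 1.0 - sum(row)
        weights.append(row)
    return weights


# Hypothetical inputs for three areas.
pop = [1_000_000, 500_000, 200_000]
dist = [[0.0, 100.0, 200.0], [100.0, 0.0, 150.0], [200.0, 150.0, 0.0]]
transport = [[0.0, 3.0, 1.0], [3.0, 0.0, 2.0], [1.0, 2.0, 0.0]]
W = gravity_flow_weights(pop, dist, transport, away_fraction=0.134)
```

Every row of `W` sums to one, with `1 - away_fraction` on the diagonal and the remainder split across destinations.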
Deep Gravity Model
This neural network-based approach follows Simini et al. (2021). The architecture comprises 15 hidden layers (6 layers of 256 units plus 9 layers of 128 units) with LeakyReLU activation and per-origin softmax for probability output. Input features (39 per origin-destination pair) include population totals from the Department of Provincial Administration (DOPA), socioeconomic indicators such as Gross Provincial Product (GPP) per capita, poverty rate, and Gini coefficient, network centrality measures (PageRank and Eigenvector centrality), and geographic information including coordinates, regional classification, and inter-province distances.
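The layer layout and per-origin softmax described above can be illustrated with a small forward pass in plain Python. This is a sketch only: the weights are random and untrained, the LeakyReLU slope of 0.01 and the weight initialisation are assumptions, the feature vectors are random placeholders rather than real covariates, and the function names are hypothetical.

```python
import math
import random

HIDDEN_SIZES = [256] * 6 + [128] * 9  # 15 hidden layers, as described above
N_FEATURES = 39                       # features per origin-destination pair


def init_layers(sizes, seed=0):
    """Random (untrained) layers; each unit is a (weights, bias) pair."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        layers.append(
            [([rng.gauss(0.0, scale) for _ in range(n_in)], 0.0) for _ in range(n_out)]
        )
    return layers


def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x


def score(features, layers):
    """Map one feature vector to a scalar score; the final layer is linear."""
    h = features
    for k, layer in enumerate(layers):
        h = [sum(w * x for w, x in zip(ws, h)) + b for ws, b in layer]
        if k < len(layers) - 1:
            h = [leaky_relu(v) for v in h]
    return h[0]


def destination_probabilities(pair_features, layers):
    """Softmax over the scores of all candidate destinations of one origin."""
    scores = [score(f, layers) for f in pair_features]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


layers = init_layers([N_FEATURES] + HIDDEN_SIZES + [1])
rng = random.Random(1)
# Placeholder features for one origin with four candidate destinations.
pair_features = [[rng.random() for _ in range(N_FEATURES)] for _ in range(4)]
probs = destination_probabilities(pair_features, layers)
```

Because the softmax is taken per origin, the outputs for each origin already form a probability distribution over destinations.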
Model Specification
We fitted negative binomial generalised linear models with fixed effects.
Province Model (Lag-1 only)
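$$\log \mathbb{E}[C_{i,t}] = \rho \log C_{i,t-1} + \beta_1 \log I_{i,t-1} + \mathrm{FE}$$

Here, $C_{i,t}$ is the dengue case count in unit i at time t, $\rho$ captures temporal autocorrelation, $I_{i,t-1}$ is the importation risk at lag-1, and FE includes unit and time fixed effects.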
Bangkok Monthly Model (Lags 1, 2, 3)
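For the Bangkok district and subdistrict analyses, importation risk enters at lags of 1, 2, and 3 months:

$$\log \mathbb{E}[C_{i,t}] = \rho \log C_{i,t-1} + \sum_{k=1}^{3} \beta_k \log I_{i,t-k} + \mathrm{FE}$$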
Bangkok Weekly Model (Lags 2, 3, 4 weeks)
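Weekly resolution uses biologically appropriate lags that match dengue generation time (2-4 weeks):

$$\log \mathbb{E}[C_{i,t}] = \rho \log C_{i,t-1} + \sum_{k=2}^{4} \beta_k \log I_{i,t-k} + \mathrm{FE}$$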
These coefficients are elasticities. A coefficient of 0.45 means that a 10% increase in importation risk is associated with an increase of roughly 4.5% in expected dengue cases.
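The elasticity reading can be checked with a short calculation. In a log-log specification, multiplying importation risk by a factor multiplies expected cases by that factor raised to the coefficient; the function name below is illustrative.

```python
def expected_change(elasticity, pct_increase):
    """Percentage change in expected cases for a given percentage rise in risk."""
    return ((1.0 + pct_increase / 100.0) ** elasticity - 1.0) * 100.0


exact = expected_change(0.45, 10.0)      # exact effect of a 10% rise
linear_approx = 0.45 * 10.0              # the "coefficient times percent" rule of thumb
doubling = expected_change(0.45, 100.0)  # effect of doubling importation risk
```

The exact figure for a 10% rise is about 4.4%, so the rule of thumb is close for small changes. For large changes it overstates the effect: doubling importation risk corresponds to roughly a 37% increase in expected cases, not 45%.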
Interpreting Autocorrelation Reduction
High dengue cases this month, following high cases last month, could reflect either local persistence (ongoing transmission from resident mosquitoes and infected people) or continuous importation (infected travellers arriving from connected outbreak areas). A baseline model without importation risk conflates both effects in the autocorrelation coefficient.
When adding importation risk reduces autocorrelation, it suggests that some apparent “persistence” may have been network-driven importation. Three findings support this interpretation: (1) COVID-19 travel restrictions were associated with a 12% reduction in inter-province importation risk effects, consistent with mobility influencing spread; (2) continued intra-city travel during COVID was associated with 23-71% higher district and subdistrict effects, tracking actual mobility changes; (3) CDR outperforms gravity models at province level while classical gravity works better at subdistrict level.
A 49% autocorrelation reduction means that nearly half of the apparent “local persistence” may have been continuous re-seeding from connected outbreak areas.
Results
Thailand Province Level
Data: 77 provinces, January 2003 to December 2024, 20,328 observations
Province-Level Panel Regression: Mobility Source Comparison
| Mobility Source | AIC | β (IR Lag-1) | Autocorr | Autocorr Change |
|---|---|---|---|---|
| Baseline (no IR) | 183,756 | – | 0.799 | – |
| CDR (Mobile Phone) | 181,512 | 0.452*** | 0.410 | -49% |
| Classical Gravity | 183,392 | 0.226*** | 0.662 | -17% |
| Gravity Flow (Combined) | 183,538 | 0.201*** | 0.670 | -16% |
| Deep Gravity | 183,515 | 0.186*** | 0.729 | -9% |
| Raw Transport | 183,644 | 0.125*** | 0.765 | -4% |
| *** p<0.001 | | | | |
CDR mobility provides the strongest epidemiological signal at the province level, with a 2,244-point AIC improvement over baseline. The classical gravity model serves as the best proxy when CDR data are unavailable (AIC improvement of 364), outperforming both neural network-based deep gravity (AIC improvement of 241) and transport-weighted gravity flow (AIC improvement of 218). The raw transport network gives the weakest signal here, although it is the strongest source at the district level. When CDR-based importation risk is added, the autocorrelation coefficient drops from 0.799 to 0.410 (a 49% reduction), indicating that CDR captures variance previously absorbed by temporal persistence.
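The AIC improvements and autocorrelation changes can be recomputed directly from the table values. The snippet below is a simple arithmetic check; the dictionary layout and function name are illustrative.

```python
baseline = {"aic": 183_756, "autocorr": 0.799}
models = {
    "CDR (Mobile Phone)": {"aic": 181_512, "autocorr": 0.410},
    "Classical Gravity": {"aic": 183_392, "autocorr": 0.662},
    "Gravity Flow (Combined)": {"aic": 183_538, "autocorr": 0.670},
    "Deep Gravity": {"aic": 183_515, "autocorr": 0.729},
    "Raw Transport": {"aic": 183_644, "autocorr": 0.765},
}


def compare_to_baseline(model, baseline):
    """Return (AIC improvement, rounded % change in the autocorrelation coefficient)."""
    aic_gain = baseline["aic"] - model["aic"]
    autocorr_change = (model["autocorr"] / baseline["autocorr"] - 1.0) * 100.0
    return aic_gain, round(autocorr_change)


summary = {name: compare_to_baseline(m, baseline) for name, m in models.items()}
```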

Thailand-Province level regression coefficients comparing five mobility data sources. CDR provides the strongest signal (0.452), followed by classical gravity (0.226), gravity flow (0.201), deep gravity (0.186), and raw transport (0.125). Error bars show 95% confidence intervals. All coefficients are significant at p<0.001.
Bangkok District Level (Monthly)
Data: 50 districts, 2008-2022, 8,600 observations
Bangkok-District: Mobility Source Comparison (3-Lag Model)
| Mobility Source | AIC | β (Lag-1) | β (Lag-2) | β (Lag-3) | Autocorr |
|---|---|---|---|---|---|
| Raw Transit | 59,813 | 0.611*** | 0.118** | -0.038 | 0.260 |
| Gravity Flow | 59,823 | 0.470*** | 0.060 | -0.034 | 0.449 |
| Classical Gravity | 59,829 | 0.483*** | 0.045 | -0.025 | 0.456 |
| *** p<0.001, ** p<0.01; N = 8,450 | | | | | |
Raw transit schedules capture commuter movements well at the district level. The primary effect occurs at lag-1 (coefficient of 0.611), with a smaller secondary effect at lag-2 (0.118). The lag-3 coefficient is not statistically significant, indicating effect saturation in the monthly data.
Bangkok Subdistrict Level (Monthly)
Data: 161 subdistricts, 2008-2022, 27,692 observations
Bangkok-Subdistrict: Mobility Source Comparison (3-Lag Model)
| Mobility Source | AIC | β (Lag-1) | β (Lag-2) | β (Lag-3) | Autocorr |
|---|---|---|---|---|---|
| Classical Gravity | 116,896 | 0.730*** | 0.055*** | -0.069* | 0.399 |
| Gravity Flow | 117,045 | 0.640*** | 0.063** | -0.060* | 0.436 |
| Raw Transport | 117,194 | 0.642*** | 0.125*** | -0.060* | 0.369 |
| *** p<0.001, ** p<0.01, * p<0.05; N = 26,871 | | | | | |
At the subdistrict level, the classical gravity model, which uses only population and distance, gives a better fit (lowest AIC) than the transit-based sources. The primary effect is at lag-1 (0.730 for classical gravity), with a secondary effect at lag-2 (0.055). The small negative lag-3 coefficient (-0.069, p<0.05) reflects effect saturation in the monthly data.
COVID-19 as a Natural Experiment
The COVID-19 pandemic provided a natural experiment to test the importance of the mobility networks in explaining transmission. During this period, mobility restrictions reduced travel, allowing us to test whether network effects changed accordingly.
COVID Period Comparison: Thailand-Province vs Bangkok-District/Subdistrict (Single-Lag Models)
| Scale | Period | Autocorr | IR Lag-1 | Change vs Pre-COVID |
|---|---|---|---|---|
| Thailand-Province | Pre-COVID | 0.384 | +0.452 | – |
| | COVID | 0.554 | +0.398 | -12% |
| | Post-COVID | 0.494 | +0.413 | -9% |
| Bangkok-District | Pre-COVID | 0.236 | +0.568 | – |
| | COVID | 0.144 | +0.970 | +71% |
| | Post-COVID | 0.200 | +0.700 | +23% |
| Bangkok-Subdistrict | Pre-COVID | 0.401 | +0.723 | – |
| | COVID | 0.298 | +0.891 | +23% |
| | Post-COVID | 0.352 | +0.780 | +8% |

Temporal period analysis showing COVID-19 effects on regression coefficients. At the province level, the importation risk effect decreased during COVID while autocorrelation increased, consistent with reduced inter-province travel.
COVID travel restrictions were associated with a 12% reduction in inter-province network effects and a 23-71% increase in intra-city effects. This pattern is consistent with mobility-associated transmission operating through distinct mechanisms at different spatial scales.
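The percentage changes quoted here are the COVID and post-COVID importation risk coefficients expressed relative to the pre-COVID coefficient at the same scale. A short check using the table values (the dictionary layout and function name are illustrative):

```python
ir_lag1 = {
    "Thailand-Province": {"Pre-COVID": 0.452, "COVID": 0.398, "Post-COVID": 0.413},
    "Bangkok-District": {"Pre-COVID": 0.568, "COVID": 0.970, "Post-COVID": 0.700},
    "Bangkok-Subdistrict": {"Pre-COVID": 0.723, "COVID": 0.891, "Post-COVID": 0.780},
}


def change_vs_pre_covid(coefficients):
    """Rounded % change of each later period's coefficient relative to Pre-COVID."""
    pre = coefficients["Pre-COVID"]
    return {
        period: round((value / pre - 1.0) * 100.0)
        for period, value in coefficients.items()
        if period != "Pre-COVID"
    }


changes = {scale: change_vs_pre_covid(c) for scale, c in ir_lag1.items()}
```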
Weekly Validation (Bangkok, 2020-2024)
Although higher-resolution weekly dengue data were available only for a shorter period (2020-2024), we used them to validate the importation risk findings at lags matching dengue transmission patterns (2-4 weeks from importation).
Weekly Combined Multi-Lag Model (Bangkok District, 2020-2024)
| Variable | Coefficient | SE | p-value | Meaning |
|---|---|---|---|---|
| Autocorrelation | 0.335*** | 0.012 | <0.001 | Local persistence |
| Importation Risk (2-week) | 0.553*** | 0.031 | <0.001 | Primary transmission |
| Importation Risk (3-week) | 0.297*** | 0.029 | <0.001 | Secondary cases |
| Importation Risk (4-week) | 0.242*** | 0.028 | <0.001 | Tertiary cases |
| N = 13,050; AIC = 46,747; Autocorrelation change = -62% | | | | |
Weekly Combined Multi-Lag Model (Bangkok Subdistrict, 2020-2024)
| Variable | Coefficient | SE | p-value | Meaning |
|---|---|---|---|---|
| Autocorrelation | 0.461*** | 0.008 | <0.001 | Local persistence |
| Importation Risk (2-week) | 0.731*** | 0.029 | <0.001 | Primary transmission |
| Importation Risk (3-week) | 0.377*** | 0.027 | <0.001 | Secondary cases |
| Importation Risk (4-week) | 0.242*** | 0.025 | <0.001 | Tertiary cases |
| N = 41,022; AIC = 75,760; Autocorrelation change = -51% | | | | |
Weekly resolution with 2-4 week lags shows all lags positive and significant. The small negative lag-3 effect in the monthly models reflects a three-month span that exceeds the typical dengue generation time, whilst the weekly lag-4 (one month) remains within biological relevance. The subdistrict model shows even stronger effects than the district level (0.731 versus 0.553 at the two-week lag).
Spike Propagation
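We defined spikes as months where cases exceeded 20% above the seasonal baseline:

$$S_{i,t} = \mathbb{1}\left[\, C_{i,t} > 1.2\, \bar{C}_{i,m(t)} \,\right]$$

where $C_{i,t}$ is the case count in area i at time t and $\bar{C}_{i,m(t)}$ is the seasonal baseline for the corresponding calendar month.
Spike pressure measures network exposure to connected regions with dengue spikes:

$$\mathrm{SP}_{i,t} = \sum_{j \neq i} W_{ji}\, S_{j,t}$$

where $W_{ji}$ is the mobility weight from area j to area i.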
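A minimal sketch of spike detection and spike pressure follows. Several choices here are assumptions rather than the study's stated method: the seasonal baseline is taken as the mean count for the same calendar month across years, spike pressure is the mobility-weighted sum of concurrent spike indicators in other areas, and the case series, weight matrix, and function names are hypothetical.

```python
def seasonal_baseline(series, period=12):
    """Mean count for each calendar position (assumes len(series) >= period)."""
    return [sum(series[m::period]) / len(series[m::period]) for m in range(period)]


def spike_flags(series, threshold=1.2, period=12):
    """1 where the count exceeds threshold x the seasonal baseline, else 0."""
    base = seasonal_baseline(series, period)
    return [int(c > threshold * base[t % period]) for t, c in enumerate(series)]


def spike_pressure(W, flags, i, t):
    """Mobility-weighted sum of spike indicators in areas j != i at time t."""
    return sum(W[j][i] * flags[j][t] for j in range(len(W)) if j != i)


# Hypothetical two-year monthly series for three areas.
cases = [[10] * 24, [5] * 24, [8] * 24]
cases[0][18] = 30  # outbreak month in area 0
cases[1][18] = 20  # outbreak month in area 1

# Hypothetical row-stochastic mobility matrix (row = origin, column = destination).
W = [
    [0.90, 0.06, 0.04],
    [0.10, 0.85, 0.05],
    [0.20, 0.10, 0.70],
]

flags = [spike_flags(series) for series in cases]
pressure_on_area_2 = spike_pressure(W, flags, i=2, t=18)
```

In this toy example, area 2 has no spike of its own but is exposed to the two spiking areas through the weights 0.04 and 0.05.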
Spike Probability by Network Exposure Across Scales
| Scale | Low Exposure | High Exposure | RR | p-value |
|---|---|---|---|---|
| Thailand-Province | 22.6% | 39.6% | 1.75 | <0.0001 |
| Bangkok-District | 23.9% | 51.5% | 2.16 | <0.0001 |
| Bangkok-Subdistrict | 27.6% | 46.9% | 1.70 | <0.0001 |
Areas with high network exposure are roughly 1.7 to 2.2 times as likely to experience outbreak spikes as areas with low exposure (relative risk of 1.70 to 2.16 across scales).
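Relative risk is the spike probability under high exposure divided by the spike probability under low exposure. The check below recomputes it from the rounded percentages in the table; the variable and function names are illustrative.

```python
spike_probability = {  # (low exposure, high exposure), from the table
    "Thailand-Province": (0.226, 0.396),
    "Bangkok-District": (0.239, 0.515),
    "Bangkok-Subdistrict": (0.276, 0.469),
}
reported_rr = {
    "Thailand-Province": 1.75,
    "Bangkok-District": 2.16,
    "Bangkok-Subdistrict": 1.70,
}


def relative_risk(p_low, p_high):
    """Ratio of spike probability under high versus low network exposure."""
    return p_high / p_low


recomputed = {
    scale: relative_risk(low, high) for scale, (low, high) in spike_probability.items()
}
max_gap = max(abs(recomputed[s] - reported_rr[s]) for s in reported_rr)
```

The recomputed values agree with the reported ones to within 0.01; the small gap at the district level is what one would expect from working with rounded percentages.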
Ideal Data Source per Scale
Optimal Mobility Data Source by Spatial Scale
| Scale | Units | Best Mobility Source | Reason |
|---|---|---|---|
| Thailand-Province | 77 | CDR (mobile phone) | Captures actual long-distance travel patterns |
| Bangkok-District | 50 | Raw Transit schedules | Commuter routes (bus/metro) dominate |
| Bangkok-Subdistrict | 161 | Classical Gravity | Local movement follows population and distance |
Summary

Summary of key findings across spatial scales: Thailand-Province (77 units), Bangkok-District (50 units), and Bangkok-Subdistrict (161 units). (A) Network effect strength increases at finer scales, with different optimal mobility sources. (B) All scales show substantial autocorrelation reduction when importation risk is included. (C) Spike probability comparison between low and high network exposure, with relative risk shown. (D) COVID-19 effects on importation risk coefficients varied by scale.
Summary Comparison Across All Spatial Scales
| Metric | Thailand-Province | Bangkok-District | Bangkok-Subdistrict |
|---|---|---|---|
| Units | 77 | 50 | 161 |
| Best mobility | CDR | Raw Transit | Gravity |
| Coefficient (Lag-1) | 0.452 | 0.684 | 0.723 |
| Autocorr change | -49% | -67% | -46% |
| COVID effect | -12% | +71% | +23% |
| Spike RR | 1.75 | 2.16 | 1.70 |
Importance of Mobility Networks
These findings translate into practical tools for dengue surveillance and response.
Provinces with high exposure to upstream outbreaks are 75% more likely to experience their own spike within a month. The 2-4 week lag structure provides an early warning window: when one province spikes, health officials can alert connected neighbours before local cases appear.
Transport connectivity matters more than geographic distance. A province with heavy bus traffic to an outbreak area faces an elevated risk even if geographically distant. Weighting cases by mobility connections captures this better than distance-based models.
COVID provided a natural experiment: inter-province travel restrictions reduced network transmission effects by 12%, while continued intra-city travel increased local spread. This suggests inter-city travel advisories may be more effective than within-city measures for controlling provincial outbreaks.
Methods Note
All analyses used negative binomial panel regression with location and time fixed effects. Coefficients represent elasticities (log-log specification). Statistical significance levels: *** p<0.001, ** p<0.01, * p<0.05.
Abbreviations: AIC = Akaike Information Criterion (lower is better); SE = Standard Error; RR = Relative Risk; IR = Importation Risk.
CDR Data Source: Kiang et al. (2021), roughly 11 million mobile phone subscribers, August-October 2017.
Analysis period: 2003-2024.