Structural Time Series Models
with Kalman Filter Data Assimilation
Structural Time Series (STS) models provide a modular, interpretable, and probabilistic framework for time series analysis. Their formulation as state-space models enables efficient data assimilation based on the Kalman filter (KF), allowing simultaneous estimation, forecasting, and structural decomposition with quantified uncertainty.
We applied STS with KF assimilation to improve dengue forecasting capacity relative to the ARIMA family of models, which remain among the most commonly used approaches. STS models are more flexible and can approximate the nonlinear behaviour typical of epidemiological and climatic time series better than many classical linear frameworks. Both linear and nonlinear variants can be used within this project, depending on the application.
1. Definition
Structural Time Series models decompose an observed time series (e.g., dengue incidence or local rainfall) into interpretable latent components, typically a level or trend, a seasonal component, a cyclical component, and an irregular (noise) term.
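In a commonly used additive form, the decomposition can be written as below. The symbols are assumptions, since the text does not fix a notation: y_t is the observation at time t, mu_t the level or trend, gamma_t the seasonal component, psi_t the cyclical component, and epsilon_t the irregular term.

```latex
\[
y_t = \mu_t + \gamma_t + \psi_t + \varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, \sigma_\varepsilon^2)
\]
```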
Each component is modelled explicitly as a stochastic process, allowing for time-varying dynamics and structural change. The estimated disturbance variances already indicate whether a component should be held fixed or allowed to change: a variance estimated close to zero implies an effectively deterministic component.
2. State-Space Formulation
STS models are naturally expressed in linear Gaussian state-space form, which enables Kalman filtering and smoothing. This form consists of an observation equation and a state transition equation:
2.1 Observation Equation
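A standard way of writing the observation equation is sketched below. The notation is an assumption (it follows common textbook usage): y_t is the observation vector, alpha_t the latent state vector, Z_t the observation matrix that maps states to observations, and epsilon_t the observation noise with covariance H_t.

```latex
\[
y_t = Z_t\, \alpha_t + \varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, H_t)
\]
```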
2.2 State Transition Equation
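A standard way of writing the state transition equation is sketched below. The notation is an assumption: T_t is the transition matrix, R_t a selection matrix that routes disturbances to the relevant states, eta_t the state disturbance vector with covariance Q_t, and the initial state is taken as Gaussian with mean a_0 and covariance P_0.

```latex
\[
\alpha_t = T_t\, \alpha_{t-1} + R_t\, \eta_t,
\qquad \eta_t \sim \mathcal{N}(0, Q_t),
\qquad \alpha_0 \sim \mathcal{N}(a_0, P_0)
\]
```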
This formulation supports time-varying parameters, missing data, and irregular sampling.
3. Canonical Structural Components
3.1 Local Level and Trend
Local level model:
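In its usual textbook form (notation assumed here: mu_t is the level, epsilon_t the observation noise, and eta_t the scalar level disturbance), the model is a random walk observed with noise:

```latex
\begin{align*}
y_t   &= \mu_t + \varepsilon_t,  & \varepsilon_t &\sim \mathcal{N}(0, \sigma_\varepsilon^2) \\
\mu_t &= \mu_{t-1} + \eta_t,     & \eta_t        &\sim \mathcal{N}(0, \sigma_\eta^2)
\end{align*}
```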
Local linear trend:
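In its usual textbook form (notation assumed here: mu_t is the level, beta_t the slope, and eta_t and zeta_t the level and slope disturbances), the level receives a slope that itself evolves as a random walk:

```latex
\begin{align*}
y_t     &= \mu_t + \varepsilon_t,               & \varepsilon_t &\sim \mathcal{N}(0, \sigma_\varepsilon^2) \\
\mu_t   &= \mu_{t-1} + \beta_{t-1} + \eta_t,    & \eta_t        &\sim \mathcal{N}(0, \sigma_\eta^2) \\
\beta_t &= \beta_{t-1} + \zeta_t,               & \zeta_t       &\sim \mathcal{N}(0, \sigma_\zeta^2)
\end{align*}
```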
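This corresponds to a random walk with a stochastic slope. When the level disturbance variance is zero, it reduces to an integrated random walk, which gives a smoother trend.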
3.2 Seasonal Component
Seasonality is typically modelled as a constrained stochastic cycle:
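One common form, assumed here because the text does not state which variant is used, is the dummy-variable seasonal, in which the seasonal effects over s consecutive periods sum to a zero-mean disturbance omega_t:

```latex
\[
\gamma_t = -\sum_{j=1}^{s-1} \gamma_{t-j} + \omega_t,
\qquad \omega_t \sim \mathcal{N}(0, \sigma_\omega^2)
\]
```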
It is implemented using a companion-form state vector of dimension s-1, where s is the number of seasons per period. This approach proved very useful for the seasonality typical of both climatic and epidemiological datasets.
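The companion form can be sketched in a few lines of plain Python. This is a minimal illustration assuming the dummy-variable seasonal (each new seasonal effect is minus the sum of the previous s-1 effects, plus noise); the function names and the starting values are hypothetical, and the noise is set to zero so that the sum-to-zero constraint is visible.

```python
def seasonal_transition(s):
    """Transition matrix (list of rows) for a dummy-variable seasonal with s seasons.

    State vector: (gamma_t, gamma_{t-1}, ..., gamma_{t-s+2}), dimension s - 1.
    """
    if s < 2:
        raise ValueError("need at least two seasons")
    m = s - 1
    T = [[0.0] * m for _ in range(m)]
    T[0] = [-1.0] * m          # gamma_t = -(sum of the previous s - 1 effects)
    for i in range(1, m):
        T[i][i - 1] = 1.0      # shift older effects down by one slot
    return T


def step(T, state):
    """Apply one noise-free transition: state_t = T @ state_{t-1}."""
    return [sum(a * b for a, b in zip(row, state)) for row in T]


s = 4                          # e.g. quarterly data (illustrative choice)
T = seasonal_transition(s)
state = [1.0, -2.0, 0.5]       # arbitrary starting seasonal effects
path = []
for _ in range(2 * s):
    state = step(T, state)
    path.append(state[0])      # current seasonal effect gamma_t
```

Without noise the pattern repeats every s steps and any s consecutive effects sum to zero; adding the disturbance lets the seasonal pattern drift slowly over time.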
3.3 Cyclical Component
The cyclical component is often represented as a damped stochastic cycle:
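The usual textbook form is sketched below. The notation is an assumption: psi_t is the cycle, psi*_t an auxiliary state needed for the recursion, rho a damping factor with 0 < rho <= 1, lambda the cycle frequency in radians per time step (so the period is 2*pi/lambda), and kappa_t, kappa*_t mutually uncorrelated Gaussian disturbances with a common variance.

```latex
\[
\begin{pmatrix} \psi_t \\ \psi_t^{*} \end{pmatrix}
= \rho
\begin{pmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{pmatrix}
\begin{pmatrix} \psi_{t-1} \\ \psi_{t-1}^{*} \end{pmatrix}
+
\begin{pmatrix} \kappa_t \\ \kappa_t^{*} \end{pmatrix},
\qquad
\kappa_t,\ \kappa_t^{*} \sim \mathcal{N}(0, \sigma_\kappa^2)
\]
```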
In our framework for Thailand, we use these cycles to account for periodicities beyond the seasonal one that are seen in dengue epidemics and in their potential climatic drivers, i.e. interannual cycles such as the Tropospheric Biennial Oscillation (TBO) and El Niño-Southern Oscillation (ENSO) events, the latter characterised by periods of between 3 and 7 years.
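To make the link between a cycle period and the model parameters concrete, the sketch below converts a period into the frequency of a damped stochastic cycle and builds the corresponding 2x2 transition matrix. Monthly sampling, the damping value 0.98, and the function names are assumptions for illustration only, not settings taken from the project.

```python
import math


def cycle_transition(period_steps, rho):
    """Return (lam, T): frequency in radians per step and the damped-rotation matrix."""
    if period_steps <= 2:
        raise ValueError("period must exceed two time steps")
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must lie in (0, 1]")
    lam = 2.0 * math.pi / period_steps
    c, s = math.cos(lam), math.sin(lam)
    return lam, [[rho * c, rho * s], [-rho * s, rho * c]]


def advance(T, state, n_steps):
    """Propagate (psi, psi_star) n_steps ahead with the disturbances set to zero."""
    psi, psi_star = state
    for _ in range(n_steps):
        psi, psi_star = (T[0][0] * psi + T[0][1] * psi_star,
                         T[1][0] * psi + T[1][1] * psi_star)
    return psi, psi_star


STEPS_PER_YEAR = 12            # assumption: monthly data
for years in (3, 5, 7):        # period range quoted in the text for ENSO
    lam, T = cycle_transition(years * STEPS_PER_YEAR, rho=0.98)  # rho is illustrative
    print(f"{years}-year cycle: lambda = {lam:.4f} rad/month")
```

With rho equal to 1 the cycle is undamped and returns to its starting point after one full period; values below 1 shrink the amplitude by a factor rho at every step, so the disturbances are what keep the cycle alive.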
4. Kalman Filter as Data Assimilation Engine
The Kalman filter provides optimal (i.e. minimum mean-square error) sequential data assimilation under linear-Gaussian assumptions. The process operates as follows:
4.1 Prediction step:
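In standard notation (assumed here), a_{t|t-1} and P_{t|t-1} denote the mean and covariance of the state at time t given observations up to t-1, T_t is the transition matrix, R_t the selection matrix, and Q_t the state disturbance covariance. The state estimate is propagated forward and its uncertainty inflated by the process noise:

```latex
\begin{align*}
a_{t|t-1} &= T_t\, a_{t-1|t-1} \\
P_{t|t-1} &= T_t\, P_{t-1|t-1}\, T_t^{\top} + R_t\, Q_t\, R_t^{\top}
\end{align*}
```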
4.2 Update step:
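In standard notation (assumed here), Z_t is the observation matrix, H_t the observation noise covariance, v_t the innovation (one-step-ahead prediction error), F_t its covariance, and K_t the Kalman gain. The prediction is corrected in proportion to how much the new observation can be trusted:

```latex
\begin{align*}
v_t     &= y_t - Z_t\, a_{t|t-1} \\
F_t     &= Z_t\, P_{t|t-1}\, Z_t^{\top} + H_t \\
K_t     &= P_{t|t-1}\, Z_t^{\top} F_t^{-1} \\
a_{t|t} &= a_{t|t-1} + K_t\, v_t \\
P_{t|t} &= (I - K_t Z_t)\, P_{t|t-1}
\end{align*}
```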
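This recursive mechanism assimilates observations as they arrive and handles missing data naturally: when an observation is missing, the update step is skipped and the prediction is carried forward with its larger uncertainty.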
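The recursion can be sketched for the simplest case, a scalar local level model (a random-walk level observed with noise). This is a minimal illustration rather than the project's implementation: the function name, the made-up data, the variance values, and the large initial variance used to approximate a diffuse prior are all assumptions.

```python
def local_level_filter(y, sigma2_eps, sigma2_eta, a0=0.0, p0=1e6):
    """Scalar Kalman filter for a local level model.

    y           : observations; None marks a missing value
    sigma2_eps  : observation noise variance
    sigma2_eta  : level disturbance variance
    a0, p0      : prior mean and variance of the level before the first step
    Returns the filtered means and variances, one pair per time step.
    """
    a, p = a0, p0
    means, variances = [], []
    for obs in y:
        # Prediction step: the level follows a random walk.
        a_pred = a
        p_pred = p + sigma2_eta
        if obs is None:
            # Missing observation: skip the update, keep the prediction.
            a, p = a_pred, p_pred
        else:
            # Update step: correct the prediction with the innovation.
            v = obs - a_pred             # innovation
            f = p_pred + sigma2_eps      # innovation variance
            k = p_pred / f               # Kalman gain
            a = a_pred + k * v
            p = (1.0 - k) * p_pred
        means.append(a)
        variances.append(p)
    return means, variances


y = [10.0, 11.0, None, 12.5, 12.0]       # made-up series with one gap
means, variances = local_level_filter(y, sigma2_eps=1.0, sigma2_eta=0.5)
```

At the gap the filtered mean is unchanged and the variance grows by the level disturbance variance; the next available observation pulls the variance back down.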
5. Smoothing and Structural Inference
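After filtering, Kalman smoothing (e.g. the Rauch-Tung-Striebel smoother) yields the mean and covariance of each state conditional on the full sample rather than only on past observations, allowing: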
- retrospective component estimation
- signal-to-noise separation
- structural break detection
- decomposition uncertainty quantification
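The backward pass of the Rauch-Tung-Striebel smoother can be sketched for a scalar local level model, where the transition coefficient equals 1. This is an illustrative sketch with hypothetical function names and made-up data; it assumes no missing observations to keep the code short.

```python
def filter_local_level(y, s2_eps, s2_eta, a0, p0):
    """Forward Kalman pass; returns (a_pred, p_pred, a_filt, p_filt) per time step."""
    a, p = a0, p0
    out = []
    for obs in y:
        a_pred, p_pred = a, p + s2_eta          # prediction step
        k = p_pred / (p_pred + s2_eps)          # Kalman gain
        a = a_pred + k * (obs - a_pred)         # update step
        p = (1.0 - k) * p_pred
        out.append((a_pred, p_pred, a, p))
    return out


def rts_smooth(filtered):
    """Backward Rauch-Tung-Striebel pass for a scalar state with transition 1."""
    n = len(filtered)
    a_s, p_s = [0.0] * n, [0.0] * n
    a_s[-1], p_s[-1] = filtered[-1][2], filtered[-1][3]   # last step: smoothed = filtered
    for t in range(n - 2, -1, -1):
        a_f, p_f = filtered[t][2], filtered[t][3]
        a_pred_next, p_pred_next = filtered[t + 1][0], filtered[t + 1][1]
        j = p_f / p_pred_next                              # smoother gain
        a_s[t] = a_f + j * (a_s[t + 1] - a_pred_next)
        p_s[t] = p_f + j * j * (p_s[t + 1] - p_pred_next)
    return a_s, p_s


y = [4.1, 4.4, 3.9, 4.8, 5.2, 5.0]                         # made-up data
filt = filter_local_level(y, s2_eps=1.0, s2_eta=0.5, a0=0.0, p0=10.0)
smoothed_mean, smoothed_var = rts_smooth(filt)
```

Because each smoothed estimate also uses later observations, its variance is never larger than the filtered one; this is what makes retrospective component estimates tighter than real-time ones.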
6. Parameter Estimation
Unknown variances and structural parameters are typically estimated via:
- Maximum Likelihood (MLE) using the prediction-error decomposition
- Expectation-Maximization (EM) algorithms
- Bayesian inference (priors on variance components)
The log-likelihood is computed directly from Kalman filter innovations.
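The prediction-error decomposition can be sketched for a scalar local level model: each innovation contributes one Gaussian log-density term, and the terms add up to the joint log-likelihood. The function name, the made-up data, the near-diffuse prior, and the coarse grid over the level variance are assumptions for illustration; in practice a numerical optimiser would replace the grid.

```python
import math


def local_level_loglik(y, s2_eps, s2_eta, a0, p0):
    """Gaussian log-likelihood of a local level model from Kalman filter innovations."""
    a, p = a0, p0
    loglik = 0.0
    for obs in y:
        a_pred, p_pred = a, p + s2_eta       # prediction step
        v = obs - a_pred                     # innovation
        f = p_pred + s2_eps                  # innovation variance
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(f) + v * v / f)
        k = p_pred / f                       # Kalman gain
        a = a_pred + k * v                   # update step
        p = (1.0 - k) * p_pred
    return loglik


y = [4.1, 4.4, 3.9, 4.8, 5.2, 5.0, 5.9, 6.1]          # made-up data
candidates = [0.01, 0.1, 1.0]                          # illustrative grid for s2_eta
best_s2_eta = max(
    candidates,
    key=lambda q: local_level_loglik(y, s2_eps=1.0, s2_eta=q, a0=0.0, p0=1e6),
)
```

The same quantity is what a maximum likelihood routine would maximise over the variance parameters, and it is also the likelihood term used in Bayesian inference.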
Other considerations
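Many ARIMA models can be viewed as reduced forms (reparameterizations) of STS models; for example, the local level model is equivalent to an ARIMA(0,1,1) process. However: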
- STS offers greater interpretability and modularity
- Bayesian Structural Time Series (BSTS) extends this framework with priors and spike-and-slab regression.
- STS can cope better with nonlinearities in the system being modelled. This is useful at a time when dengue epidemiology is changing in many parts of the world owing to massive anthropogenic influences (direct and indirect) and ongoing climate change. Statistical tools that rely mostly on what has been observed in the past, and that are blind to the mechanisms underlying those forcings, are therefore poorly suited to developing early warning systems (EWS) for future changes in dengue outside the stationary domain.
Strengths and Limitations
Strengths
- Principled data assimilation
- Handles missing/irregular data
- Explicit uncertainty propagation
- Interpretable latent structure
Limitations
- Linear-Gaussian assumption
- State dimension grows with structural complexity
- Nonlinear extensions require extended (EKF) or unscented (UKF) Kalman filters, or particle filters