What is a Linear Dynamical System?
A Linear Dynamical System (LDS) is a mathematical model used to describe how a system evolves over time. These systems are a subset of state-space models in which the hidden state is continuous-valued. What makes these models linear is that the latent dynamics evolve according to a linear function of the previous state. The observations, however, can be related to the hidden state through a nonlinear link function.
At its core, an LDS defines:
- A state transition function: how the internal state evolves from one time step to the next.
- An observation function: how the internal state generates the observed data.
StateSpaceDynamics.LinearDynamicalSystem — Type

LinearDynamicalSystem{T<:Real, S<:AbstractStateModel{T}, O<:AbstractObservationModel{T}}

Represents a unified Linear Dynamical System with customizable state and observation models.

Fields

- state_model::S: The state model (e.g., GaussianStateModel)
- obs_model::O: The observation model (e.g., GaussianObservationModel or PoissonObservationModel)
- latent_dim::Int: Dimension of the latent state
- obs_dim::Int: Dimension of the observations
- fit_bool::Vector{Bool}: Vector indicating which parameters to fit during optimization
The Gaussian Linear Dynamical System
The Gaussian Linear Dynamical System — typically just referred to as an LDS — is a specific type of linear dynamical system where both the state transition and observation functions are linear, and all noise is Gaussian.
The generative model is given by:
\[\begin{aligned} x_t &\sim \mathcal{N}(A x_{t-1}, Q) \\ y_t &\sim \mathcal{N}(C x_t, R) \end{aligned}\]
Where:

- $x_t$ is the hidden state at time $t$
- $y_t$ is the observed data at time $t$
- $A$ is the state transition matrix
- $C$ is the observation matrix
- $Q$ is the process noise covariance
- $R$ is the observation noise covariance
This can equivalently be written in equation form:
\[\begin{aligned} x_t &= A x_{t-1} + \epsilon_t \\ y_t &= C x_t + \eta_t \end{aligned}\]
Where:

- $\epsilon_t \sim \mathcal{N}(0, Q)$ is the process noise
- $\eta_t \sim \mathcal{N}(0, R)$ is the observation noise
StateSpaceDynamics.GaussianStateModel — Type

GaussianStateModel{T<:Real, M<:AbstractMatrix{T}, V<:AbstractVector{T}}

Represents the state model of a Linear Dynamical System with Gaussian noise.

Fields

- A::M: Transition matrix (size latent_dim × latent_dim).
- Q::M: Process noise covariance matrix.
- x0::V: Initial state vector (length latent_dim).
- P0::M: Initial state covariance matrix (size latent_dim × latent_dim).
StateSpaceDynamics.GaussianObservationModel — Type

GaussianObservationModel{T<:Real, M<:AbstractMatrix{T}}

Represents the observation model of a Linear Dynamical System with Gaussian noise.

Fields

- C::M: Observation matrix of size (obs_dim × latent_dim). Maps latent states into observation space.
- R::M: Observation noise covariance of size (obs_dim × obs_dim).
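Putting the two Gaussian components together, a construction might look like the sketch below. The keyword-constructor form and the length of fit_bool are assumptions here, made to mirror the documented field names; check the API reference for exact signatures.

```julia
using StateSpaceDynamics, LinearAlgebra

latent_dim, obs_dim = 2, 3

# State model: x_t ~ N(A x_{t-1}, Q), with initial state N(x0, P0)
A  = [0.99 0.1; -0.1 0.99]
Q  = Matrix(0.01I, latent_dim, latent_dim)
x0 = zeros(latent_dim)
P0 = Matrix(0.1I, latent_dim, latent_dim)

# Observation model: y_t ~ N(C x_t, R)
C = randn(obs_dim, latent_dim)
R = Matrix(0.1I, obs_dim, obs_dim)

# Assumed keyword constructors mirroring the documented field names
state_model = GaussianStateModel(; A=A, Q=Q, x0=x0, P0=P0)
obs_model   = GaussianObservationModel(; C=C, R=R)

lds = LinearDynamicalSystem(;
    state_model=state_model,
    obs_model=obs_model,
    latent_dim=latent_dim,
    obs_dim=obs_dim,
    fit_bool=fill(true, 6),  # which parameters to fit; length is an assumption
)
```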
The Poisson Linear Dynamical System
The Poisson Linear Dynamical System is a variant of the LDS where the observations are modeled as counts. This is useful in fields like neuroscience where we are often interested in modeling spike count data. To relate the spiking data to the Gaussian latent variable, we use a nonlinear link function, specifically the exponential function.
The generative model is given by:
\[\begin{aligned} x_t &\sim \mathcal{N}(A x_{t-1}, Q) \\ y_t &\sim \text{Poisson}(\exp(Cx_t + b)) \end{aligned}\]
Where $b$ is a bias term.
StateSpaceDynamics.PoissonObservationModel — Type

PoissonObservationModel{T<:Real, M<:AbstractMatrix{T}, V<:AbstractVector{T}} <: AbstractObservationModel{T}

Represents the observation model of a Linear Dynamical System with Poisson observations.

Fields

- C::AbstractMatrix{T}: Observation matrix of size (obs_dim × latent_dim). Maps latent states into observation space.
- log_d::AbstractVector{T}: Mean firing rate vector (log space) of length obs_dim.
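To illustrate how log_d enters the likelihood (it plays the role of the bias $b$ in the generative model above), here is a small hypothetical sketch of sampling one time step, using Distributions.jl; the parameter values are arbitrary.

```julia
using Distributions

obs_dim, latent_dim = 10, 2
C     = randn(obs_dim, latent_dim)   # observation matrix
log_d = fill(log(5.0), obs_dim)      # baseline log firing rates (the bias b)
x_t   = randn(latent_dim)            # a latent state at time t

rate = exp.(C * x_t .+ log_d)        # per-dimension Poisson rate
y_t  = rand.(Poisson.(rate))         # spike counts at time t
```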
Sampling from Linear Dynamical Systems
You can generate synthetic data from an LDS, whether hand-constructed or fitted:
Base.rand — Method

Random.rand(lds::LinearDynamicalSystem; tsteps::Int, ntrials::Int)
Random.rand(rng::AbstractRNG, lds::LinearDynamicalSystem; tsteps::Int, ntrials::Int)

Sample from a Linear Dynamical System.
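For example, with the `lds` constructed earlier; this sketch assumes the sample is returned as a tuple of latent states and observations, so check the returned shapes against the API reference.

```julia
using Random

rng = MersenneTwister(42)
# Assumed return convention: latent trajectories and observations
latents, observations = rand(rng, lds; tsteps=100, ntrials=5)
```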
Inference in Linear Dynamical Systems
In StateSpaceDynamics.jl, we directly maximize the complete-data log-likelihood function with respect to the latent states, given the data and the parameters of the model. In other words, the maximum a posteriori (MAP) estimate of the latent state path is:
\[\underset{x}{\text{argmax}} \left[ \log p(x_0) + \sum_{t=2}^T \log p(x_t \mid x_{t-1}) + \sum_{t=1}^T \log p(y_t \mid x_t) \right]\]
This MAP estimation approach has the same computational complexity as traditional Kalman filtering and smoothing — $\mathcal{O}(T)$ — but is significantly more flexible. Notably, it can handle nonlinear observations and non-Gaussian noise while still yielding exact MAP estimates, unlike approximate techniques such as the Extended Kalman Filter (EKF) or Unscented Kalman Filter (UKF).
Newton's Method for Latent State Optimization
To find the MAP trajectory, we iteratively optimize the latent states using Newton's method. The update equation at each iteration is:
\[x^{(i+1)} = x^{(i)} - \left[ \nabla^2 \mathcal{L}(x^{(i)}) \right]^{-1} \nabla \mathcal{L}(x^{(i)})\]
Where:
- $\mathcal{L}(x)$ is the complete-data log-likelihood:
\[\mathcal{L}(x) = \log p(x_0) + \sum_{t=2}^T \log p(x_t \mid x_{t-1}) + \sum_{t=1}^T \log p(y_t \mid x_t)\]
- $\nabla \mathcal{L}(x)$ is the gradient of the full log-likelihood with respect to all latent states
- $\nabla^2 \mathcal{L}(x)$ is the Hessian of the full log-likelihood
This update is performed over the entire latent state sequence $x_{1:T}$, and repeated until convergence.
For Gaussian models, $\mathcal{L}(x)$ is quadratic and Newton's method converges in a single step — recovering the exact Kalman smoother solution. For non-Gaussian models, the Hessian is not constant and multiple iterations are required. However, the MAP estimate can still be computed efficiently with the same approach, since the optimization problem remains convex for log-concave observation models such as the Poisson model above.
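To make the single-step convergence claim concrete, here is a self-contained sketch (not the package's implementation) that builds a toy Gaussian LDS, simulates data, and takes one Newton step on the complete-data log-likelihood using ForwardDiff; all parameter values are arbitrary illustrations.

```julia
using LinearAlgebra, ForwardDiff, Random

Random.seed!(1)
d, p, T = 2, 3, 50                       # latent dim, obs dim, time steps
A  = [0.95 0.05; -0.05 0.95]
Q  = Matrix(0.1I, d, d);  Qinv  = inv(Q)
C  = randn(p, d)
R  = Matrix(0.5I, p, p);  Rinv  = inv(R)
x0 = zeros(d)
P0 = Matrix(1.0I, d, d);  P0inv = inv(P0)

# Simulate latents and observations from the generative model
x = zeros(d, T); y = zeros(p, T)
x[:, 1] = x0 + cholesky(P0).L * randn(d)
for t in 2:T
    x[:, t] = A * x[:, t-1] + cholesky(Q).L * randn(d)
end
y .= C * x + cholesky(R).L * randn(p, T)

# Complete-data log-likelihood of the stacked latent path (up to constants)
function loglik(xs)
    X  = reshape(xs, d, T)
    r1 = X[:, 1] - x0
    ll = -0.5 * dot(r1, P0inv * r1)
    for t in 2:T
        r = X[:, t] - A * X[:, t-1]
        ll -= 0.5 * dot(r, Qinv * r)
    end
    for t in 1:T
        r = y[:, t] - C * X[:, t]
        ll -= 0.5 * dot(r, Rinv * r)
    end
    return ll
end

# One Newton step from the origin: for this quadratic objective it lands
# exactly on the MAP path, i.e., the Kalman smoother mean.
xs = zeros(d * T)
g  = ForwardDiff.gradient(loglik, xs)
H  = ForwardDiff.hessian(loglik, xs)
x_map = reshape(xs - H \ g, d, T)
```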
StateSpaceDynamics.smooth — Function

smooth(lds, y)

Performs direct smoothing for a linear dynamical system (LDS) given the system parameters and the observed data for a single trial.

Arguments

- lds::LinearDynamicalSystem{T,S,O}: The LDS object representing the system parameters.
- y::AbstractMatrix{T}: The observed data matrix.
- w::Union{Nothing,AbstractVector{T}}: Coefficients to weight the data.

Returns

- x::AbstractMatrix{T}: The optimal state estimate.
- p_smooth::Array{T,3}: The posterior covariance matrices.
- inverse_offdiag::Array{T,3}: The inverse off-diagonal matrices.
- Q_val::T: The Q-function value.
smooth(lds, y)

Performs direct smoothing for a linear dynamical system (LDS) given the system parameters and the observed data for multiple trials.

Arguments

- lds::LinearDynamicalSystem{T,S,O}: The LDS object representing the system parameters.
- y::AbstractArray{T,3}: The observed data array with dimensions (obs_dim, tsteps, ntrials).

Returns

- x::AbstractArray{T,3}: The optimal state estimates with dimensions (ntrials, tsteps, latent_dim).
- p_smooth::AbstractArray{T,4}: The posterior covariance matrices with dimensions (latent_dim, latent_dim, tsteps, ntrials).
- inverse_offdiag::AbstractArray{T,4}: The inverse off-diagonal matrices with dimensions (latent_dim, latent_dim, tsteps, ntrials).
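A typical single-trial call, assuming the `lds` constructed earlier and an observation matrix `y` of size obs_dim × tsteps, with return values as documented above:

```julia
x_map, p_smooth, inverse_offdiag, Q_val = smooth(lds, y)
```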
Laplace Approximation of Posterior for Non-Conjugate Observation Models
For Gaussian observations (which are conjugate with the Gaussian state model), the posterior over the latent states is itself Gaussian and can be computed exactly. For non-Gaussian observations, we instead use a Laplace approximation: the posterior is approximated by a Gaussian centered at the MAP estimate of the latent states. This approximation is given by:
\[p(x \mid y) \approx \mathcal{N}(x^{*}, -\left[ \nabla^2 \mathcal{L}(x^{*}) \right]^{-1})\]
Where:
- $x^{*}$ is the MAP estimate of the latent states
- $\nabla^2 \mathcal{L}(x^{*})$ is the Hessian of the log-likelihood at the MAP estimate
Despite the requirement of inverting a Hessian of dimension $(d \times T) \times (d \times T)$, this is still computationally efficient, as the Markov structure of the model renders the Hessian block-tridiagonal, and thus the inversion is tractable.
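For the Gaussian model above, the blocks of the Hessian can be written down explicitly (a standard derivation, shown here for intuition; we index the first state as $x_1$ with prior $\mathcal{N}(x_0, P_0)$):

\[-\nabla^2 \mathcal{L}(x) = \begin{pmatrix} D_1 & B^\top & & \\ B & D_2 & B^\top & \\ & \ddots & \ddots & \ddots \\ & & B & D_T \end{pmatrix}, \qquad B = -Q^{-1}A\]

where $D_1 = P_0^{-1} + A^\top Q^{-1} A + C^\top R^{-1} C$, $D_t = Q^{-1} + A^\top Q^{-1} A + C^\top R^{-1} C$ for $1 < t < T$, and $D_T = Q^{-1} + C^\top R^{-1} C$. A block-tridiagonal (block-Thomas) solve against this matrix costs $\mathcal{O}(T d^3)$ for latent dimension $d$, rather than the $\mathcal{O}(T^3 d^3)$ of a dense inversion.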
Learning in Linear Dynamical Systems
Given the latent structure of state-space models, we must rely on either the Expectation-Maximization (EM) or Variational Inference (VI) approaches to learn the parameters of the model. StateSpaceDynamics.jl supports both EM and VI. For LDS models, we can use Laplace EM, where we approximate the posterior of the latent state path using the Laplace approximation as outlined above. Using these approximate posteriors (or exact ones in the Gaussian case), we can apply closed-form updates for the model parameters.
StateSpaceDynamics.fit! — Method

fit!(lds, y; max_iter::Int=1000, tol::Real=1e-12)
where {T<:Real, S<:GaussianStateModel{T}, O<:GaussianObservationModel{T}}

Fit a Linear Dynamical System using the Expectation-Maximization (EM) algorithm with Kalman smoothing over multiple trials.

Arguments

- lds::LinearDynamicalSystem{T,S,O}: The Linear Dynamical System to be fitted.
- y::AbstractArray{T,3}: Observed data, size (obs_dim, tsteps, ntrials).

Keyword Arguments

- max_iter::Int=1000: Maximum number of EM iterations.
- tol::T=1e-12: Convergence tolerance for log-likelihood change.

Returns

- mls::Vector{T}: Vector of log-likelihood values for each iteration.
- param_diff::Vector{T}: Vector of parameter deltas for each EM iteration.
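A typical call, assuming the `lds` constructed earlier and multi-trial data `y` of size (obs_dim, tsteps, ntrials):

```julia
mls, param_diff = fit!(lds, y; max_iter=100, tol=1e-6)
```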