What is a Linear Dynamical System?

A Linear Dynamical System (LDS) is a mathematical model used to describe how a system evolves over time. These systems are a subset of state-space models, where the hidden state dynamics are continuous. What makes these models linear is that the latent dynamics evolve according to a linear function of the previous state. The observations, however, can be related to the hidden state through a nonlinear link function.

At its core, an LDS defines:

  • A state transition function: how the internal state evolves from one time step to the next.
  • An observation function: how the internal state generates the observed data.
StateSpaceDynamics.LinearDynamicalSystemType
LinearDynamicalSystem{T<:Real, S<:AbstractStateModel{T}, O<:AbstractObservationModel{T}}

Represents a unified Linear Dynamical System with customizable state and observation models.

Fields

  • state_model::S: The state model (e.g., GaussianStateModel)
  • obs_model::O: The observation model (e.g., GaussianObservationModel or PoissonObservationModel)
  • latent_dim::Int: Dimension of the latent state
  • obs_dim::Int: Dimension of the observations
  • fit_bool::Vector{Bool}: Vector indicating which parameters to fit during optimization
source

The Gaussian Linear Dynamical System

The Gaussian Linear Dynamical System — typically just referred to as an LDS — is a specific type of linear dynamical system where both the state transition and observation functions are linear, and all noise is Gaussian.

The generative model is given by:

\[\begin{aligned} x_t &\sim \mathcal{N}(A x_{t-1}, Q) \\ y_t &\sim \mathcal{N}(C x_t, R) \end{aligned}\]

Where:

  • x_t is the hidden state at time t
  • y_t is the observed data at time t
  • A is the state transition matrix
  • C is the observation matrix
  • Q is the process noise covariance
  • R is the observation noise covariance

This can equivalently be written in equation form:

\[\begin{aligned} x_t &= A x_{t-1} + \epsilon_t \\ y_t &= C x_t + \eta_t \end{aligned}\]

Where:

  • ε_t ~ N(0, Q) is the process noise
  • η_t ~ N(0, R) is the observation noise
StateSpaceDynamics.GaussianStateModelType
GaussianStateModel{T<:Real. M<:AbstractMatrix{T}, V<:AbstractVector{T}}}

Represents the state model of a Linear Dynamical System with Gaussian noise.

Fields

  • A::M: Transition matrix (size latent_dim×latent_dim).
  • Q::M: Process noise covariance matrix
  • x0::V: Initial state vector (length latent_dim).
  • P0::M: Initial state covariance matrix (size `latentdim×latentdim
source
StateSpaceDynamics.GaussianObservationModelType
GaussianObservationModel{T<:Real, M<:AbstractMatrix{T}}

Represents the observation model of a Linear Dynamical System with Gaussian noise.

Fields

  • C::M: Observation matrix of size (obs_dim × latent_dim). Maps latent states into observation space.
  • R::M: Observation noise covariance of size (obs_dim × obs_dim).
source

The Poisson Linear Dynamical System

The Poisson Linear Dynamical System is a variant of the LDS where the observations are modeled as counts. This is useful in fields like neuroscience where we are often interested in modeling spike count data. To relate the spiking data to the Gaussian latent variable, we use a nonlinear link function, specifically the exponential function.

The generative model is given by:

\[\begin{aligned} x_t &\sim \mathcal{N}(A x_{t-1}, Q) \\ y_t &\sim \text{Poisson}(\exp(Cx_t + b)) \end{aligned}\]

Where b is a bias term.

StateSpaceDynamics.PoissonObservationModelType
PoissonObservationModel{T<:Real, M<:AbstractMatrix{T}, V<:AbstractVector{T}} <: AbstractObservationModel{T}

Represents the observation model of a Linear Dynamical System with Poisson observations.

Fields

  • C::AbstractMatrix{T}: Observation matrix of size (obs_dim × latent_dim). Maps latent states into observation space.
  • log_d::AbstractVector{T}: Mean firing rate vector (log space) of size (obs_dim × obs_dim).
source

Sampling from Linear Dynamical Systems

You can generate synthetic data from fitted LDS models:

Base.randMethod
Random.rand(lds::LinearDynamicalSystem; tsteps::Int, ntrials::Int)
Random.rand(rng::AbstractRNG, lds::LinearDynamicalSystem; tsteps::Int, ntrials::Int)

Sample from a Linear Dynamical System.

source

Inference in Linear Dynamical Systems

In StateSpaceDynamics.jl, we directly maximize the complete-data log-likelihood function with respect to the latent states given the data and the parameters of the model. In other words, the maximum a priori (MAP) estimate of the latent state path is:

\[\underset{x}{\text{argmax}} \left[ \log p(x_0) + \sum_{t=2}^T \log p(x_t \mid x_{t-1}) + \sum_{t=1}^T \log p(y_t \mid x_t) \right]\]

This MAP estimation approach has the same computational complexity as traditional Kalman filtering and smoothing — $\mathcal{O}(T)$ — but is significantly more flexible. Notably, it can handle nonlinear observations and non-Gaussian noise while still yielding exact MAP estimates, unlike approximate techniques such as the Extended Kalman Filter (EKF) or Unscented Kalman Filter (UKF).

Newton's Method for Latent State Optimization

To find the MAP trajectory, we iteratively optimize the latent states using Newton's method. The update equation at each iteration is:

\[x^{(i+1)} = x^{(i)} - \left[ \nabla^2 \mathcal{L}(x^{(i)}) \right]^{-1} \nabla \mathcal{L}(x^{(i)})\]

Where:

  • $\mathcal{L}(x)$ is the complete-data log-likelihood:

\[\mathcal{L}(x) = \log p(x_0) + \sum_{t=2}^T \log p(x_t \mid x_{t-1}) + \sum_{t=1}^T \log p(y_t \mid x_t)\]

  • $\nabla \mathcal{L}(x)$ is the gradient of the full log-likelihood with respect to all latent states
  • $\nabla^2 \mathcal{L}(x)$ is the Hessian of the full log-likelihood

This update is performed over the entire latent state sequence $x_{1:T}$, and repeated until convergence.

For Gaussian models, $\mathcal{L}(x)$ is quadratic and Newton's method converges in a single step — recovering the exact Kalman smoother solution. For non-Gaussian models, the Hessian is not constant and the optimization is more complex. However, the MAP estimate can still be computed efficiently using the same approach as the optimization problem is still convex.

StateSpaceDynamics.smoothFunction
smooth(lds, y)

This function performs direct smoothing for a linear dynamical system (LDS) given the system parameters and the observed data for a single trial

Arguments

  • lds::LinearDynamicalSystem{T,S,O}: The LDS object representing the system parameters.
  • y::AbstractMatrix{T}: The observed data matrix.
  • w::Union{Nothing,AbstractVector{T}}: coeffcients to weight the data.

Returns

  • x::AbstractMatrix{T}: The optimal state estimate.
  • p_smooth::Array{T, 3}: The posterior covariance matrix.
  • inverse_offdiag::Array{T, 3}: The inverse off-diagonal matrix.
  • Q_val::T: The Q-function value.
source
smooth(lds, y)

This function performs direct smoothing for a linear dynamical system (LDS) given the system parameters and the observed data for multiple trials.

Arguments

  • lds::LinearDynamicalSystem{T,S,O}: The LDS object representing the system parameters.
  • y::AbstractArray{T,3}: The observed data array with dimensions (obs_dim, tsteps, ntrials).

Returns

  • x::AbstractArray{T,3}: The optimal state estimates with dimensions (ntrials, tsteps, latent_dim).
  • p_smooth::AbstractArray{T,4}: The posterior covariance matrices with dimensions (latentdim, latentdim, tsteps, ntrials).
  • inverse_offdiag::AbstractArray{T,4}: The inverse off-diagonal matrices with dimensions (latentdim, latentdim, tsteps, ntrials).
source

Laplace Approximation of Posterior for Non-Conjugate Observation Models

In the case of non-Gaussian observations, we can use a Laplace approximation to compute the posterior distribution of the latent states. For Gaussian observations (which are conjugate with the Gaussian state model), the posterior is also Gaussian and is the exact posterior. However, for non-Gaussian observations, we can approximate the posterior using a Gaussian distribution centered at the MAP estimate of the latent states. This approximation is given by:

\[p(x \mid y) \approx \mathcal{N}(x^{*}, -\left[ \nabla^2 \mathcal{L}(x^{*}) \right]^{-1})\]

Where:

  • $x^{*}$ is the MAP estimate of the latent states
  • $\nabla^2 \mathcal{L}(x^{*})$ is the Hessian of the log-likelihood at the MAP estimate

Despite the requirement of inverting a Hessian of dimension $(d \times T) \times (d \times T)$, this is still computationally efficient, as the Markov structure of the model renders the Hessian block-tridiagonal, and thus the inversion is tractable.

Learning in Linear Dynamical Systems

Given the latent structure of state-space models, we must rely on either the Expectation-Maximization (EM) or Variational Inference (VI) approaches to learn the parameters of the model. StateSpaceDynamics.jl supports both EM and VI. For LDS models, we can use Laplace EM, where we approximate the posterior of the latent state path using the Laplace approximation as outlined above. Using these approximate posteriors (or exact ones in the Gaussian case), we can apply closed-form updates for the model parameters.

StateSpaceDynamics.fit!Method
fit!(lds, y; max_iter::Int=1000, tol::Real=1e-12) 
where {T<:Real, S<:GaussianStateModel{T}, O<:GaussianObservationModel{T}}

Fit a Linear Dynamical System using the Expectation-Maximization (EM) algorithm with Kalman smoothing over multiple trials

Arguments

  • lds::LinearDynamicalSystem{T,S,O}: The Linear Dynamical System to be fitted.
  • y::AbstractArray{T,3}: Observed data, size(obsdim, Tsteps, n_trials)

Keyword Arguments

  • max_iter::Int=1000: Maximum number of EM iterations.
  • tol::T=1e-12: Convergence tolerance for log-likelihood change.

Returns

  • mls::Vector{T}: Vector of log-likelihood values for each iteration.
  • param_diff::Vector{T}: Vector of parameter deltas for each EM iteration.
source