Mixture Models

A mixture model is a probability distribution that, for a finite $k > 0$, draws each sample from one of $k$ component distributions $\{f_i(x) \mid i \in \{1, \dots, k\}\}$ chosen at random, where the probability of sampling from $f_i(x)$ is $\pi_i$. In general, a mixture model is written as:

\[f_{mix}(x; \Theta, \pi) = \sum_{i=1}^{k} \pi_i f_i(x)\]

Here $f_i(x)$ is called the $i$th component and $\pi_i$ is called the $i$th mixing coefficient. The mixing coefficients satisfy $\pi_i \ge 0$ and $\sum_{i=1}^{k} \pi_i = 1$.
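
The sampling description above can be sketched in a few lines of Julia. This is a minimal illustration that uses only the standard library and assumes a one-dimensional Gaussian mixture; the helpers sample_component and sample_mixture are hypothetical names introduced here and are not part of StateSpaceDynamics.

```julia
using Random

# Hypothetical helper (not part of StateSpaceDynamics): draw one component
# index i with probability weights[i] by walking the cumulative sum.
function sample_component(rng::AbstractRNG, weights::Vector{Float64})
    u = rand(rng)
    cumulative = 0.0
    for (i, w) in enumerate(weights)
        cumulative += w
        u <= cumulative && return i
    end
    return length(weights)  # guard against floating-point round-off
end

# Hypothetical helper: draw n samples from a 1-D Gaussian mixture by first
# choosing a component, then sampling from that component.
function sample_mixture(rng::AbstractRNG, weights, means, stds, n::Int)
    samples = Vector{Float64}(undef, n)
    labels = Vector{Int}(undef, n)
    for j in 1:n
        i = sample_component(rng, weights)
        labels[j] = i
        samples[j] = means[i] + stds[i] * randn(rng)
    end
    return samples, labels
end

rng = MersenneTwister(1234)
weights = [0.3, 0.7]   # mixing coefficients, must sum to 1
means = [-5.0, 5.0]    # component means
stds = [1.0, 1.0]      # component standard deviations
samples, labels = sample_mixture(rng, weights, means, stds, 10_000)

# The fraction of draws from each component should be close to its weight.
println(count(==(1), labels) / length(labels))
```

With enough draws, the empirical fraction of samples assigned to each component approaches the corresponding mixing coefficient.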

Gaussian Mixture Model

StateSpaceDynamics.GaussianMixtureModel - Type
GaussianMixtureModel

A Gaussian Mixture Model for clustering and density estimation.

Fields

  • k::Int: Number of clusters.
  • μₖ::Matrix{<:Real}: Means of each cluster (dimensions: data_dim x k).
  • Σₖ::Array{Matrix{<:Real}, 1}: Covariance matrices of each cluster.
  • πₖ::Vector{Float64}: Mixing coefficients for each cluster.

Examples

gmm = GaussianMixtureModel(3, 2) # Create a Gaussian Mixture Model with 3 clusters and 2-dimensional data
fit!(gmm, data)

StateSpaceDynamics.GaussianMixtureModel - Method
GaussianMixtureModel(k::Int, data_dim::Int)

Constructor for GaussianMixtureModel. Initializes Σₖ's covariance matrices to the identity, πₖ to a uniform distribution, and μₖ's means to zeros.

StateSpaceDynamics.fit! - Method
fit!(gmm::GaussianMixtureModel, data::Matrix{<:Real}; <keyword arguments>)

Fits a Gaussian Mixture Model (GMM) to the given data using the Expectation-Maximization (EM) algorithm.

Arguments

  • gmm::GaussianMixtureModel: The Gaussian Mixture Model to be fitted.
  • data::Matrix{<:Real}: The dataset on which the model will be fitted, where each row represents a data point.
  • maxiter::Int=50: The maximum number of iterations for the EM algorithm (default: 50).
  • tol::Float64=1e-3: The tolerance for convergence. The algorithm stops if the change in log-likelihood between iterations is less than this value (default: 1e-3).
  • initialize_kmeans::Bool=false: If true, initializes the means of the GMM using K-means++ initialization (default: false).

Returns

  • class_probabilities: A matrix where each entry (i, k) represents the probability of the i-th data point belonging to the k-th component of the mixture model.
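
For reference, the standard EM updates for a Gaussian mixture can be written as follows. This is the textbook form, assuming $N$ data points $x_n$ and keeping $i$ as the component index; the symbols $\gamma_{ni}$ and $N_i$ are introduced here for illustration, and the package's implementation may differ in details such as regularization.

E-step (responsibilities, which correspond to the entries of class_probabilities):

\[\gamma_{ni} = \frac{\pi_i \, \mathcal{N}(x_n \mid \mu_i, \Sigma_i)}{\sum_{j=1}^{k} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}\]

M-step:

\[N_i = \sum_{n=1}^{N} \gamma_{ni}, \qquad \pi_i = \frac{N_i}{N}, \qquad \mu_i = \frac{1}{N_i} \sum_{n=1}^{N} \gamma_{ni} x_n\]

\[\Sigma_i = \frac{1}{N_i} \sum_{n=1}^{N} \gamma_{ni} (x_n - \mu_i)(x_n - \mu_i)^\top\]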

Example

data = rand(100, 2)  # 100 random data points with 2 features (one data point per row)
gmm = GaussianMixtureModel(3, 2)  # Initialize a GMM with 3 components and 2-dimensional data
class_probabilities = fit!(gmm, data; maxiter=100, tol=1e-4, initialize_kmeans=true)

StateSpaceDynamics.log_likelihood - Method
log_likelihood(gmm::GaussianMixtureModel, data::Matrix{<:Real})

Compute the log-likelihood of the data given the Gaussian Mixture Model (GMM). The data matrix should be of shape (# observations, # features).

Returns

  • Float64: The log-likelihood of the data given the model.
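
Assuming the standard definition for independent observations, the log-likelihood of $N$ data points $x_n$ under a Gaussian mixture with $k$ components is:

\[\log L = \sum_{n=1}^{N} \log \sum_{i=1}^{k} \pi_i \, \mathcal{N}(x_n \mid \mu_i, \Sigma_i)\]

where, for $d$-dimensional data, each component density is

\[\mathcal{N}(x \mid \mu, \Sigma) = (2\pi)^{-d/2} \, |\Sigma|^{-1/2} \exp\left(-\tfrac{1}{2} (x - \mu)^\top \Sigma^{-1} (x - \mu)\right)\]

In the density, $\pi$ without a subscript is the mathematical constant, not a mixing coefficient. The symbols $N$ and $d$ are introduced here for illustration.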

Poisson Mixture Model

StateSpaceDynamics.PoissonMixtureModel - Type
PoissonMixtureModel

A Poisson Mixture Model for clustering and density estimation.

Fields

  • k::Int: Number of Poisson-distributed clusters.
  • λₖ::Vector{Float64}: Rate parameter (mean) of each cluster.
  • πₖ::Vector{Float64}: Mixing coefficients for each cluster.

Examples

pmm = PoissonMixtureModel(3) # Create a Poisson Mixture Model with 3 clusters
fit!(pmm, data)

StateSpaceDynamics.fit! - Method
fit!(pmm::PoissonMixtureModel, data::Matrix{Int}; <keyword arguments>)

Fits a Poisson Mixture Model (PMM) to the given data using the Expectation-Maximization (EM) algorithm.

Arguments

  • pmm::PoissonMixtureModel: The Poisson Mixture Model to be fitted.
  • data::Matrix{Int}: The dataset on which the model will be fitted, where each row represents a data point.
  • maxiter::Int=50: The maximum number of iterations for the EM algorithm (default: 50).
  • tol::Float64=1e-3: The tolerance for convergence. The algorithm stops if the change in log-likelihood between iterations is less than this value (default: 1e-3).
  • initialize_kmeans::Bool=false: If true, initializes the means of the PMM using K-means++ initialization (default: false).

Returns

  • class_probabilities: A matrix where each entry (i, k) represents the probability of the i-th data point belonging to the k-th component of the mixture model.
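
To make the EM procedure concrete, here is a minimal sketch of EM for a Poisson mixture on one-dimensional count data, using only Base Julia. The functions log_poisson, em_step, and fit_poisson_mixture are hypothetical names introduced for illustration; this is the textbook algorithm under those assumptions, not the package's implementation, and it omits the convergence check controlled by tol.

```julia
# Log of the Poisson pmf, log(λ^x * exp(-λ) / x!), using only Base functions.
log_poisson(x::Int, λ::Float64) = x * log(λ) - λ - sum(log, 1:x; init=0.0)

# One EM iteration for a 1-D Poisson mixture (hypothetical helper).
# Returns updated rates, updated weights, and the responsibility matrix.
function em_step(x::Vector{Int}, λ::Vector{Float64}, w::Vector{Float64})
    n, k = length(x), length(λ)
    γ = Matrix{Float64}(undef, n, k)
    # E-step: γ[i, j] is proportional to w[j] * Poisson(x[i]; λ[j]),
    # computed in log space and normalized over j.
    for i in 1:n
        for j in 1:k
            γ[i, j] = log(w[j]) + log_poisson(x[i], λ[j])
        end
        m = maximum(view(γ, i, :))
        γ[i, :] .= exp.(γ[i, :] .- m)
        γ[i, :] ./= sum(view(γ, i, :))
    end
    # M-step: responsibility-weighted mean of the data and average responsibility.
    nk = vec(sum(γ; dims=1))
    λ_new = vec(sum(γ .* x; dims=1)) ./ nk
    w_new = nk ./ n
    return λ_new, w_new, γ
end

# Run a fixed number of EM iterations (no tolerance check in this sketch).
function fit_poisson_mixture(x, λ, w; maxiter=20)
    γ = zeros(length(x), length(λ))
    for _ in 1:maxiter
        λ, w, γ = em_step(x, λ, w)
    end
    return λ, w, γ
end

x = [0, 1, 2, 1, 0, 9, 11, 10, 12, 8]   # two visibly separated groups of counts
λ_fit, w_fit, γ_fit = fit_poisson_mixture(x, [1.0, 8.0], [0.5, 0.5])
println(λ_fit)
println(w_fit)
```

Each row of γ_fit sums to 1 and plays the role of one row of class_probabilities: the probability that the corresponding data point belongs to each component.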

Example

data = rand(1:10, 100, 1)  # Generate 100 random integer data points (one per row)
pmm = PoissonMixtureModel(3)  # Initialize a PMM with 3 components
class_probabilities = fit!(pmm, data; maxiter=100, tol=1e-4, initialize_kmeans=true)

StateSpaceDynamics.log_likelihood - Method
log_likelihood(pmm::PoissonMixtureModel, data::Matrix{Int})

Compute the log-likelihood of the data given the Poisson Mixture Model (PMM). The data matrix should be of shape (# observations, # features).

Returns

  • Float64: The log-likelihood of the data given the model.
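
As an illustration of what such a log-likelihood computes, the sketch below evaluates the sum over observations of the log mixture density for one-dimensional counts, using the log-sum-exp trick for numerical stability. It assumes the standard definition; poisson_mixture_loglik, logsumexp, and log_poisson are hypothetical helpers written here with Base Julia only and are not the package's functions.

```julia
# Log of the Poisson pmf using only Base functions.
log_poisson(x::Integer, λ::Real) = x * log(λ) - λ - sum(log, 1:x; init=0.0)

# Numerically stable log(sum(exp.(v))).
function logsumexp(v::AbstractVector{<:Real})
    m = maximum(v)
    return m + log(sum(exp.(v .- m)))
end

# Hypothetical stand-in for a mixture log-likelihood on 1-D count data:
# sum over observations of log(sum_j w[j] * Poisson(x; λ[j])).
function poisson_mixture_loglik(x::AbstractVector{<:Integer}, λ, w)
    return sum(
        logsumexp([log(w[j]) + log_poisson(xi, λ[j]) for j in eachindex(λ)])
        for xi in x
    )
end

x = [0, 3, 7]
single = poisson_mixture_loglik(x, [2.0], [1.0])             # one component
split = poisson_mixture_loglik(x, [2.0, 2.0], [0.4, 0.6])    # same rate, split weight
println(single)
println(split)
```

Splitting a single component into two components with the same rate leaves the mixture density unchanged, so both calls should agree up to floating-point error.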

StateSpaceDynamics.sample - Method
sample(pmm::PoissonMixtureModel, n)

Draw 'n' samples from pmm. Returns a Vector{Int} of length n.