Probabilistic Models on Matrix Manifolds

This article looks at some parametric families of probability distributions on matrix manifolds. The default reference is [@Chikuse2003]; other main references include [@Dryden2016; @Bhattacharya2012; @Patrangenaru2015; @Srivastava2016]. In this article, |A| denotes the determinant of a square matrix A; $N_n$ denotes the distribution of an n-dimensional Gaussian random vector.

Matrix manifolds

Matrix-variate standard Gaussian distribution $N_{n,k}(0; I_n, I_k)$ on the n-by-k matrix manifold $M_{n,k}$ [Sec 1.5] is the nk-dimensional standard Gaussian vector stacked into a matrix: let $\text{vec}(Z) \sim N_{nk}(0, I_{nk})$, then $Z \sim N_{n,k}(0; I_n, I_k)$. PDF: (1) $\phi(Z) = z^{-1} \exp(-\text{tr}(Z^T Z) / 2)$, where normalizing constant $z = (2 \pi)^{kn/2}$; (2) because the Frobenius norm is related to the trace via $\|Z\|_F^2 = \text{tr}(Z^T Z)$, we have $\phi(Z) = z^{-1} \exp(-\|Z\|_F^2 / 2)$.
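The two PDF forms above can be checked numerically. The following is a minimal sketch, assuming NumPy and a column-stacking `vec`; the function name `std_matrix_gaussian_logpdf` is illustrative and not taken from the references.

```python
import numpy as np

def std_matrix_gaussian_logpdf(Z):
    """Log-density of N_{n,k}(0; I_n, I_k) at an n-by-k matrix Z, form (1)."""
    n, k = Z.shape
    log_z = 0.5 * n * k * np.log(2.0 * np.pi)  # log of (2 pi)^{kn/2}
    return -0.5 * np.trace(Z.T @ Z) - log_z

rng = np.random.default_rng(0)
n, k = 4, 3

# Draw vec(Z) ~ N_{nk}(0, I_{nk}) and stack it column by column into Z.
vec_z = rng.standard_normal(n * k)
Z = vec_z.reshape((n, k), order="F")

# Form (2): the same value written with the squared Frobenius norm.
logpdf_frobenius = (-0.5 * np.linalg.norm(Z, "fro") ** 2
                    - 0.5 * n * k * np.log(2.0 * np.pi))

# Sum of nk independent univariate standard Gaussian log-densities.
logpdf_vector = np.sum(-0.5 * vec_z ** 2 - 0.5 * np.log(2.0 * np.pi))
```

All three quantities agree up to floating-point error, which reflects that the matrix-variate standard Gaussian is just the standard Gaussian vector reshaped.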

Matrix-variate Gaussian distribution $N_{n,k}(M; \Sigma_n, \Sigma_k)$ on the n-by-k matrix manifold, where $M \in M_{n,k}$, $\Sigma_n \in \mathcal{S}_+(n)$, and $\Sigma_k \in \mathcal{S}_+(k)$, is the distribution of a transformed matrix-variate standard Gaussian: $Y = \Sigma_n^{1/2} Z \Sigma_k^{1/2} + M$, where $Z \sim N_{n,k}(0; I_n, I_k)$. This can be written in a vectorized form, with $\text{vec}$ stacking the columns of a matrix, as: $\text{vec}(Y) = (\Sigma_k^{1/2} \otimes \Sigma_n^{1/2}) \text{vec}(Z) + \text{vec}(M)$, where $\otimes$ is the Kronecker product. Note that this is a special form of the nk-dimensional Gaussian vector $N_{nk}(\mu, \Sigma)$, with mean $\mu = \text{vec}(M)$ and covariance matrix $\Sigma = \Sigma_k \otimes \Sigma_n \in \mathcal{S}_+(nk)$. PDF: (1) relation with the matrix-variate standard Gaussian PDF: $p(Y; M, \Sigma_n, \Sigma_k) = |\Sigma_n|^{-k/2} |\Sigma_k|^{-n/2} \phi(\Sigma_n^{-1/2} (Y - M) \Sigma_k^{-1/2})$; (2) explicit form: $p(Y; M, \Sigma_n, \Sigma_k) = z^{-1} \exp\{-\text{tr}[\Sigma_n^{-1} (Y - M) \Sigma_k^{-1} (Y - M)^T] / 2\}$, where normalizing constant $z = (2 \pi)^{kn/2} |\Sigma_n|^{k/2} |\Sigma_k|^{n/2}$.
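As a sanity check on the Kronecker structure, here is a minimal sketch that samples $Y$ by the transformation above, rebuilds $\text{vec}(Y)$ from the vectorized form, and compares the explicit matrix PDF with the nk-dimensional Gaussian PDF. It assumes NumPy, a column-stacking `vec`, and symmetric square roots; the helper names `spd_sqrt`, `vec`, and `matrix_gaussian_logpdf` are illustrative only.

```python
import numpy as np

def spd_sqrt(S):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

def vec(X):
    """Stack the columns of X into one vector (column-major order)."""
    return X.reshape(-1, order="F")

def matrix_gaussian_logpdf(Y, M, Sigma_n, Sigma_k):
    """Explicit form (2): log p(Y; M, Sigma_n, Sigma_k)."""
    n, k = Y.shape
    R = Y - M
    # tr[Sigma_n^{-1} (Y - M) Sigma_k^{-1} (Y - M)^T]
    quad = np.trace(np.linalg.solve(Sigma_n, R) @ np.linalg.solve(Sigma_k, R.T))
    _, logdet_n = np.linalg.slogdet(Sigma_n)
    _, logdet_k = np.linalg.slogdet(Sigma_k)
    log_z = 0.5 * (n * k * np.log(2.0 * np.pi) + k * logdet_n + n * logdet_k)
    return -0.5 * quad - log_z

rng = np.random.default_rng(1)
n, k = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((k, k))
Sigma_n = A @ A.T + n * np.eye(n)  # arbitrary positive-definite test matrices
Sigma_k = B @ B.T + k * np.eye(k)
M = rng.standard_normal((n, k))

# Sample by transforming a matrix-variate standard Gaussian.
Z = rng.standard_normal((n, k))
Y = spd_sqrt(Sigma_n) @ Z @ spd_sqrt(Sigma_k) + M

# Vectorized form with the Kronecker product.
vec_Y = np.kron(spd_sqrt(Sigma_k), spd_sqrt(Sigma_n)) @ vec(Z) + vec(M)

# Log-density of the nk-dimensional Gaussian with covariance Sigma_k (x) Sigma_n.
Sigma = np.kron(Sigma_k, Sigma_n)
r = vec(Y) - vec(M)
_, logdet = np.linalg.slogdet(Sigma)
logpdf_vector = -0.5 * (r @ np.linalg.solve(Sigma, r)
                        + n * k * np.log(2.0 * np.pi) + logdet)
```

Working with the two small factors avoids forming the nk-by-nk covariance, which is the practical reason to prefer the matrix form when n and k are large.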

Positive-definite manifold

Wishart distribution $W_n(m, \Sigma)$ on positive-definite manifold $\mathcal{S}_+(n)$, where degrees of freedom $m \in \{n, n+1, \cdots\}$ and covariance matrix $\Sigma \in \mathcal{S}_+(n)$, is the distribution of the scatter matrix (i.e. unnormalized sample covariance matrix) of m zero-mean Gaussian random vectors [@Wishart1928]: if the columns of the n-by-m matrix $Y = (y_1, \dots, y_m)$ are a random sample of $y \sim N_n(0, \Sigma)$, then $Y Y^T \sim W_n(m, \Sigma)$. Notice that $Y \sim N_{n,m}(0; \Sigma, I_m)$ and $Y^T \sim N_{m,n}(0; I_m, \Sigma)$ are special forms of matrix-variate Gaussians. PDF: (1) $p(S; m, \Sigma) = z^{-1} |S|^{(m-n-1)/2} \exp(-\text{tr}(\Sigma^{-1} S)/2)$, where normalizing constant $z = 2^{mn/2} \Gamma_n(m/2) |\Sigma|^{m/2}$ and $\Gamma_n$ is the multivariate gamma function; (2) because $\text{tr}(\Sigma^{-1} S) = \|\Sigma^{-1/2} S^{1/2}\|_F^2$, we have $p(S; m, \Sigma) = z^{-1} |S|^{(m-n-1)/2} \exp(-\|\Sigma^{-1/2} S^{1/2}\|_F^2 / 2)$.
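The construction $Y Y^T$ and PDF form (1) can be sketched as follows, assuming NumPy and the standard product formula for the multivariate gamma function, $\Gamma_n(a) = \pi^{n(n-1)/4} \prod_{j=1}^n \Gamma(a - (j-1)/2)$, which is not stated in the text above. The function names are illustrative.

```python
import math
import numpy as np

def log_multivariate_gamma(a, n):
    """log Gamma_n(a), using the product formula over n ordinary gamma functions."""
    return (0.25 * n * (n - 1) * math.log(math.pi)
            + sum(math.lgamma(a - 0.5 * j) for j in range(n)))

def wishart_logpdf(S, m, Sigma):
    """Form (1): log p(S; m, Sigma) for positive-definite S."""
    n = S.shape[0]
    _, logdet_S = np.linalg.slogdet(S)
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    log_z = (0.5 * m * n * math.log(2.0)
             + log_multivariate_gamma(0.5 * m, n)
             + 0.5 * m * logdet_Sigma)
    return (0.5 * (m - n - 1) * logdet_S
            - 0.5 * np.trace(np.linalg.solve(Sigma, S))
            - log_z)

def sample_wishart(m, Sigma, rng):
    """Draw Y Y^T, where the m columns of Y are i.i.d. N_n(0, Sigma)."""
    n = Sigma.shape[0]
    L = np.linalg.cholesky(Sigma)          # Sigma = L L^T
    Y = L @ rng.standard_normal((n, m))    # Y ~ N_{n,m}(0; Sigma, I_m)
    return Y @ Y.T

rng = np.random.default_rng(2)
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
m = 5
S = sample_wishart(m, Sigma, rng)
log_density_at_S = wishart_logpdf(S, m, Sigma)
```

A Cholesky factor is used here instead of the symmetric square root; either gives columns with covariance $\Sigma$, so the distribution of $Y Y^T$ is the same.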

Noncentral Wishart distribution $W_n(m, \Sigma; \Omega)$ [@Muirhead1982].

(An unnamed family of distributions)

Symmetric manifold

Symmetric standard Gaussian distribution $N_{n,n}(0; I_n)$ on symmetric matrix manifold $\mathcal{S}(n)$.

Symmetric Gaussian distribution $N_{n,n}(M; \Sigma)$

Spheres

von Mises distribution on the 1-sphere (i.e. circle) is a closed-form approximation to the wrapped Gaussian distribution: $f(x; a, b) \propto \exp(b \cos(x-a))$, where $x, a \in [0, 2\pi), b \ge 0$. von Mises–Fisher distribution (vMF) [@Fisher1953] generalizes the von Mises distribution to n-spheres, and is the most commonly used distribution in directional/circular/spherical statistics: $f(x; \mu, \kappa) \propto \exp(\kappa \mu^T x)$, where $x$ and $\mu$ are unit vectors and $\kappa \ge 0$.
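The circular density $\exp(b \cos(x-a))$ above is given only up to a constant. A minimal sketch of the normalized version, assuming NumPy and the standard normalizing constant $2 \pi I_0(b)$ (with $I_0$ the modified Bessel function of the first kind of order 0, which the text above does not state):

```python
import numpy as np

def von_mises_pdf(x, a, b):
    """Density exp(b cos(x - a)) / (2 pi I_0(b)) on the circle."""
    return np.exp(b * np.cos(x - a)) / (2.0 * np.pi * np.i0(b))

# Evaluate on a uniform periodic grid over [0, 2 pi).
grid = np.linspace(0.0, 2.0 * np.pi, 4000, endpoint=False)
density = von_mises_pdf(grid, a=1.0, b=2.5)

# Riemann sum over one period; it should be close to 1.
total_mass = density.mean() * 2.0 * np.pi

# With b = 0 the density reduces to the uniform distribution on the circle.
uniform_density = von_mises_pdf(grid, a=0.3, b=0.0)
```

The parameter $a$ acts as the mode and $b$ as a concentration: larger $b$ puts more mass near $a$.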

Bingham [@Bingham1974]

Kent [@Kent1982]

Angular central Gaussian (ACG)

Bivariate von Mises on the 2-torus [@Mardia1975]

Stiefel manifold

Matrix Langevin $L_{k,n}(F)$, aka matrix von Mises-Fisher (matrix vMF), is an exponential-linear family. von Mises-Fisher (vMF) for n-spheres and matrix vMF for Stiefel manifolds are in the larger family of maximum entropy distributions with moment constraints [@Pennec2006].

Matrix Bingham $B_{k,n}(B)$ is an exponential-quadratic family.

Matrix angular central Gaussian $\text{MACG}(\Sigma)$.
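To make "exponential-linear" and "exponential-quadratic" concrete, here is a minimal sketch of unnormalized log-densities for the three families above. It assumes the commonly used forms $\exp(\text{tr}(F^T X))$, $\exp(\text{tr}(X^T B X))$, and $|X^T \Sigma^{-1} X|^{-n/2}$ with $X$ an n-by-k matrix satisfying $X^T X = I_k$; these formulas, the shape convention, and all function names are assumptions not spelled out in the text above, and normalizing constants are omitted.

```python
import numpy as np

def random_stiefel_point(n, k, rng):
    """Some n-by-k matrix with orthonormal columns, via reduced QR."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    return Q

def log_matrix_langevin_unnorm(X, F):
    """Exponential-linear: tr(F^T X), with F an n-by-k parameter matrix."""
    return np.trace(F.T @ X)

def log_matrix_bingham_unnorm(X, B):
    """Exponential-quadratic: tr(X^T B X), with B an n-by-n symmetric matrix."""
    return np.trace(X.T @ B @ X)

def log_macg_unnorm(X, Sigma):
    """MACG: -(n/2) log |X^T Sigma^{-1} X|, with Sigma n-by-n positive definite."""
    n = X.shape[0]
    _, logdet = np.linalg.slogdet(X.T @ np.linalg.solve(Sigma, X))
    return -0.5 * n * logdet

rng = np.random.default_rng(4)
n, k = 5, 2
X = random_stiefel_point(n, k, rng)
F = rng.standard_normal((n, k))
C = rng.standard_normal((n, n))
B = 0.5 * (C + C.T)              # arbitrary symmetric parameter
Sigma = C @ C.T + n * np.eye(n)  # arbitrary positive-definite parameter

values = (log_matrix_langevin_unnorm(X, F),
          log_matrix_bingham_unnorm(X, B),
          log_macg_unnorm(X, Sigma))
```

Under these assumed forms, the Bingham and MACG values are unchanged when $X$ is multiplied on the right by a k-by-k orthogonal matrix, while the Langevin value generally is not.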

(An unnamed general family)

Sampling: geodesic Monte Carlo [@Byrne2013] is applicable.

Grassmann manifold

Matrix Langevin $L^{(P)}_{k,n}(B)$, uses trace.

Matrix angular central Gaussian $\text{MACG}(\Sigma)$, uses (order-k) determinant.

Orthogonal projective Gaussian $OPG(B)$

(An unnamed family of distributions)

(An unnamed general family)

Misc

Riemannian symmetric spaces: Gaussian-like distribution $f(p) \propto \exp[- d^2(p, \mu) / (2 \sigma^2)]$ [@Said2018], where $d$ is the Riemannian distance. Examples: positive definite matrices [@Said2017] with a certain structure, e.g. complex, Toeplitz, or block-Toeplitz; Grassmannian.

Heat kernels:

1-sphere: wrapped Gaussian.

Wrapped Gaussian distribution on the 1-sphere is the heat kernel of the circle, which "wraps" the Gaussian distribution around the unit circle: $f_{WN} (\theta; \mu, \sigma) = \sum_{k \in \mathbb{Z}} \exp[-(\theta-\mu+2\pi k)^2/(2\sigma^2)] / \sqrt{2 \pi \sigma^2}$. Heat kernels of complete Riemannian manifolds do not have closed forms in general; see Geometric Diffusion.
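The infinite sum in $f_{WN}$ can be approximated by truncating it to finitely many wraps. The sketch below assumes NumPy; the truncation parameter `num_wraps` is an illustrative choice, adequate when $\sigma$ is small relative to $2 \pi \cdot$ `num_wraps`.

```python
import numpy as np

def wrapped_gaussian_pdf(theta, mu, sigma, num_wraps=20):
    """Truncated wrapped Gaussian: sum over k = -num_wraps, ..., num_wraps."""
    theta = np.asarray(theta, dtype=float)
    k = np.arange(-num_wraps, num_wraps + 1)
    # One shifted copy of the Gaussian per winding number k (broadcast on last axis).
    shifted = theta[..., None] - mu + 2.0 * np.pi * k
    terms = np.exp(-shifted ** 2 / (2.0 * sigma ** 2))
    return terms.sum(axis=-1) / np.sqrt(2.0 * np.pi * sigma ** 2)

# Evaluate on a uniform periodic grid over [0, 2 pi).
grid = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
density = wrapped_gaussian_pdf(grid, mu=1.0, sigma=1.0)

# Riemann sum over one period; it should be close to 1.
total_mass = density.mean() * 2.0 * np.pi
```

For small $\sigma$ only the term nearest to $\mu$ matters and the density is close to an ordinary Gaussian; for large $\sigma$ the copies overlap and the density flattens toward the uniform distribution on the circle.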

🏷 Category=Statistics