This article looks at some parametric families of probability distributions on matrix manifolds. The default reference is [@Chikuse2003]; other main references include [@Dryden2016; @Bhattacharya2012; @Patrangenaru2015; @Srivastava2016]. In this article, |A| denotes the determinant of a square matrix A; $N_n$ denotes the distribution of an n-dimensional Gaussian random vector; (dX) denotes the Lebesgue measures for rectangular, symmetric, or skew-symmetric matrices, or the unnormalized invariant measure on the Stiefel or Grassmann manifold; [dX] denotes the normalized invariant measure on the Stiefel or Grassmann manifold. Following [@Chikuse2003], all PDFs on the Stiefel or Grassmann manifold are defined w.r.t. the normalized measures.

Left and right orthogonal invariance: a PDF $p$ on $M_{m,n}$ is right orthogonal invariant if $p(M) = p(M Q)$, $\forall Q \in O(n)$, and left orthogonal invariant if $p(M) = p(Q M)$, $\forall Q \in O(m)$. See orthogonal-invariant function.

**Matrix-variate standard Gaussian** distribution $N_{n,k}(0; I_n, I_k)$
on the n-by-k matrix manifold $M_{n,k}$ [Sec 1.5]
is the nk-dimensional standard Gaussian vector stacked into a matrix:
let $\text{vec}(Z) \sim N_{nk}(0, I_{nk})$, then $Z \sim N_{n,k}(0; I_n, I_k)$.
PDF: (1) $\phi(Z) = z^{-1} \exp(-\text{tr}(Z^T Z) / 2)$,
where normalizing constant $z = (2 \pi)^{kn/2}$;
(2) because the Frobenius norm is related to the trace via $\|Z\|_F^2 = \text{tr}(Z^T Z)$,
we have $\phi(Z) = z^{-1} \exp(-\|Z\|_F^2 / 2)$.
Its characteristic function is also log-quadratic: $\Phi_Z(T) = \exp(-\text{tr}(T^T T)/2)$.
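A minimal sketch of sampling and evaluating this density in NumPy (the function name `std_matrix_gaussian_pdf` is mine, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 3

# Sample Z ~ N_{n,k}(0; I_n, I_k): an n-by-k matrix of i.i.d. N(0, 1) entries,
# i.e. vec(Z) ~ N_{nk}(0, I_{nk}) stacked into a matrix.
Z = rng.standard_normal((n, k))

def std_matrix_gaussian_pdf(Z):
    """phi(Z) = (2 pi)^{-kn/2} exp(-||Z||_F^2 / 2), using ||Z||_F^2 = tr(Z^T Z)."""
    n, k = Z.shape
    return (2 * np.pi) ** (-n * k / 2) * np.exp(-np.linalg.norm(Z, "fro") ** 2 / 2)
```

The trace form and the Frobenius-norm form of the exponent are numerically identical.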

**Matrix-variate Gaussian** distribution $N_{n,k}(M; \Sigma_n, \Sigma_k)$
on the n-by-k matrix manifold,
where $M \in M_{n,k}$, $\Sigma_n \in \mathcal{S}_+(n)$, and $\Sigma_k \in \mathcal{S}_+(k)$,
is the distribution of a transformed matrix-variate standard Gaussian:
$Y = \Sigma_n^{1/2} Z \Sigma_k^{1/2} + M$, where $Z \sim N_{n,k}(0; I_n, I_k)$.
This can be written in a vectorized form as:
$\text{vec}(Y) = (\Sigma_k^{1/2} \otimes \Sigma_n^{1/2}) \text{vec}(Z) + \text{vec}(M)$,
where $\otimes$ is the Kronecker product.
Note that this is a special form of the nk-dimensional Gaussian vector $N_{nk}(\mu, \Sigma)$,
where covariance matrix $\Sigma = \Sigma_k \otimes \Sigma_n \in \mathcal{S}_+(nk)$.
PDF: (1) relation with the matrix-variate standard Gaussian PDF:
$p(Y; M, \Sigma_n, \Sigma_k) = |\Sigma_n|^{-k/2} |\Sigma_k|^{-n/2}
\phi(\Sigma_n^{-1/2} (Y - M) \Sigma_k^{-1/2})$;
(2) explicit form: $p(Y; M, \Sigma_n, \Sigma_k) = z^{-1} \exp\{-\text{tr}[\Sigma_n^{-1} (Y - M)
\Sigma_k^{-1} (Y - M)^T] / 2\}$,
where normalizing constant $z = (2 \pi)^{kn/2} |\Sigma_n|^{k/2} |\Sigma_k|^{n/2}$.
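The matrix form $Y = \Sigma_n^{1/2} Z \Sigma_k^{1/2} + M$ and the vectorized Kronecker form can be checked against each other numerically; a sketch with arbitrary SPD covariances (helper `sqrtm_spd` is my own):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 3, 2
M = rng.standard_normal((n, k))
# Arbitrary SPD row/column covariances for the sketch.
A = rng.standard_normal((n, n)); Sigma_n = A @ A.T + n * np.eye(n)
B = rng.standard_normal((k, k)); Sigma_k = B @ B.T + k * np.eye(k)

def sqrtm_spd(S):
    """Symmetric square root of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.sqrt(w)) @ V.T

Rn, Rk = sqrtm_spd(Sigma_n), sqrtm_spd(Sigma_k)

# Matrix form and vectorized form of the same transformation:
# vec(A Z B) = (B^T kron A) vec(Z), with column-major vec; here B = Rk is symmetric.
Z = rng.standard_normal((n, k))
Y = Rn @ Z @ Rk + M
vecY = np.kron(Rk, Rn) @ Z.flatten(order="F") + M.flatten(order="F")
```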

**Symmetric standard Gaussian** distribution $N_{n,n}(0; I_n)$
on symmetric matrix manifold $\mathcal{S}(n)$
is the matrix-variate standard Gaussian restricted to the symmetric matrix manifold:
let $M \sim N_{n,n}(0; I_n, I_n)$, then $M|_{\mathcal{S}(n)} \sim N_{n,n}(0; I_n)$.
PDF: (1) $\phi(S) = z^{-1} \exp(-\text{tr}(S^2)/2)$,
where normalizing constant $z = 2^{n/2} \pi^{n(n+1)/4}$;
(2) norm version, $\phi(S) = z^{-1} \exp(-\|S\|_F^2 / 2)$.
Its characteristic function is also log-quadratic: $\Phi_S(T) = \exp(-\text{tr}(T^2)/2)$.

**Symmetric Gaussian** distribution $N_{n,n}(M; \Sigma)$
on symmetric matrix manifold $\mathcal{S}(n)$,
where $M \in \mathcal{S}(n)$ and $\Sigma \in \mathcal{S}_+(n)$,
is the distribution of a transformed symmetric standard Gaussian:
$Y = \Sigma^{1/2} Z \Sigma^{1/2} + M$, where $Z \sim N_{n,n}(0; I_n)$.
PDF: (1) relation with the symmetric standard Gaussian PDF:
$p(Y; M, \Sigma) = |\Sigma|^{-(n+1)/2} \phi(\Sigma^{-1/2} (Y - M) \Sigma^{-1/2})$;
(2) explicit form: $p(Y; M, \Sigma) = z^{-1} \exp\{-\text{tr}[\Sigma^{-1} (Y - M)
\Sigma^{-1} (Y - M)] / 2\}$,
where normalizing constant $z = 2^{n/2} \pi^{n(n+1)/4} |\Sigma|^{(n+1)/2}$.
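One convenient sampler, consistent with the PDF above (a sketch of mine; [@Chikuse2003] defines the distribution via restriction): since $\text{tr}(S^2) = \sum_i s_{ii}^2 + 2 \sum_{i<j} s_{ij}^2$, the diagonal entries are $N(0, 1)$ and the off-diagonal entries are $N(0, 1/2)$, which is exactly what symmetrizing a matrix-variate standard Gaussian produces:

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 3, 200_000

# S = (Z + Z^T)/2 for standard matrix Gaussian Z gives diagonal entries N(0, 1)
# and off-diagonal entries N(0, 1/2), matching phi(S) ∝ exp(-tr(S^2)/2).
Z = rng.standard_normal((N, n, n))
S = (Z + np.swapaxes(Z, 1, 2)) / 2

var_diag = S[:, 0, 0].var()  # should be close to 1
var_off = S[:, 0, 1].var()   # should be close to 1/2
```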

**Wishart** distribution $W_n(m, \Sigma)$ on positive-definite manifold $\mathcal{S}_+(n)$,
where the degrees of freedom $m \in \{n, n+1, \cdots\}$
and covariance matrix $\Sigma \in \mathcal{S}_+(n)$,
is the sampling distribution of the (unscaled) covariance matrix
of a zero-mean Gaussian random vector [@Wishart1928]:
let $y_i \sim N_n(0, \Sigma)$ be i.i.d., and collect the random sample $Y = \{y_i\}_{i=1}^m$,
then $Y Y^T = \sum_{i=1}^m y_i y_i^T \sim W_n(m, \Sigma)$.
Notice that $Y \sim N_{n,m}(0; \Sigma, I_m)$ and $Y^T \sim N_{m,n}(0; I_m, \Sigma)$
are special forms of matrix-variate Gaussians.
PDF: (1) $p(S; m, \Sigma) = z^{-1} |S|^{(m-n-1)/2} \exp(-\text{tr}(\Sigma^{-1} S)/2)$,
where normalizing constant $z = 2^{mn/2} \Gamma_n(m/2) |\Sigma|^{m/2}$,
and $\Gamma_n(a)$ is the multivariate gamma function;
(2) norm version,
$p(S; m, \Sigma) = z^{-1} |S|^{(m-n-1)/2} \exp(-\|\Sigma^{-1/2} S^{1/2}\|_F^2 / 2)$.
Its characteristic function is: $\Phi_S(T) = |I_n - 2i \Sigma T|^{-m/2}$.
The Wishart distribution converges to the symmetric standard Gaussian
in a specific limit [@James1976]:
$\lim_{u \to \infty} p(u I_n + \sqrt{u} S; n+1+2u, I_n/2) = z^{-1} \exp(-\text{tr}(S^2)/2)$.
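The $Y Y^T$ construction and the explicit PDF can be cross-checked against `scipy.stats.wishart` (whose `df` is $m$ and `scale` is $\Sigma$); the multivariate gamma function is `scipy.special.multigammaln`:

```python
import numpy as np
from scipy.special import multigammaln
from scipy.stats import wishart

rng = np.random.default_rng(3)
n, m = 3, 10
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)

# W = Y Y^T, with the columns of Y i.i.d. N_n(0, Sigma), i.e. Y ~ N_{n,m}(0; Sigma, I_m).
Y = np.linalg.cholesky(Sigma) @ rng.standard_normal((n, m))
W = Y @ Y.T

# Log-density from the explicit PDF, with z = 2^{mn/2} Gamma_n(m/2) |Sigma|^{m/2}.
_, logdetW = np.linalg.slogdet(W)
_, logdetS = np.linalg.slogdet(Sigma)
log_z = (m * n / 2) * np.log(2) + multigammaln(m / 2, n) + (m / 2) * logdetS
logp = -log_z + (m - n - 1) / 2 * logdetW - np.trace(np.linalg.solve(Sigma, W)) / 2
```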

**Noncentral Wishart** distribution $W_n(m, \Sigma; \Omega)$
on positive-definite manifold $\mathcal{S}_+(n)$,
where non-centrality matrix $\Omega \in \mathcal{S}_{\ge 0}(n)$,
is the distribution of the sum of outer products of m independent Gaussian random vectors with a common covariance [@Muirhead1982]:
let $y_i \sim N_n(m_i, \Sigma)$ be jointly independent, and $Y = \{y_i\}_{i=1}^m$,
then $Y Y^T = \sum_{i=1}^m y_i y_i^T \sim W_n(m, \Sigma; \Sigma^{-1} M M^T)$,
where $M = \{m_i\}_{i=1}^m$.
Notice that $Y \sim N_{n,m}(M; \Sigma, I_m)$ and $Y^T \sim N_{m,n}(M^T; I_m, \Sigma)$
are special forms of matrix-variate Gaussians.
PDF: $p(S; m, \Sigma, \Omega) = z^{-1} {}_0 F_{1}(m/2; \Omega \Sigma^{-1} S / 4)\ p(S; m, \Sigma)$,
where normalizing constant $z = \exp(\text{tr}(\Omega)/2)$,
and ${}_0 F_{1}(m/2; \Omega \Sigma^{-1} S / 4)$ is a hypergeometric function with a matrix argument,
which equals $\int_{O(m)} \exp(\text{tr}((\Omega \Sigma^{-1} S)^{1/2} H)) [d H]$.
Note that the noncentral Wishart generalizes the Wishart distribution:
$W_n(m, \Sigma; 0) = W_n(m, \Sigma)$.

An unnamed family of distributions on the positive-definite manifold is given by [@Hayakawa1966]; see [Thm 2.4.2] below. Its exact computation is intractable. In the simplest case it reduces to a Wishart distribution: $p(\cdot; I_n) = W_k(n, I_k)$.

**von Mises** distribution on the 1-sphere (i.e. the circle) [@von-Mises1918]
is a closed-form approximation to the wrapped Gaussian distribution:
$p(x; m, \kappa) = z^{-1} \exp(\kappa \cos(x - m))$,
where $x \in [0, 2\pi)$, with mode (and intrinsic mean) $m \in [0, 2\pi)$,
concentration parameter $\kappa \ge 0$, and $z = 2 \pi I_0(\kappa)$.
Here $I_v$ is the modified Bessel function of the first kind of order v.
It is uniform when $\kappa = 0$, and deterministic when $\kappa \to \infty$.
**von Mises–Fisher** (vMF) distribution [@Fisher1953],
aka **Langevin** distribution $L_n(c)$ (attributing to [@Langevin1905]),
is the exponential-linear family on the n-spheres:
$p(x; c) = z^{-1} \exp(c^T x)$, where $x \in \mathbb{S}^{n-1}$, $c \in \mathbb{R}^n$,
and $z = (2 \pi) (2 \pi / \|c\|)^{n/2 - 1} I_{n/2-1}(\|c\|)$.
The vMF distribution generalizes the von Mises distribution to n-spheres,
and is the most commonly used distribution in directional/circular/spherical statistics.
It has a mode (and intrinsic mean) $\bar{c} = c / \|c\|$,
and it is rotationally symmetric around the mode.
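A sketch of both densities (function names are mine), checking that the vMF normalizing constant reduces to the von Mises one for $n = 2$, where $z = (2\pi) \kappa^0 I_0(\kappa)$:

```python
import numpy as np
from scipy.special import iv  # modified Bessel function of the first kind

def vmf_pdf(x, c):
    """vMF density on S^{n-1}: p(x; c) = exp(c @ x) / z,
    with z = (2 pi)^{n/2} kappa^{1 - n/2} I_{n/2-1}(kappa), kappa = ||c||."""
    n = len(c)
    kappa = np.linalg.norm(c)
    z = (2 * np.pi) ** (n / 2) * kappa ** (1 - n / 2) * iv(n / 2 - 1, kappa)
    return np.exp(c @ x) / z

def von_mises_pdf(x, m, kappa):
    """von Mises density on the circle: exp(kappa cos(x - m)) / (2 pi I_0(kappa))."""
    return np.exp(kappa * np.cos(x - m)) / (2 * np.pi * iv(0, kappa))
```

For a point $x = (\cos\theta, \sin\theta)$ and parameter $c = \kappa(\cos m, \sin m)$, the two agree because $c^T x = \kappa \cos(\theta - m)$.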

**Bingham** distribution $B_n(B)$ [@Bingham1974]
is an exponential-quadratic family on the n-sphere:
$p(x; B) = z^{-1} \exp(x^T B x)$, where $B \in \mathcal{S}(n)$ and
$z = {}_{1}F_{1}(1/2; n/2; \Lambda(B))$.
The matrix parameter is determined up to scalar shifts:
$B_n(B) = B_n(B + c I_n)$, $\forall c \in \mathbb{R}$.
The Bingham distribution is antipodally symmetric: $p(x) = p(-x)$.
Let $B = V \Lambda V^T$ be an EVD with eigenvalues $\lambda \in \mathbb{R}^n_{\downarrow}$ in decreasing order;
then the Bingham distribution has modes (and intrinsic means) $\pm v_1$.

**Angular central Gaussian** $\text{ACG}(\Sigma)$ (see e.g. [@Watson1983; @Tyler1987])
is a polynomial-quadratic family on the n-sphere, defined as:
$p(x; \Sigma) = z^{-1} (x^T \Sigma^{-1} x)^{-n/2}$,
where $\Sigma \in \mathcal{S}_+(n)$ and $z = \sqrt{\det(\Sigma)}$.
Similar to the Bingham distribution, the ACG distribution:
(1) is antipodally symmetric;
(2) let $\Sigma = V \Lambda V^T$ where $\lambda \in \mathbb{R}^n_{+\downarrow}$,
then it has modes (and intrinsic means) $\pm v_1$.
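The ACG distribution has a simple exact sampler: it is the law of a normalized Gaussian vector $y/\|y\|$ with $y \sim N_n(0, \Sigma)$ [@Tyler1987]. A sketch (function names mine), also checking antipodal symmetry and the scale invariance $p(x; c\Sigma) = p(x; \Sigma)$:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)

def acg_pdf(x, Sigma):
    """ACG density w.r.t. the uniform measure on S^{n-1}:
    p(x; Sigma) = det(Sigma)^{-1/2} (x^T Sigma^{-1} x)^{-n/2}."""
    n = len(x)
    q = x @ np.linalg.solve(Sigma, x)
    return np.linalg.det(Sigma) ** -0.5 * q ** (-n / 2)

def acg_sample(Sigma, rng):
    """ACG(Sigma) is the law of the normalized Gaussian y/||y||, y ~ N(0, Sigma)."""
    y = np.linalg.cholesky(Sigma) @ rng.standard_normal(len(Sigma))
    return y / np.linalg.norm(y)

x = acg_sample(Sigma, rng)
```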

Miscellaneous:
**Langevin-Bingham** [@Mardia1975] combines the Langevin and the Bingham distributions,
and is the general exponential-quadratic family on the n-sphere,
with PDF proportional to $\exp(x^T B x + c^T x)$.
**Bivariate von Mises** is the exponential-bilinear family on the 2-torus
[@Mardia1975], see also [@Mardia1999, Sec 3.7.1]:
$p(x, y; A, b, c) \propto \exp(x^T A y + b^T x + c^T y)$,
where $x, y \in \mathbb{S}^1$, $A \in M_2$, and $b, c \in \mathbb{R}^2$.
Despite the name, its marginal distributions are not von Mises in general:
they are von Mises if and only if x and y are either independent ($A = 0$) or uniform ($b_2 = c_2 = 0$).

**Uniform** distribution on the Stiefel manifold w.r.t. the invariant measure
is the normalized invariant measure, which is possible because the invariant measure is finite:
$[d X] = (d X) / v(k, n)$, where $v(k, n) = 2^k \pi^{kn/2} \Gamma_k^{-1}(n/2)$.
Note that $v(k, n) := \int_{V_{k,n}} (d X)$
should not be called the volume or mass of the Stiefel manifold
because the invariant differential n-form is defined differently from the Riemannian volume form.
PDF: $p(X) = 1$.

**Matrix Langevin** $L_{k,n}(F)$ or **matrix von Mises-Fisher** (matrix vMF), where $F \in M_{n,k}$,
is an exponential-linear family on the Stiefel manifold $V_{k,n}$ [@Downs1972].
PDF: $p(X; F) = z^{-1} \exp(\text{tr}(F^T X))$,
where normalizing constant $z = {}_{0} F_{1} (n/2; F^T F / 4)$.
It is analogous to the density $\exp(c x)$ on the real line,
but lives on a compact domain, so the coefficients can be non-negative.
The von Mises-Fisher (vMF) for n-spheres and matrix vMF for Stiefel manifolds
are in the larger family of maximum entropy distributions with moment constraints [@Pennec2006].
A matrix-variate Gaussian restricts to matrix Langevin on the Stiefel:
if $X \sim N_{n,k}(M; I_n, \Sigma)$, then $X|_{V_{k,n}} \sim L_{k,n}(M \Sigma^{-1})$.
Matrix Langevin on the Stiefel contains the uniform distribution: $L_{k,n}(0) = U(V_{k,n})$.

**Matrix Bingham** $B_{k,n}(B)$, where $B \in \mathcal{S}(n)$,
is an exponential-quadratic family on the Stiefel manifold $V_{k,n}$.
PDF: $p(X; B) = z^{-1} \exp(\text{tr}(X^T B X))$,
where normalizing constant $z = {}_{1} F_{1} (k/2; n/2; B/2)$.
It is analogous to $\exp(c x^2)$, but on a compact domain, where the coefficients can be non-negative.
It is right orthogonal invariant, and therefore explicitly defines a distribution
on the Grassmann manifold (see matrix Langevin on the Grassmann).
Matrix Bingham contains the uniform distribution: $B_{k,n}(I_n) = U(V_{k,n})$.

Consider the orthonormalization or projection $\pi: M^*_{n,k} \to V_{k,n}$
defined by $\pi(M) = U V^T$, where $M = U \Sigma V^T$ is a thin SVD.
Note that although the thin SVD is not unique, the mapping $\pi$ is uniquely defined.
*Theorem* (Orthonormalization induced distribution) [Thm 2.4.1]:
Let a random n-by-k matrix $M \sim \tilde{f}$, denote $\pi(M) \sim f$ and $M^T M \sim g$,
then $f(X) = c(n,k) \int_{\mathcal{S}_+(k)} \tilde{f}(X S^{1/2}) |S|^{(n-k-1)/2} (d S)$,
and $g(S) = c(n,k) |S|^{(n-k-1)/2} \int_{V_{k,n}} \tilde{f}(X S^{1/2}) [d X]$,
with coefficient $c(n,k) = \pi^{kn/2} \Gamma_k^{-1}(n/2)$.
*Theorem* (Orthonormalization of left orthogonal invariant Gaussian) [Thm 1.5.5, 2.4.3]:
Let $M \sim N_{n,k}(0; I_n, \Sigma)$,
then $\pi(M) \sim U(V_{k,n})$, $M^T M \sim W_k(n, \Sigma)$, and $\pi(M) \perp M^T M$.
*Theorem* (Orthonormalization of right orthogonal invariant Gaussian) [Thm 2.4.2]:
Let $M \sim N_{n,k}(0; \Sigma, I_k)$,
then $\pi(M) \sim \text{MACG}(\Sigma)$, and
$M^T M \sim p(S; \Sigma) = z^{-1} |S|^{(n-k-1)/2} {}_{0} F_{0}^{(n)}(-\Sigma^{-1}/2, S)$,
where normalizing constant $z = 2^{nk/2} \Gamma_{k}(n/2) |\Sigma|^{k/2}$,
and ${}_{0} F_{0}^{(n)}(S, T) := \int_{O(n)} \exp(\text{tr}(S H T H^T)) [d H]$,
with $S, T \in \mathcal{S}(n)$ [Eq A.6.6].
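The orthonormalization map $\pi$ is easy to compute; a sketch showing the SVD form and the equivalent polar factor $M (M^T M)^{-1/2}$, whose agreement illustrates why $\pi$ is well defined despite SVD non-uniqueness:

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 5, 3
M = rng.standard_normal((n, k))  # full column rank with probability 1

# pi(M) = U V^T from a thin SVD M = U S V^T.
U, _, Vt = np.linalg.svd(M, full_matrices=False)
X = U @ Vt

# Equivalently, the polar factor M (M^T M)^{-1/2}.
w, Q = np.linalg.eigh(M.T @ M)
X_polar = M @ Q @ np.diag(w ** -0.5) @ Q.T
```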

**Matrix angular central Gaussian** $\text{MACG}(\Sigma)$ on the Stiefel manifold $V_{k,n}$,
where $\Sigma \in \mathcal{S}_+(n)$, is the distribution induced by orthonormalization
of a right orthogonal invariant Gaussian [@Tyler1987]:
if $M \sim N_{n,k}(0; \Sigma, I_k)$, then $\pi(M) \sim \text{MACG}(\Sigma)$.
Equivalently, if a random n-by-k matrix has a PDF of the form
$p(M; \Sigma) \propto g(M^T \Sigma^{-1} M)$ and is right orthogonal invariant
(i.e., $g(A) = g(\Lambda(A))$ for all $A \ge 0$),
then $\pi(M) \sim \text{MACG}(\Sigma)$.
Note that for $N_{n,k}(0; \Sigma, I_k)$, we have $g(S) = \exp(-\text{tr}(S)/2)$.
More generally, g() may build on matrix-variate functions (trace, determinant, spectral radius),
real functions, and (optionally) matrix functions (matrix polynomial / exponential / logarithm).
PDF: $p(X; \Sigma) = z^{-1} |X^T \Sigma^{-1} X|^{-n/2}$,
where normalizing constant $z = |\Sigma|^{k/2}$.
The parameter is determined up to scaling:
$\text{MACG}(\Sigma) = \text{MACG}(c \Sigma)$, $\forall c > 0$.
Evaluating the PDF with a diagonal $\Sigma$:
matrix multiplication takes $n k^2$ FLOPs; determinant takes $\mathcal{O}(k^3)$ FLOPs.
It is (also) right orthogonal invariant, and therefore explicitly defines a distribution
on the Grassmann manifold (see MACG on the Grassmann).
MACG contains the uniform distribution: $\text{MACG}(I_n) = U(V_{k,n})$.
If $M \sim N_{n,k}(0; \Sigma, I_k)$, then $\pi(B M) \sim \text{MACG}(B \Sigma B^T)$;
if $B^T B = \Sigma^{-1}$, then $B \Sigma B^T = I_n$ and therefore $\pi(B M) \sim U(V_{k,n})$.
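Sampling MACG follows directly from the definition: draw $M \sim N_{n,k}(0; \Sigma, I_k)$ (columns i.i.d. $N_n(0, \Sigma)$) and orthonormalize. A sketch (function names mine):

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 4, 2
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)

def orthonormalize(M):
    """pi(M) = U V^T from a thin SVD."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

def macg_sample(Sigma, k, rng):
    """X ~ MACG(Sigma) on V_{k,n}: orthonormalize M ~ N_{n,k}(0; Sigma, I_k)."""
    n = len(Sigma)
    M = np.linalg.cholesky(Sigma) @ rng.standard_normal((n, k))
    return orthonormalize(M)

X = macg_sample(Sigma, k, rng)
```

Taking $B = L^{-1}$ for the Cholesky factor $\Sigma = L L^T$ gives $B \Sigma B^T = I_n$, so $\pi(B M)$ is uniform, as noted above.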

An unnamed general family $\mathcal{F}^{(V)}(B; d)$ on the Stiefel manifold $V_{k,n}$, where $B \in \mathcal{S}(n)$ and $d: \mathbb{Z}_{+\downarrow}^\infty \to \mathbb{R}$, is defined by a PDF of the form [Eq 2.3.9]: $p(X; B; d) = z^{-1} g(X^T B X)$, where normalizing constant $z = g_0(B)$. Here, $g(S) = \sum_{l=0}^\infty (l!)^{-1} \sum_{\lambda \vdash l} d_{\lambda} C_{\lambda}(S)$, and $g_0(B) = \sum_{l=0}^\infty (l!)^{-1} \sum_{\lambda \vdash l} d_{\lambda} (k/2)_{\lambda} (n/2)_{\lambda}^{-1} C_{\lambda}(B)$. For the definitions of the normalized zonal polynomial $C_{\lambda}(S)$ and the generalized hypergeometric coefficient $(a)_{\lambda}$, see hypergeometric function with a matrix argument. Every distribution in this family is right orthogonal invariant, and therefore explicitly defines a distribution on the Grassmann manifold (see unnamed general family on the Grassmann). It can be generalized using invariant polynomials with multiple matrix arguments [Eq 2.3.17]. Special cases have a simpler form [Eq 2.3.12-16], which include matrix Bingham and MACG.

Sampling: geodesic Monte Carlo [@Byrne2013] is applicable.

**Uniform** distribution on the Grassmann manifold w.r.t. the invariant measure
is the normalized invariant measure, which is possible because the invariant measure is finite:
$[d P] = (d P) / g(k, n)$, where $g(k, n) = \pi^{k(n-k)/2} \Gamma_k(k/2) \Gamma_k^{-1}(n/2)$.
Note that $g(k, n) := \int_{\mathcal{P}(k,n)} (d P)$
should not be called the volume or mass of the Grassmann manifold,
because the invariant differential n-form is defined differently from the Riemannian volume form,
and the Grassmann manifold is not orientable when n is odd.
PDF: $p(P) = 1$.
Distributions of the principal angles between two random subspaces:
[@Absil2006; derivation based on @Muirhead1982]
let $X, Y \sim U(G_{k,n})$, $n \ge 2 k$, and $\theta = \theta(X, Y) \in \mathbb{R}^k_{\uparrow}$,
(1) joint PDF (eq 3), $p(\theta) = z^{-1} \prod_{i=1}^k \sin^{n-2k}\theta_i
\prod_{i < j} (\cos^2 \theta_i - \cos^2 \theta_j)$,
where $z = (2 \pi^{k^2/2})^{-1}~\Gamma_k^2(k/2)~\Gamma_k((n-k)/2)~\Gamma_k^{-1}(n/2)$
and $\Gamma_k$ is the multivariate Gamma function.
(2) marginal PDF (Thm 1),
$p(\theta_k) = z^{-1} (\sin \theta_k)^{k(n-k)-1} ~{}_{2}
F_{1}((n-k-1)/2, 1/2; (n+1)/2; \sin^2\theta_k I_{k-1})$, where
$z = [k (n-k)]^{-1} ~\Gamma(1/2) ~\Gamma((n+1)/2)~\Gamma^{-1}((k+1)/2) ~\Gamma^{-1}((n-k+1)/2)$.
Note that the joint PDF is zero when $\theta_i = \theta_j$ and at $\theta_1 = 0$ when n > 2 k.
With a fixed k, the unnormalized $p(\theta_k)$ scales as $(\sin \theta_k)^{k(n-k)-1}$,
so as n grows the largest principal angle quickly concentrates near $\pi/2$.
It remains an open problem to find a closed-form expression for
the distribution of the Riemannian distance, $p(\|\theta\|)$.
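Principal angles are computed as the arccosines of the singular values of $X^T Y$; a sketch, using orthonormalized Gaussian matrices as uniform Stiefel samples (function names mine):

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 50, 3

def uniform_stiefel(n, k, rng):
    """Orthonormalizing a standard Gaussian matrix yields a uniform Stiefel sample."""
    U, _, Vt = np.linalg.svd(rng.standard_normal((n, k)), full_matrices=False)
    return U @ Vt

X, Y = uniform_stiefel(n, k, rng), uniform_stiefel(n, k, rng)

# Principal angles between span(X) and span(Y): arccos of the singular values
# of X^T Y; singular values are in decreasing order, so angles come out increasing.
s = np.linalg.svd(X.T @ Y, compute_uv=False)
theta = np.arccos(np.clip(s, -1.0, 1.0))
```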

Any PDF on the Stiefel manifold that is right orthogonal invariant explicitly defines a distribution on the Grassmann manifold: let $X \in V_{k,n}$ with $X \sim p(X)$ and $p(X) = p(X Q)$, $\forall Q \in O(k)$; then the projection $X X^T$ has the PDF given by the same expression. This applies to the matrix Langevin and matrix angular central Gaussian distributions.

**Matrix Langevin** $L^{(P)}_{k,n}(B)$, where $B \in \mathcal{S}(n)$,
is an exponential-linear family on the rank-k symmetric projection matrix manifold
$\mathcal{P}(k,n)$ [@Watson1991; @Chikuse and Watson, 1995].
PDF: $p(P; B) = z^{-1} \exp(\text{tr}(B P))$,
where normalizing constant $z = {}_{1} F_{1} (k/2; n/2; B/2)$.
It is analogous to the density $\exp(c x)$,
but on a compact domain, where the coefficients can be non-negative.
Its PDF is the same as matrix Bingham, with $P = X X^T$.
Matrix Bingham induces matrix Langevin on the Grassmann by submersion:
if $X \sim B_{k,n}(B)$, then $X X^T \sim L^{(P)}_{k,n}(B)$.
A symmetric Gaussian restricts to matrix Langevin on the Grassmann:
if $S \sim N_{n,n}(B; I_n)$, then $S|_{\mathcal{P}(k,n)} \sim L^{(P)}_{k,n}(B)$.
Matrix Langevin on the Grassmann contains the uniform distribution:
$L^{(P)}_{k,n}(0) = U(\mathcal{P}(k,n))$.

**Matrix angular central Gaussian** $\text{MACG}(\Sigma)$ on the Grassmann manifold
is the distribution induced by the orthonormalized projection
of a right orthogonal invariant Gaussian [Rmk 2.4.11]:
if $M \sim N_{n,k}(0; \Sigma, I_k)$, then $\pi(M) \pi(M)^T \sim \text{MACG}(\Sigma)$.
PDF: (1) Stiefel representation: $p(X X^T; \Sigma) = z^{-1} |X^T \Sigma^{-1} X|^{-n/2}$,
where normalizing constant $z = |\Sigma|^{k/2}$ (This is exactly the same as MACG for the Stiefel);
(2) projection matrix: $p(P; \Sigma) = z^{-1} |I_n - (I_n - \Sigma^{-1}) P|^{-n/2}$.
Note that the two forms are equivalent: by Sylvester's determinant theorem,
$|I_n - (I_n - \Sigma^{-1}) X X^T| = |I_k - X^T (I_n - \Sigma^{-1}) X|$,
the right-hand side equals $|I_k - X^T X + X^T \Sigma^{-1} X|$, where the first two terms cancel since $X^T X = I_k$,
and therefore $|I_n - (I_n - \Sigma^{-1}) X X^T| = |X^T \Sigma^{-1} X|$.
MACG on the Stiefel induces MACG on the Grassmann by submersion:
if $X \sim \text{MACG}(\Sigma)$, then $X X^T \sim \text{MACG}(\Sigma)$.
All properties of MACG on Stiefel apply.
Its Fréchet mean is $\text{span}(V_k)$, the top-k eigenspace of $\Sigma$:
let $\Sigma = V \Lambda V^T$ be an EVD with eigenvalues in decreasing order, and denote $V_k = [v_1, \cdots, v_k]$.
Its uncertainty is compactly described by the eigenvalues of $\Sigma$:
the larger an eigenvalue $\lambda_i$ is,
the more important is the associated eigenspace $\text{span}(v_i)$.
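The Sylvester-determinant equivalence of the two PDF forms above can be checked numerically:

```python
import numpy as np

rng = np.random.default_rng(8)
n, k = 4, 2
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)
Sinv = np.linalg.inv(Sigma)

# A point on V_{k,n} and its projection matrix.
U, _, Vt = np.linalg.svd(rng.standard_normal((n, k)), full_matrices=False)
X = U @ Vt
P = X @ X.T

# |I_n - (I_n - Sigma^{-1}) P| = |X^T Sigma^{-1} X|, since X^T X = I_k.
lhs = np.linalg.det(np.eye(n) - (np.eye(n) - Sinv) @ P)
rhs = np.linalg.det(X.T @ Sinv @ X)
```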

**Orthogonal projective Gaussian** $\text{OPG}(B)$
on the rank-k symmetric projection matrix manifold $\mathcal{P}(k,n)$,
where $B \in \cup_{i=0}^k \mathcal{S}_{\ge 0}(i, n)$,
is the distribution induced by the orthonormalized projection
of a (right-transformed) matrix-variate Gaussian:
if $X \sim N_{n,k}(M; I_n, \Sigma)$, then $\pi(X) \pi(X)^T \sim \text{OPG}(M \Sigma^{-1} M^T)$.
Note that $\text{MACG}(\Sigma)$ can be induced in the same way
but with $X \sim N_{n,k}(0; \Sigma, I_k)$.
PDF: $p(P; B) = z^{-1} {}_{1} F_{1}(n/2; k/2; BP/2)$,
where normalizing constant $z = \exp(\text{tr}(B)/2)$.
Compared with matrix Langevin $L^{(P)}_{k,n}(B)$,
the PDF uses ${}_{1} F_{1}$ instead of ${}_{0} F_{0}$,
and therefore its exact computation is intractable.
Joint distributions can be defined on products of Grassmann manifolds, $\prod_{i=1}^m G_{k_i,n}$,
where the marginal and conditional distributions are OPG (see @Chikuse2003, p.47).
Its Fréchet mean is $\text{span}(B)$.
Its uncertainty is also compactly described by the (nonzero) eigenvalues
$(\lambda_1, \cdots, \lambda_k)$ of B.
OPG on the Grassmann contains the uniform distribution:
$\text{OPG}(0) = U(\mathcal{P}(k,n))$.

An unnamed general family $\mathcal{F}^{(P)}(B; d)$ on the rank-k symmetric projection matrix manifold $\mathcal{P}(k,n)$ is defined by a PDF of the form [Eq 2.3.20]: $p(P; B; d) = z^{-1} g(B P)$, where normalizing constant $z = g_0(B)$. See the unnamed general family $\mathcal{F}^{(V)}(B; d)$ on the Stiefel manifold for definitions of the notations. All properties of the unnamed general family on the Stiefel manifold apply.

Riemannian symmetric spaces: Gaussian-like distribution $p(x) \propto \exp[- d^2(x, \mu) / (2 \sigma^2)]$ [@Said2018]:

- positive definite matrices [@Said2017] with a certain structure, e.g. complex, Toeplitz, or block-Toeplitz;
- the Grassmann manifold.

Heat kernels:

- 1-sphere: wrapped Gaussian.

**Wrapped Gaussian distribution** on the 1-sphere is its heat kernel,
which "wraps" the Gaussian distribution around the unit circle: $f_{WN} (\theta; \mu, \sigma) =
\sum_{k \in \mathbb{Z}} \exp[-(\theta-\mu+2\pi k)^2/(2\sigma^2)] / \sqrt{2 \pi \sigma^2}$.
Heat kernels of complete Riemannian manifolds do not have closed forms in general,
see Geometric Diffusion.
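The wrapping sum converges very fast, so a small truncation suffices in practice; a sketch (function name mine) that also verifies the density integrates to 1 over one period:

```python
import numpy as np

def wrapped_gaussian_pdf(theta, mu, sigma, K=20):
    """Truncated wrapping sum; tail terms decay like exp(-(2 pi k)^2 / (2 sigma^2)),
    so a modest K suffices for moderate sigma."""
    k = np.arange(-K, K + 1)
    d = theta - mu + 2 * np.pi * k
    return np.exp(-d ** 2 / (2 * sigma ** 2)).sum() / np.sqrt(2 * np.pi * sigma ** 2)

# Sanity check: periodic rectangle-rule quadrature of the density over [0, 2 pi).
h = 2 * np.pi / 2000
grid = np.arange(2000) * h
total = sum(wrapped_gaussian_pdf(t, 1.0, 0.8) for t in grid) * h
```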

[@Yuchi2022, Prop 3.1] If a singular matrix-variate Gaussian distribution $N_{m,n}(0; \sigma^2 P_U, \sigma^2 P_V)$ has its row and column subspace parameters uniformly distributed, $P_U \sim U(G_{k,m})$ and $P_V \sim U(G_{k,n})$, then its left and right singular bases are uniformly distributed, $U \sim U(V_{k,m})$ and $V \sim U(V_{k,n})$, and its unordered singular values follow the repulsed normal distribution $RN(0, \sigma^2)$.