Matrix Manifold

Certain sets of matrices may be endowed a manifold structure, which can help us understand or solve problems. This article looks at some common matrix manifolds.

Matrix Manifolds

The generic, m-by-n matrix manifold $M_{m, n}(\mathbb{R})$ is the (m n)-dimensional real vector space of m-by-n real matrices, endowed with the inner product $\langle A, B \rangle = \text{tr}(A B^T)$. It is diffeomorphic to, and often identified with, the Euclidean (m n)-space: $M_{m, n}(\mathbb{R}) \cong \mathbb{R}^{mn}$. Probability models: matrix-variate Gaussian distribution $N_{n,k}(M; \Sigma_n, \Sigma_k)$. Full-rank manifold $M^∗_{m, n}(\mathbb{R})$ or $\mathbb{R}_∗^{mn}$ is the set of m-by-n full-rank real matrices, as an open subset of the m-by-n matrix manifold: $M^∗_{m, n}(\mathbb{R}) = \{X \in M_{m, n}(\mathbb{R}) : \text{rank}(X) = \min(m, n)\}$; or equivalently, $M^∗_{m, n}(\mathbb{R}) = M_{m, n}(\mathbb{R}) \setminus \bigcup_{k=0}^{\min(m,n)} \mathcal{M}(k, m \times n)$. It is a disconnected Riemannian (m n)-manifold. If m = n, its underlying set coincides with the general linear group of nonsingular order-n real matrices: $M^∗_n(\mathbb{R}) = \text{GL}(n, \mathbb{R})$.

Rank-k manifold $\mathcal{M}(k, m \times n)$ is the set of m-by-n real matrices of rank $k \in \{0, \dots, \min(m, n)\}$, as an embedded Riemannian $k (m + n - k)$-submanifold of the m-by-n matrix manifold: $\mathcal{M}(k, m \times n) = \{X \in M_{m, n}(\mathbb{R}) : \text{rank}(X) = k\}$; its Riemannian metric is induced from the Euclidean metric of the m-by-n matrix manifold. Alternatively, it can be defined as a quotient manifold, with the order-k general linear group acting on the full-rank manifold of (m+n)-by-k matrices: $\mathcal{M}(k, m \times n) = M^∗_{m+n,k} / \text{GL}_k$, with equivalence class $[(M, N)] = {(M Q, N Q^{-T}) : Q \in \text{GL}_k}$, where $M \in M^∗_{m,k}$ and $N \in M^∗_{n,k}$. From this definition, it is easy to see that it has dimension $k (m + n - k)$. Riemannian metrics can be induced on it via the Riemannian submersion theorem: let $\pi: M^∗_{m,k} \times M^∗_{n,k} \mapsto \mathcal{M}(k, m \times n)$, $\pi(M, N) = M N^T$, which is a surjective smooth submersion, consider the fiber bundle $(M^∗_{m+n,k}, \pi)$ with the (aforementioned) action of group $\text{GL}_k$ and an arbitrary Riemannian metric, there is a unique Riemannian metric on $\mathcal{M}(k, m \times n)$ such that π a Riemannian submersion (see e.g. [@Absil2014]).

Square matrix manifolds

Symmetric matrix manifold $\mathcal{S}(n)$ is the set of order-n real symmetric matrices, as the embedded Riemannian n(n+1)/2-submanifold of the order-n matrix manifold: $\mathcal{S}(n) = \{X \in M_n(\mathbb{R}) : X = X^T\}$. Probability models: symmetric Gaussian distribution $N_{n,n}(M; \Sigma)$. Positive-semidefinite manifold $\mathcal{S}_{\ge 0}(n)$ is the order-n positive semi-definite cone, as a regular domain of the symmetric matrix manifold: $\mathcal{S}_{\ge 0}(n) = \{X \in \mathcal{S}(n) : X \ge 0\}$. Positive-definite manifold $\mathcal{S}_+(n)$ is the set of order-n positive-definite matrices, as an open subset of the symmetric matrix manifold: $\mathcal{S}_+(n) = \{X \in \mathcal{S}(n) : X > 0\}$. Probability models: (noncentral) Wishart distribution $W_n(m, \Sigma; \Omega)$.

Rank-k positive-semidefinite manifold $\mathcal{S}_{\ge 0}(k, n)$ is the set of rank-k order-n positive-semidefinite matrices, as a Riemannian $k (2n - k + 1) / 2$-submanifold of the order-n matrix manifold: $\mathcal{S}_{\ge 0}(k, n) = \mathcal{S}_{\ge 0}(n) \cap \mathcal{M}(k, n \times n)$. It is related to the product manifold of the Stiefel manifold and the non-increasing positive Euclidean k-space by a surjective map: $f: V_{k, n} \times \mathbb{R}^k_{+\downarrow} \mapsto \mathcal{S}_{\ge 0}(k, n)$, $f(V, \lambda) = V \text{diag}(\lambda) V^T$, where $\mathbb{R}^k_{+\downarrow} = \{x \in \mathbb{R}^k_+ : \forall i < j, x_i \ge x_j\}$. Notice that the domain and codomain have the same manifold dimension. This also applies to $\mathcal{S}_+(n)$ with k = n.

Rank-k symmetric projection manifold $\mathcal{P}(k, n)$ is the set of rank-k symmetric projection matrices, as an embedded Riemannian k(n-k)-submanifold: $\mathcal{P}(k, n) = \{P \in \mathcal{S}(n) : P^2 = P, \text{rank}(P) = k\}$. Its smooth manifold structure is derived from Lie group theory: because O(n) acts smoothly on $\mathcal{S}(n)$ by the map $\phi: O(n) \times \mathcal{S}(n) \mapsto \mathcal{S}(n)$ defined by $\phi(Q, S) = Q S Q^T$, the orbit $O(n) \cdot P_k = \mathcal{P}(k,n)$ of the point $P_k = I_{n,k} I_{k,n} = (I_k, 0; 0, 0)$ is a properly embedded submanifold of $\mathcal{S}(n)$, and it is diffeomorphic to the quotient manifold $O(n) / (O(n - k) \times O(k))$; here $O(n - k) \times O(k)$ is used in place of $\{\text{diag}(R_1, R_2) : R_1 \in O(k), R_2 \in O(n-k)\}$, the isotropy group of $P_k$, because they are diffeomorphic. Its Riemannian metric is induced from the Euclidean metric of the order-n matrix manifold. With this metric, the surjective smooth submersion $\pi: O(n) \mapsto \mathcal{P}(k,n)$ defined by $\pi(Q) = \phi(Q, P_k)$ is a Riemannian submersion. Its tangent space can be written as: $T_P \mathcal{P(k,n)} = \{Q (0, K^T; K, 0) Q^T : K \in M_{n-k,k}\}$, where Q is an orthogonal representation of P such that $\pi(Q) = P$; note that the tangent vectors are symmetric matrices (i.e. in the tangent space of $\mathcal{S}(n)$). The horizontal tangent space at Q can be written as: $H^\pi_Q O(n) = \{Q (0, -K^T; K, 0) : K \in M_{n-k,k}\}$. The vertical tangent space at Q can be written as: $V^\pi_Q O(n) = \{Q (\Omega, 0; 0, \bar{\Omega}) : \Omega \in \Omega(k), \bar{\Omega} \in \Omega(n-k)\}$. Note that the right-hand side matrices are anti-symmetric (i.e. the outcome is in the tangent space of O(n)). The horizontal lift of tangent vector $Z \in T_P \mathcal{P}(k,n)$ to Q can be calculated as: $Z_Q = [Z, P] Q = (Z P - P Z) Q$. With the surjective smooth submersion $\pi: V_{k,n} \mapsto \mathcal{P}(k,n)$ defined by $\pi(X) = X X^T$, the horizontal lift of tangent vector Z to X can be calculated as $Z_X = Z X$. It is a Riemannian symmetric space, and every point P has a point reflection $\sigma^P: \mathcal{P}(k,n) \mapsto \mathcal{P}(k,n)$ defined by $\sigma^P(\tilde{P}) = S \tilde{P} S^T$, where $S = Q S_k Q^T$, $P = Q P_k Q^T$ is an eigen-decomposition, and $S_k = \text{diag}(I_k, -I_{n-k})$. Its Riemann (1,3)-curvature tensor at $P_k$ is $R(x, y) z = b$, where $x, y, z, b \in T_{P_k} \mathcal{P}(n,k)$ have the form $x = (0, X^T; X, 0)$ with $X \in M_{n-k,k}$, and $B = Z X^T Y - Z Y^T X - X Y^T Z + Y X^T Z$. When $k (n-k) \ge 2$, its sectional curvature at a point P is: $K_P(Z_1, Z_2) = 4 (\text{tr}(Z_1^2 Z_2^2) - \text{tr}((Z_1 Z_2)^2)) / (\text{tr}(Z_1^2) \text{tr}(Z_2^2) - \text{tr}^2(Z_1 Z_2)))$; or more compactly, using the Euclidean metric and the bracket, $K_P(Z_1, Z_2) = 2 \| [Z_1, Z_2] \|_F^2 / (\|Z_1\|_F^2 \|Z_2\|_F^2 - \langle Z_1, Z_2 \rangle_0^2)$. For $\mathcal{P}(1, n)$ (and $\mathcal{P}(n - k, n)$), n > 2, the sectional curvature is always 1. For all other cases with $k (n-k) \ge 2$, the sectional curvature has a range [0, 2].

Skew symmetric matrix manifold $\Omega(n)$ is the set of order-n real skew symmetric matrices, as the embedded Riemannian n(n-1)/2-submanifold of the order-n matrix manifold: $\Omega(n) = \{X \in M_n(\mathbb{R}) : X = - X^T\}$.

Other matrix manifolds:

Euclidean group $\text{SE}(3)$ is the Cartesian product of the order-3 special orthogonal group and the Euclidean 3-space: $\text{SE}(3) = \text{SO}(3) \times \mathbb{R}^3$.
Oblique manifold $\text{OB}_{m,n}$ is the set of m-tuples of points on the (n-1)-sphere that span the Euclidean n-space: $\text{OB}_{m,n} = \{X \in M^∗_{m, n}: \text{diag}(X X^T) = I\}$, or equivalently $\text{OB}_{m,n} = \{X \in \prod_{i=1}^m \mathbb{S}^{n-1} : \text{Span}~X = \mathbb{R}^n\}$.
Essential manifold is the set of essential matrices, i.e. the product of a skew-symmetric matrix and an orthogonal matrix: $E_n = \{\Omega Q : \Omega \in \Omega(n), Q \in O(n)\}$.
Flag manifold $F_K(\mathbb{R}^n)$ is the set of all flags of type K in the Euclidean n-space, i.e. a nested sequence of linear subspaces: given $K = (k_i)_{i=1}^m$, $F_K(\mathbb{R}^n) = \{(S_i)_{i=1}^m : S_i \in G_{k_i,n}, \forall i < j, S_i \subset S_j \}$.
Size-and-shape space $S\Sigma_{k,n}$ of k-tuples in the Euclidean n-space [@Kendall1999, Sec 11.2]: $S\Sigma_{k,n} = \mathbb{R}^{n \times (k-1)} / \text{SO}(n)$; which can be used as an alternative to the positive semi-definite manifold.

Stiefel manifold

Stiefel manifold $V_{k, n}$ is the collection of orthonormal k-frames in the Euclidean n-space, as an embedded Riemannian submanifold of the m-by-k matrix manifold: $V_{k, n} = \{X \in M_{n,k}(\mathbb{R}) : X^T X = I_k\}$, $k \in \{1, \dots, n\}$. It is a submersion level set $F(X) = X^T X - I_k = 0$, and therefore an embedded submanifold [@Absil2008, 3.3.2]. Riemannian metric is induced from Euclidean metric: $g_X(Z, W) = \text{tr}(Z^T W)$. The smooth manifold structure can also be derived from Lie group theory: because O(n) acts smoothly on $M_{n,k}$, the orbit $O(n) \cdot I_{n,p} = V_{k,n}$ of the point $I_{n,p} = (I_p; 0)$ is a properly embedded submanifold of $M_{n,k}$, and it is diffeomorphic to the quotient manifold $O(n) / O(n - k)$; here O(n - k) is used in place of the isotropy group $\{\text{diag}(I_k, R) : R \in O(n-k)\}$ because they are diffeomorphic. If k = 1, it is the (n-1)-sphere: $V_{1,n} = \mathbb{S}^{n-1}$, with dimension (n - 1). If k = n, it coincides with the orthogonal group: $V_{n,n} = O(n)$, with dimension n (n - 1) / 2.

Manifold property. Dimension, k (2n - k - 1) / 2. For O(n), n (n-1) / 2; for (n-1)-sphere, n-1. Compact, because it is diffeomorphic to the quotient manifold $O(n) / O(n-k)$. Connectedness: (n-k-1)-connected (see n-connected space). When k = n, it is O(n), which has two components, with determinant 1 for SO(n) and -1 for the other component. When k = n-1, it is path-connected. When k < n-1, it is simply connected. Homogeneous, because O(n) acts transitively on the left. Complete (if k = n, both components are metrically complete). Isometry group includes O(n) acting on the left, and O(k) acting on the right.

Constructs as a submanifold. Tangent space has the following equivalent forms: (1) as linear constraints, $T_X V_{k, n} = \{Z \in M_{n,k} : X^T Z + Z^T X = 0\}$; (2) in a "constructive" form, $T_X V_{k, n} = \{X \Omega + X_\perp K : \Omega \in \Omega(k), K \in M_{n-k,k} \}$, given an arbitrary orthogonal completion $(X, X_\perp) \in O(n)$; (3) in a projective form (aka tangential projection), $T_X V_{k, n} = \{X~\text{skew}(X^T M) + (I_n - X X^T) M : M \in M_{n,k}\}$, where $\text{skew}(A) = (A - A^T) / 2$. For orthogonal group, $T_Q O(n) = \{Q \Omega : \Omega \in \Omega(n)\}$. Normal space (w.r.t. the canonical metric): (1) in a constructive form, $N_X V_{k, n} = \{X S : S \in \mathcal{S}(k)\}$; (3) in a projective form, $N_X V_{k, n} = \{X~\text{sym}(X^T M) : M \in M_{n,k}\}$, where $\text{sym}(A) = (A + A^T) / 2$.

Canonical metric. The Riemannian metric of the orthogonal group is scaled by 1/2 in some conventions [@Zimmermann2017; @Bendokat2020]: $g_Q(Z, W) = 1/2~\text{tr}(Z^T W)$, where $Z, W \in T_Q O(n)$; because tangent vectors of O(n) has the form $Q \Omega$ with $\Omega \in \Omega(n)$, the metric can also be written as $g_Q(Q \Omega_1, Q \Omega_2) = 1/2~\text{tr}(\Omega_1^T \Omega_2)$. This does not change the geometry of O(n): the tangent vectors are scaled by $1/\sqrt{2}$, and so is the Riemannian distance function. The "canonical metric" on the Stiefel manifold is induced from the orthogonal group via a smooth submersion $\pi_k: O(n) \mapsto V_{k,n}$, which takes an order-n matrix to its first k columns: $\pi_k(Q) = Q I_{n,k} = (q_i)_{i=1}^k$. The horizontal tangent space at Q can be written as: $H^{\pi_k}_Q O(n) = \{Q (\Omega, -K^T; K, 0) : \Omega \in O(k), K \in M_{n-k,k}\}$. The vertical tangent space at Q can be written as: $V^{\pi_k}_Q O(n) = \{Q (0, 0; 0, \bar{\Omega}) :\bar{\Omega} \in O(n-k)\}$. Note that the right-hand side matrices are anti-symmetric (i.e. the outcome is in the tangent space of O(n)). For a tangent vector $Z = X \Omega + X_\perp K \in T_X V_{k,n}$, its horizontal lift to $Q \in O(n)$ via $\pi_k$ can be written as $Z_Q = Q \tilde{\Omega}$, where $\tilde{\Omega} = (\Omega, -K^T; K, 0)$. Therefore the canonical metric of the Stiefel manifold is $g_X(Z_1, Z_2) = g_Q(Z_{1,Q}, Z_{2,Q})$, which equals $1/2~\text{tr}(\Omega_1^T \Omega_2) + \text{tr}(K_1^T K_2)$, and can be computed as $g_X(Z_1, Z_2) = \text{tr}(Z_1^T (I_n - 1/2 X X^T) Z_2)$. Note that the metric on O(n) is consistent with the canonical metric of the Stiefel manifold. The canonical metric gives the Stiefel manifold a different geometry than that of a Riemannian submanifold of a Euclidean space.

Figure: Relations between common matrix manifolds and Grassmannians. Color indicates nature of the Riemannian metric: black, Euclidean; red, submanifold; blue, submersion; purple, submanifold and submersion. Doulbe underline: Riemannian metric has two distinct definitions. Blue arrow: (can be) a Riemannian submersion. Notation: a.e., almost everywhere; SVD, maps M to $U V^T$ where $M = U \Sigma V^T$ is a thin SVD.

Geometric operations. Riemannian metric: $g_X(Z, W) = \text{tr}(Z^T W) = \text{tr}(\Omega_Z^T \Omega_W + K_Z^T K_W)$, where $Z = X \Omega_Z + X_\perp K_Z$ and $W = X \Omega_W + X_\perp K_W$. Covariant derivative (w.r.t. the Levi-Civita / tangential connection): $\nabla_Z W(X) = P_X (\bar{\nabla}_Z W(X))$, where $Z \in T_X V_{k, n}$, $W \in \mathfrak{X}(V_{k, n})$. Riemannian distance function: closed form unknown? Exponential map: $\exp_X(Z) = (X, Z) \exp([X^T Z, -Z^T Z; I_k, X^T Z]) (I_k; 0) \exp(-X^T Z)$ (uses matrix exponential). Parallel transport: closed form unknown. Retractions (mostly based on matrix decompositions): (1) QR decomposition: $R(X, Z) = Q$, where $X + Z = Q R$, R has positive diagonal elements (modified Gram-Schmidt, a finite number of addition, multiplication, division, and square root). (2) Polar decomposition: $R(X, Z) = (X + Z) (I_k - Z^T Z)^{-1/2}$ (iterative $O(k^3)$ for eigen-decomposition + $O(nk^2)$ scalar additions and multiplications). Vector transports (associated with a retraction): (1) by differentiated retraction: $T_W(Z) = R \rho(R^T Z R') + (I - R R^T) Z R'$, where $R = R(X, W)$ is the QR-based retraction, $R' = (R^T (X + W))^{-1}$, and $\rho(A) = L - L^T$ (L is the lower triangle of A); (2) by tangential projection (as a submanifold): $T_W(Z) = (I - Y Y^T) Z + Y~\text{skew}(Y^T Z)$, where $Y = R(X, W)$ is any retraction;

Probability models [@Chikuse2003]: matrix Langevin $L_{k,n}(F)$, aka matrix von Mises-Fisher (vMF), exponential-linear; matrix Bingham $B_{k,n}(B)$, exponential-quadratic; matrix angular central Gaussian $\text{MACG}(\Sigma)$; max entropy with moment constraints [@Pennec2006]. Sampling: geodesic Monte Carlo is applicable [@Byrne2013].

Grassmann manifold

Grassmann manifold or Grassmannian $G_{k, n}$ or $G_k(\mathbb{R}^n)$ is an (abstract) Riemannian manifold, defined as follows: its underlying set consists of k-subspaces of the Euclidean n-space, $G_{k, n} = \{\text{Span}(M) : M \in M^∗_{n, k}\}$, $k \in \{1, \dots, n\}$; its smooth manifold structure is uniquely determined such that group $\text{GL}_n$ is a smooth left action on $G_{k,n}$ [@Lee2012, Ex 21.21]; its Riemannian metric is uniquely determined such that $\text{Span}: V_{k,n} \mapsto G_{k,n}$ is a Riemannian submersion, because the orthogonal group O(k) is an isometric right action on the Stiefel manifold $V_{k,n}$ [@Lee2018, Prob 2-7]. If k = 1, it is the (n-1)-dimensional projective space, i.e. the quotient manifold of lines in the Euclidean n-space: $G_{1,n} = \mathbb{RP}^{n-1}$. If k = n, it is a singleton consisting of the Euclidean n-space: $G_{n,n} = \{\mathbb{R}^n\}$. Note that $G_{k,n}$ and $G_{n-k,n}$ can be identified, because orthgonal complement $\perp: G_{k,n} \mapsto G_{n-k,n}$ is an isometry. Although the Grassmann manifold consists of subspaces rather than matrices, it is closely related to matrix manifolds and sometimes identified with $\mathcal{P}(k,n)$.

Manifold property. Dimension, k (n - k). Compact, because it is diffeomorphic to the quotient manifolds $V_{k, n} / O(k)$ [@Lee2012, Prob 21-13] and $O(n) / (O(n-k) \times O(k))$. Simply connected, except $G_{1, 2} \cong \mathbb{S}^1$; (see e.g. this paper) Symmetric, isotropic (and therefore homogeneous) [@Lee2018, Prob 3-20]. Complete. Isometry group includes O(n) on the left, and $N_G(H)/H$ on the right, where $N_G(H)$ denotes the normalizer of H in G, with G = O(n) and isotropy group $H = O(n-k) \times O(k)$.

Figure: Illustration of unique manifolds in the Stiefel and Grassman family. Legend: $\mathbb{S}$, (n-1)-sphere $\mathbb{S}^{n-1}$; $\mathbb{P}$, real projective (n-1)-space $\mathbb{P}^{n-1}$; O, order-n orthogonal group O(n); dot, the singleton of the entire Euclidean n-space $\{\mathbb{R}^n\}$; circle, the redundant Grassmann with n/2 < k < n; V and G, the "essential" Stiefel and the Grassmann manifolds respectively, which is not a more special manifold and is distinct from other manifolds in the family. Void entries are non-existent.

Representations of the abstract manifold. Because $\text{Span}: V_{k,n} \mapsto G_{k,n}$ and $\text{Span}: M^∗_{n, k} \mapsto G_{k,n}$ are Riemannian submersions, one may use $M \in M^∗_{n, k}$ as a representation of Span(M) to carry out all the computations related to the Grassmann manifold. Its equivalence class is $[M] = \{M A : A \in \text{GL}_k\}$; for $X \in V_{k, n}$, $[X] = \{X Q : Q \in O(k)\}$. For notational simplicity, we may use the equivalence class [M] and abstract element Span(M) interchangeably, when there is no ambiguity. Horizontal tangent space (represents the abstract tangent space; normal space to the fiber at the representation) has the following equivalent forms: (1) as linear constraints, $H_M = \{Z \in M_{n, k} : M^T Z = 0\}$; (2) in a "constructive" form, $H_X = \{X_\perp B : B \in M_{n-k,k}\}$ given an arbitrary orthogonal completion $(X, X_\perp) \in O(n)$; (3) in a projective form (aka horizontal projection), $H_M = \{(I_n - M (M^T M)^{-1} M^T) Z : Z \in M_{n,k}\}$, and for Stiefel representations, $H_X = \{(I_n - X X^T) Z : Z \in M_{n,k}\}$. With two representations M and M' of the same abstract element [M] such that M' = M A where $A \in \text{GL}_k$, if $Z_M, Z_{M'}$ represent the same abstract tangent vector $Z \in T_{[M]} G_{k, n}$, then $Z_{M'} = Z_M A$. Vertical tangent space (analogous to normal space of a submanifold, tangent space to the fiber at the representation) $V_M = \{M A : A \in M_{k, k}\}$; for Stiefel representations, $V_X = \{X \Omega : \Omega \in \Omega(k)\}$. Note that by definition, $V_X = T_X [X]$ and $H_X = V_X^\perp$ w.r.t. $T_X V_{k,n}$.

Representations of an explicit identification. One may identify the underlying sets of the Grassmannian and the rank-k symmetric projection manifold, because $\text{Range}: \mathcal{P}(k, n) \mapsto G_{k,n}$ is a bijection, and so is the map $P: G_{k,n} \mapsto \mathcal{P}(k, n)$ that takes a subspace to its orthogonal projection operator. Let quotient map $\pi: V_{k,n} \mapsto \mathcal{P}(k,n)$ with $\pi(X) = X X^T$. Let quotient map $\pi_k: O(n) \mapsto V_{k,n}$ with $\pi_k(Q) = Q I_{n,k} = (x_i)_{i=1}^k$, which takes the first k columns of a matrix. Without ambiguity, let $\pi: O(n) \mapsto \mathcal{P}(k,n)$ with $\pi(Q) = \pi(\pi_k(Q))$; note that $\pi(Q) = \phi(Q, P_k)$. These quotient maps π are surjective smooth submersions, and as Riemannian submersions, the induced Riemannian metric coincides with (1/2 of) the (restricted) Euclidean metric. One may use $X \in V_{k,n}$ or $Q \in O(n)$ as a representation of $\pi(X) = \pi(Q) \in \mathcal{P}(k,n)$, see e.g. [@Bendokat2020, Fig 2.1].

Geometric operations. In the following, an abstract element of the Grassmannnian (e.g. p) is replaced by a basis or Stiefel representation (e.g. M, X) of the subspace (e.g. [M], [X]), and an abstract tangent vector (e.g. v, Δ) is replaced by its horizontal lift to the corresponding representation (e.g. Z, W). Riemannian metric $g_M(Z, W) = \text{tr}((M^T M)^{-1} Z^T W)$, where $Z, W \in H_M$; for a Stiefel representation, $g_X(Z, W) = \text{tr}(Z^T W)$, which is the same as the Euclidean metric [@Absil2008, Prop 3.4.6, Sec 3.6.2]. Covariant derivative (w.r.t. the Levi-Civita connection) $\overline{\nabla_Z W(X)} = P_X (\bar{\nabla}_Z W(X))$. Exponential map: $\exp_M(Z) = M (M^T M)^{-1/2} W \cos(\Sigma) + U \sin(\Sigma)$; where $Z = U \Sigma W^T$ is a thin SVD; for a Stiefel representation, $\exp_X(Z) = X W \cos(\Sigma) + U \sin(\Sigma)$. Parallel transport along geodesics: given $X \in V_{k, n}$ and $Z \in H_X$, denote $Y = \exp_X(Z)$, the parallel transport map $P^Z_{XY}: H_X \cong H_Y$ can be computed as $P^Z_{XY} = (-X W \sin(\Sigma) + U \cos(\Sigma)) U^T + (I - U U^T)$, where $Z = U \Sigma W^T$ is a thin SVD. Retractions: $R_X(Z) = X + Z$. Vector transports (associated with a retraction): $T^W_{XY} = P_Y$ where $Y = X + W$; the same result by differentiated retraction and by horizontal projection. Principal angles between two subspaces: (1) given Stiefel representations X and Y, $\theta = \arccos(\sigma(X^T Y))$; (2) given the horizontal lift $Z = \log_X(Y) \in H_X$ of the Grassmann logarithm, $\theta = \text{rev}(\sigma(Z))$, where the singular values are reversed because $\theta$ is in non-decreasing order by convention. Riemannian distance function: $d_g([X], [Y]) = \|\theta([X], [Y])\|$. Note that the gap metric $\|X^T X - Y^T Y\| = \sin(\theta_k)$.

Cut locus and conjugate locus. The tangent cut locus and the tangent conjugate locus of any point p on a Grassmann manifold have simple characterizations on the horizontal tangent space of a Stiefel representation X, both expressed in terms of their singular values. The horizontal lift of the tangent cut locus of p is a sphere of radius π/2 in spectral norm: $\text{TCL}(p)_X = \{Z \in H_X : \sigma_1(Z) = \pi / 2\}$. The horizontal lift of the tangent conjugate locus of p corresponds to a countable collection of planes in the space of singular values: (1) if k ≠ n/2, then $\text{Conj}(p)_X = \{Z \in H_X : \sigma(Z) \in S_1 \cup S_2 \}$, where $S_1 = \cup_{i=1}^u \cup_{m=1}^\infty \{\sigma \in S : \sigma_i = m \pi\}$, $S_2 = \cup_{1 \le i < j \le u} \cup_{m=1}^\infty \{\sigma \in S : \sigma_i - \sigma_j = m \pi\}$, and $S = \mathbb{R}^u_{+\downarrow}$, where u = k if k ≤ n/2, and u = n-k if k > n/2; (2) if k = n/2, $S_1$ should be removed. Note that this is essentially proved in [@Bendokat2020, Thm 7.2], but they use conjugate points without reference to geodesics or tangent vectors, which is very confusing. For Grassmann manifolds, an abstract tangent vector has the same length as its horizotnal lift (w.r.t. the Euclidean metric not the canonical metric): let $\Delta \in T_p G_{k,n}$, then $\|\Delta\| = \|\Delta_X\|_F = \|\sigma(\Delta_X)\|$. Therefore, length is preserved if we map the horizontal tangent space $H_X$ to the space $S = \mathbb{R}^u_{+\downarrow}$ of singular values. Within this space, the (tangent) cut locus corresponds exactly to the (u-1)-plane, $\sigma_1 = \pi/2$; and the tangent conjugate locus corresponds exactly to the collection of planes, $S_1 \cup S_2$. In particular, the tangent cut locus of a Grassmann manifold always comes before the tangent conjugate locus, if the latter exists. Therefore, the Grassmann manifold at any point has injectivity radius: inj(p) = π/2; and the maximum distance between any two points is: $\max d_g(p, p') = \sqrt{u} \pi / 2$. The cardinality of shortest geodesic to a cut point is |O(r)|, where r denotes the number of principal angles equal to π/2.

Riemannian logarithm. Given Stiefel representations X and Y for $p, p' \in G_{k,n}$, let $\Delta_X \in H_X$ be the horizontal lift of the generalized Grassmann logarithm $\Delta = \log_p(p')$ to X. A modified algorithm for the (generalized) Grassmann logarithm computes $\Delta_X$, first described in [@Zimmermann2019] with properties in [@Bendokat2020, Thm 5.4, 5.5]: $Y^T X = \tilde{Q} \tilde{S} \tilde{R}^T$, an order-k SVD; $\bar{Y} = Y (\tilde{Q} \tilde{R}^T)$ and $\bar{X} = X (\tilde{R} \tilde{S} \tilde{R}^T)$; $\bar{Y} - \bar{X} = \hat{Q} \hat{S} \hat{R}^T$, an n-by-k thin SVD. Output: let $r = \sum_{i=1}^k 1(\hat{s}_i = 1)$, if r = 0, then $p' \notin \text{Cut}(p)$, and $\Delta_X = \hat{Q} \arcsin(\hat{S}) \hat{R}^T$, which retains the Stiefel representative: $Y = \exp_X(\Delta_X)$; if r > 0, then $p' \in \text{Cut}(p)$, and $\Delta_X = \{\hat{Q} \arcsin(\hat{S}) \text{diag}(W, I_{k−r}) \hat{R}^T : W \in O(r)\}$. Notice that many other quantities can be computed with intermeidate results in this algorithm: (1) principal angles $\theta = \arccos \tilde{s} = \text{rev}(\arcsin \hat{s})$, where rev() denotes reversing the order of a vector; in particular, the singular values of the horizontal lift $\Delta_X$ of the (generalized) Grassmann logarithm are the principal angles (in non-increasing order), $\theta = \text{rev}(\sigma(\Delta_X))$; (2) principal bases $\tilde{X} = X \tilde{Q}$ and $\tilde{Y} = Y \tilde{R}$; (3) Riemannian distance $d_g(p, p') = \|\theta\|$.

Probability models [@Chikuse2003]: matrix Langevin $L^{(P)}_{k,n}(B)$, uses trace; matrix angular central Gaussian $\text{MACG}(\Sigma)$, uses (order-k) determinant.

Matrix-valued Mapping

One may consider matrix-valued mappings $f: X \mapsto M_{m,n}(\mathbb{F})$. Matrix pencil or pencil of matrices is a one-parameter family of matrices, $f: \mathbb{F} \mapsto M_{m,n}(\mathbb{F})$. The most common form is a linear family of square matrices: $f(\lambda) = A + \lambda B$, where $A, B \in M_n$; sometimes it is written as (A, B). Consider that $M_{m,n}(\mathbb{R}) \cong \mathbb{R}^{m,n}$, the matrix pencil $A + \lambda B$ is equivalent to a line in the matrix manifold $M_{m,n}$. More generally, matrix pencil of degree l is a degree-l polynomial family of matrices: $f(\lambda) = \sum_{i=0}^l \lambda^i A_i$, where $A_l \ne 0$. Matrix pencils have applications in numerical linear algebra, control theory, etc.

Regular matrix pencil is one whose value is not always singular. Singular matrix pencil is one that is not regular. Eigenvalue of a matrix pencil is a complex number where the value of the matrix pencil is singular: $f(\lambda) \notin \text{GL}_n$. This may be the reason that matrix pencils use $\lambda$ as the variable.

🏷 Category=Algebra Category=Topology