This article discusses mappings whose domain or codomain contains a space of matrices.

Matrix-variate function

The invariant theory of matrices studies matrix-variate functions that are invariant under the transformations of a given linear group. Common invariant functions are polynomials in the matrix entries; note that these are different from matrix polynomials.

A conjugation-invariant function, also known as a similarity-invariant function, of a square matrix is a function that is invariant under similarity transforms: $f: M_{n,n}(\mathbb{C}) \mapsto \mathbb{C}$, $f(A) = f(P A P^{-1})$, $\forall P \in \text{GL}_n$. A continuous function $f \in C^0(M_{n,n}(\mathbb{C}))$ is conjugation-invariant if and only if $f(A B) = f(B A)$ for all $A, B$. A function of square matrices is conjugation-invariant if and only if it can be written as a function of the eigenvalues (counted with algebraic multiplicity): for any $A \in M_{n,n}(\mathbb{C})$, let $A = P J P^{-1}$ be a Jordan canonical form and denote the eigenvalues by $\lambda = \text{diag}(J)$; then there is an n-variate symmetric function $g: \mathbb{C}^n \mapsto \mathbb{C}$ such that $f(A) = g(\lambda)$. Some common similarity/conjugation-invariant functions: (1) trace, $\text{tr}(A) = \sum_{i=1}^n \lambda_i$, computable in n-1 additions; (2) determinant, $\det(A) = \prod_{i=1}^n \lambda_i$, computable in $\mathcal{O}(n^\alpha)$ flops, where the exponent $\alpha \in (2, 3]$ is the same as that of matrix multiplication, whose current asymptotic upper bound is about 2.373 [@LeGall2014]; (3) trace of a power, $\text{tr}(A^k) = \sum_{i=1}^n \lambda_i^k$, where $k \in \mathbb{N}$, computable in $\mathcal{O}(k n^\alpha)$ flops via (k-1) matrix multiplications; (4) spectral radius, $\rho(A) = \max(|\lambda_i|)_{i=1}^n$, computable e.g. by power iteration at $\mathcal{O}(n^2)$ flops per iteration.
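
As a minimal sanity check (an illustrative numpy sketch, not part of the source), the invariants above agree on a matrix and on a random similarity transform of it, and match the corresponding functions of the eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
P = rng.standard_normal((n, n))           # almost surely invertible
B = P @ A @ np.linalg.inv(P)              # a similarity transform of A

lam = np.linalg.eigvals(A)                # eigenvalues with multiplicity

# each invariant agrees on A and B, and is a function of the eigenvalues
assert np.isclose(np.trace(A), np.trace(B))
assert np.isclose(np.trace(A), lam.sum().real)
assert np.isclose(np.linalg.det(A), np.linalg.det(B))
k = 3
assert np.isclose(np.trace(np.linalg.matrix_power(A, k)),
                  (lam**k).sum().real, atol=1e-6)
rho = np.abs(lam).max()                   # spectral radius
assert np.isclose(rho, np.abs(np.linalg.eigvals(B)).max())
```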

An orthogonal-invariant function of a real matrix, or a unitary-invariant function of a complex matrix, is a function that is invariant under the left and right actions of orthogonal (resp. unitary) matrices: $f: M_{m,n} \mapsto \mathbb{F}$, $f(M) = f(Q_m M Q_n)$, $\forall Q_m \in O(m)$, $\forall Q_n \in O(n)$. A matrix-variate function is orthogonal/unitary-invariant if and only if it can be written as a function of the singular values: for any $M \in M_{m,n}(\mathbb{F})$, $m \ge n$, let $M = U \Sigma V^*$ be a full singular value decomposition and denote the singular values by $\sigma = \text{diag}(\Sigma) \in \mathbb{R}^n_{+\downarrow}$; then there is an n-variate function $g: \mathbb{R}^n_{+\downarrow} \mapsto \mathbb{R}$ such that $f(M) = g(\sigma)$. Some common orthogonal/unitary-invariant functions: (1) Frobenius/Euclidean norm, $\|M\|_F = (\sum_{i=1}^n \sigma_i^2)^{1/2}$, computable in (m n) multiplications, (m n - 1) additions, and a square root; (2) spectral norm, $\|M\|_2 = \sigma_1$, computable e.g. by power iteration on $M^* M$ at $\mathcal{O}(m n)$ flops per iteration. Note that the condition number of a nonsingular matrix w.r.t. the spectral norm, $\kappa(A) = \|A\|_2 \|A^{-1}\|_2$ where $A \in \text{GL}_n$, equals the ratio of the largest and the smallest singular values, $\kappa(A) = \sigma_1 / \sigma_n$, and is therefore orthogonal-invariant. If A is a normal matrix, then $\sigma_1 = \rho(A)$ and $\sigma_n = 1/\rho(A^{-1})$, so $\kappa(A) = \rho(A) \rho(A^{-1})$ is a function of the eigenvalues alone, and the condition number is conjugation-invariant.
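
A similar illustrative numpy sketch (not from the source): the singular values, Frobenius norm, and spectral norm are unchanged by random orthogonal factors generated via QR of Gaussian matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 4
M = rng.standard_normal((m, n))

# random orthogonal factors via QR of square Gaussian matrices
Qm, _ = np.linalg.qr(rng.standard_normal((m, m)))
Qn, _ = np.linalg.qr(rng.standard_normal((n, n)))
N = Qm @ M @ Qn

s = np.linalg.svd(M, compute_uv=False)    # singular values, descending

assert np.allclose(s, np.linalg.svd(N, compute_uv=False))
assert np.isclose(np.linalg.norm(M, 'fro'), np.sqrt((s**2).sum()))
assert np.isclose(np.linalg.norm(M, 2), s[0])   # spectral norm = sigma_1
```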

The hypergeometric function with a matrix argument ${}_{p}F_{q}: \mathbb{R}^p \times \mathbb{R}^q \times \mathcal{S}(n) \mapsto \mathbb{R}$, where $p, q \in \mathbb{N}$, is a family of functions recursively defined via the multivariate Laplace and inverse Laplace transforms [@Herz1955]: let ${}_{0}F_{0}(S) = \exp(\text{tr}(S))$, $a = (a_i)_{i=1}^p$, and $b = (b_j)_{j=1}^q$, define ${}_{p+1}F_{q}((a, c), b, Y) = \int_{\mathcal{S}_+(n)} |S|^{c - (n+1)/2} {}_{0}F_{0}(-S) {}_{p}F_{q}(a, b, Y S) (d S)$ and ${}_{p}F_{q+1}(a, (b, c), S) = (2 \pi i)^{-n(n+1)/2} 2^{n(n-1)/2} \Gamma_n(c) \int_{\mathcal{S}_+(n)} |Y|^{-c} {}_{0}F_{0}(Y) {}_{p}F_{q}(a, b, S Y^{-1}) (d Y)$, where (d S) denotes the Lebesgue measure. It can also be defined for complex arguments and values, but in the second recursive formula, only the real part of Y needs to be positive-definite. Special cases: ${}_{1}F_{0}(a, S) = |I_n - S|^{-a}$; ${}_{0}F_{1}(m/2, X X^T) = \int_{O(m)} \exp(\text{tr}(2 X H)) [d H]$, where [d H] denotes the normalized invariant measure on O(m). It has a series representation using zonal polynomials [@Constantine1963]. Let $\lambda \vdash l$ denote that λ is an ordered partition of a natural number l: $\lambda \in \mathbb{Z}^m_{+\downarrow} \cap (l \Delta^{m-1})$, $m \le l \in \mathbb{N}$; note that with $\lambda = (l_i)_{i=1}^m$, we have $l_1 \ge \cdots \ge l_m > 0$ and $\sum_{i=1}^m l_i = l$. Let the generalized hypergeometric coefficient be $(a)_\lambda = \prod_{i=1}^m (a - (i-1)/2)_{l_i}$, where a is a scalar, $(a)_l = \prod_{i=0}^{l-1} (a + i) = {}_{a+l-1} P_{l}$ equals the number of l-permutations of a+l-1 (for integer a), and $(a)_0 = 1$. A hypergeometric function with a matrix argument can then be written as: ${}_{p}F_{q}(a, b, S) = \sum_{l=0}^\infty \sum_{\lambda \vdash l} \prod_{i=1}^p (a_i)_\lambda \prod_{j=1}^q (b_j)_\lambda^{-1} C_\lambda(S) / l!$, where $C_\lambda: \mathcal{S}_+(n) \mapsto \mathbb{R}$ is a normalized zonal polynomial. The computation of hypergeometric functions with a matrix argument is intractable in general: with an order-n Hermitian matrix argument, its m-th order truncation can be computed in $\mathcal{O}(P_{mn}^2 n)$ time, where $P_{mn}^2 = \mathcal{O}(\exp(2\pi \sqrt{2m/3}))$ is sub-exponential [@Koev2006].
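
Zonal polynomials make the general series hard to demonstrate compactly, but for $n = 1$ the matrix-argument series reduces to the classical scalar series, since the only partition of l into at most one part is $\lambda = (l)$ and $C_{(l)}(s) = s^l$. A minimal Python sketch (illustrative; `rising` and `hyp1f0` are hypothetical helpers, not library routines) checks the special case ${}_{1}F_{0}(a, x) = (1-x)^{-a}$ for $|x| < 1$:

```python
import math

def rising(a, l):
    """Rising factorial (a)_l = a (a+1) ... (a+l-1)."""
    out = 1.0
    for i in range(l):
        out *= a + i
    return out

def hyp1f0(a, x, terms=60):
    """Truncated classical series 1F0(a; x) = sum_l (a)_l x^l / l!,
    the n = 1 specialization of the matrix-argument series."""
    return sum(rising(a, l) * x**l / math.factorial(l) for l in range(terms))

a, x = 1.5, 0.3
assert abs(hyp1f0(a, x) - (1 - x) ** (-a)) < 1e-10
```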

Matrix function

A matrix function is a transformation on a space of square matrices: $f: M_{n,n} \mapsto M_{n,n}$. Every analytic function induces a matrix function via its power series, at least on matrices whose spectrum lies within the radius of convergence; see e.g. @Higham2008. A matrix polynomial $p(A)$ is a polynomial in a square matrix A, e.g. evaluated by Horner's rule as sketched below.
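
A minimal sketch of matrix-polynomial evaluation by Horner's rule (`matrix_polynomial` is a hypothetical helper for illustration, not a library routine); the test case uses the Cayley-Hamilton theorem, under which a matrix's characteristic polynomial annihilates it:

```python
import numpy as np

def matrix_polynomial(coeffs, A):
    """Evaluate p(A) = c_0 I + c_1 A + ... + c_d A^d by Horner's rule,
    using d matrix multiplications. coeffs = [c_0, ..., c_d]."""
    n = A.shape[0]
    result = coeffs[-1] * np.eye(n)
    for c in reversed(coeffs[:-1]):
        result = result @ A + c * np.eye(n)
    return result

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
# p(x) = x^2 + 3x + 2 is the characteristic polynomial of A, so p(A) = 0
assert np.allclose(matrix_polynomial([2.0, 3.0, 1.0], A), 0.0)
```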

The matrix exponential $e^A$ of an n-by-n matrix is the n-by-n matrix defined by the Taylor series of the exponential function, substituting the variable with the matrix: $e^A = \sum_{k=0}^{\infty} A^k / k!$. Given a Jordan decomposition $A = T J T^{-1}$, the matrix exponential reduces to the exponential of the Jordan canonical form: $e^A = T e^J T^{-1}$. For example, $\exp\left[t \begin{pmatrix}\lambda & 1 \\ 0 & \lambda \end{pmatrix}\right] = e^{\lambda t} \begin{pmatrix}1 & t \\ 0 & 1 \end{pmatrix}$, and for the real canonical form of a complex-conjugate eigenvalue pair $a \pm i b$, $\exp\left[t \begin{pmatrix}a & -b \\ b & a \end{pmatrix}\right] = e^{a t} \begin{pmatrix}\cos bt & -\sin bt \\ \sin bt & \cos bt \end{pmatrix}$.
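
Both closed forms can be checked against scipy.linalg.expm (an illustrative sketch; the numerical values are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

t, lam = 0.7, -0.4
J = np.array([[lam, 1.0], [0.0, lam]])            # a 2x2 Jordan block
assert np.allclose(expm(t * J),
                   np.exp(lam * t) * np.array([[1.0, t], [0.0, 1.0]]))

a, b = 0.5, 2.0
R = np.array([[a, -b], [b, a]])                   # eigenvalues a +/- ib
assert np.allclose(expm(t * R),
                   np.exp(a * t) * np.array([[np.cos(b * t), -np.sin(b * t)],
                                             [np.sin(b * t),  np.cos(b * t)]]))
```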

A matrix logarithm $\log(A)$ is any matrix $X$ such that $e^X = A$; since the matrix exponential is not injective, the logarithm is multivalued, and a unique principal logarithm exists when A has no eigenvalues on the closed negative real axis.
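
A minimal round-trip sketch using scipy.linalg.logm and expm (illustrative; the 0.1 scaling is an assumption that keeps the spectrum small enough for the principal logarithm to recover X):

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(2)
X = 0.1 * rng.standard_normal((3, 3))   # small, so exp(X) is near I and
A = expm(X)                             # has no eigenvalues on (-inf, 0]
assert np.allclose(logm(A), X)          # principal logarithm recovers X
```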