Notes on linear algebra follows [@Horn & Johnson, 1990. Matrix Analysis.]

Scanned course notes: Notes on linear algebra; Notes on advanced algebra; Notes on vector/linear space;

## Symbols

• $\bar{\mathbb{F}}$: algebraically closed field, e.g. $\bar{\mathbb{Q}}$, $\bar{\mathbb{R}}$, and $\bar{\mathbb{C}}$.
• $\mathbb{R}^n, \mathbb{C}^n$: $n$-dementional vector spaces based on $\mathbb{R}$ and $\mathbb{C}$.
• $\mathcal{B}$, $\{ e_i \}_{i=1}^n$: a basis of a vector space; the standard basis of $\mathbb{R}^n$ or $\mathbb{C}^n$.
• $[v]_{\mathcal{B}}$, $_{\mathcal{B}_1}[T]_{\mathcal{B}_2}$: coordinate representation of a vector in basis $\mathcal{B}$ / a linear transformation in bases $\mathcal{B}_1$ - $\mathcal{B}_2$.
• $A$, $[A]_{I,J}$: a matrix; a submatrix with rows and columns from index sets $I$ and $J$ (default to $I = J$).
• $M_{m,n}(\mathbb{F})$, $M_n(\mathbb{F})$: the set of all $m$-by-$n$ / $n$-by-$n$ matrices over field $\mathbb{F}$, which defaults to $\mathbb{C}$.
• $GL(n, \mathbb{F})$: the general linear group formed by nonsingular matrices in $M_n(\mathbb{F})$.
• $\mathcal{F}$, $\mathcal{U}$: a family of matrices; the set of all unitary matrices, which is a compact subgroup of $GL(n, \mathbb{C})$.
• $A \succeq 0$, $A \ge 0$: positive semi-definite matrix; nonnegative matrix.
• $|A|$: matrix of absolute values of entries of a matrix.
• $A^T, \bar{A}, A^∗, A^\dagger$: transpose / conjugate / Hermitian adjoint (conjugate transpose) / Moore-Penrose pesudoinverse of a matrix.
• $A^{-1}$: inverse of a nonsingular matrix.
• $A^{1/2}$: the unique positive semidefinite square root of a positive semidefinite matrix.
• $\sigma(A)$, $\sigma_i(A)$: the set of singular values of a matrix; the $i$-th largest singular value.
• $\lambda(A)$, $\lambda_i(A)$: spectrum, the set of eigenvalues of a square matrix; the $i$-th eigenvalue, in increasing order if $A$ is Hermitian.
• $\rho(A)$: spectral radius of a square matrix, the largest modulus of its eigenvalues.
• $\mathrm{tr} A$, $\mathrm{det} A$, $\mathrm{adj} A$: trace / determinant / classical adjoint (adjugate) of a square matrix.
• $p_A(t), q_A(t)$: the characteristic / minimal polynomial of a square matrix.

## Vector Space and $\mathbb{F}^n$

Vector space (or linear space) $V$, the fundamental setting for matrix theory, a set closed in linear combination (addition, scalar multiplication) over a (scalar) field, e.g. $\mathbb{R}^n$, $\mathbb{C}^n$, $C^0[0,1]$. Scalar field (域) $\mathbb{F}$, which admits addition and multiplication (with inverses), e.g.$\mathbb{R}$, $\mathbb{C}$.

Subspace $U$ is a subset of $V$ that is also a vector space. Linear span (线性生成空间) $\mathrm{span}(S)$ of a set of vectors $S$, all vectors that can be written as (finite) linear combinations of the set.

Linear independence of a set of vectors, non-zero linear combinations of the set must be non-zero.

Basis $\mathcal{B}$, a linearly independent set such that $\mathrm{Span}(\mathcal{B}) = V$. Dimension $\mathrm{dim}(V)$ of a vector space is the size of any basis of the vector space. The basis may be finite, countably infinite, or uncountably infinite; so is the dimension of the vector space.

Table: The intersection of linearly independent sets and span-to-space sets are bases.

Elements $1, \cdots, \mathrm{dim}(V)$ $\mathrm{dim}(V)$ $\mathrm{dim}(V), \cdots$
Sets linearly independent basis span to $V$

Isomorphism (同构) are vector spaces that can be related by invertible linear functions. For finite dimensional vector spaces $U, V$ over the same field, $U \cong V \iff \dim U= \dim V$. For any basis $\mathcal{B}$ of an $n$-dimensional vector space $V$ over field $\mathbb{F}$, the mapping from a vector to its "coordinates" $f: v \rightarrow [v]_{\mathcal{B}}$ is an isomorphism between $V$ and $\mathbb{F}^n$. The above feature is analogous to random variables. It justifies the re-abstraction (second order abstration) from vector space $V$ to $\mathbb{F}^n$, especially $\mathbb{R}^n$ and $\mathbb{C}^n$.

## Euclidean Space

Euclidean spaces are generalizations of two- and three-dimensional spaces of Euclidean geometry to arbitrary finite-dimensional spaces. It is the abstraction of a geometric object into a topological and algebraic structure. Many concepts in algebra and analysis are developed by analogy with geometry, where Euclidean space is among the first few.

Euclidean space $\mathbb{R}^n$ is the set of all $n$-tuples of real numbers $(x_1,\cdots, x_n)$, where inner product $x \cdot y = \sum_{i=1}^n x_i y_i$.

Length. Angle. Orthogonal.

Orthonormal basis. Gram-Schmidt orthogonalization/orthonormalization, QR decomposition. Orthogonal transformation: rotation, reflection; improper rotation.

Space decomposition. Orthogonal subspace. Direct sum. Orthogonal complement.

Isometry (等距同构), or congruent transformation, is a distance-preserving (bijective) transformation between two metric spaces: given metric spaces $(X, d_x)$ and $(Y, d_y)$, bijection $f: X \to Y$ is an isometry iff $d_x(x_1, x_2) = d_y(f(x_1), f(x_2)), \forall x_1, x_2 \in X$. Two metric spaces are isometric iff there exists an isometry between them.

Theorem: Any finite-dimensional linear inner-product space over the real numbers is isometric with the Euclidean space of the same dimension. $V \cong \mathbb{R}^n$, where vector space $V$ over $\mathbb{R}$ has $\mathrm{dim}(V) = n$ and inner product $\langle \cdot, \cdot \rangle$.

## Linear Transformation and Matrix

Matrix is a rectangular array of scalars in $\mathbb{F}$, which can also be seen as a linear transformation between two vector spaces based on $\mathbb{F}$. In the second perspective, matrices $M_{m,n}(\mathbb{F})$ are all the linear (vector-valued) functions $f: \mathbb{F}^n \to \mathbb{F}^m$.

Matrix multiplication is the composition of linear transformations, which is associative. Matrix addition is the addition of linear transformations.

Scalar matrices are scalar multiple of the identity matrix.

## Four Fundamental Subspaces

For a linear transformation $A \in M_{m,n}(\mathbb{R})$ with rank $r$ and singular value decomposition $A = U \Sigma V^T$, its four fundamental subspaces are summarized in the following table.

Table: Four fundamental subspaces of a linear transformation

Image Kernel Coimage Cokernel
Alternative name Column space Null space Row space Left null space
Notation $\mathrm{im}(A)$ $\mathrm{ker}(A)$ $\mathrm{im}(A^T)$ $\mathrm{ker}(A^T)$
Alt notation $\mathrm{range}(A)$ $\mathrm{null}(A)$ $\mathrm{range}(A^T)$ $\mathrm{null}(A^T)$
Containing Space $\mathbb{R}^m$ $\mathbb{R}^n$ $\mathbb{R}^n$ $\mathbb{R}^m$
Dimension $r$ $n-r$ $r$ $m-r$
Basis $\{u_i\}_{i=1}^r$ $\{v_i\}_{i=r+1}^n$ $\{v_i\}_{i=1}^r$ $\{u_i\}_{i=r+1}^m$

The fundamental theorem of linear algebra:

1. The kernel is the orthogonal complement of the coimage: $\mathrm{ker}(A)^\perp = \mathrm{im}(A^T)$
2. The cokernel is the orthogonal complement of the image: $\mathrm{ker}(A^T) = \mathrm{im}(A)^\perp$

## Matrix Decomposition/Factorization

Matrix decompositions/factorizations express a matrix as the product of smaller or simpler matrices.

• QR decomposition ~ Gram–Schmidt orthogonalization process: $A = Q R$, where $A \in M_{m,n}$, $Q \in M_{m,n}$ is orthogonal/unitary, $R \in M_n$ is upper-triangular; $\{ a_1, \cdots, a_n \} \to \{ q_1, \cdots, q_n \}$;
• LU (LUP) decomposition ~ Gaussian elimination for solving systems of linear equations: $A = L U$ where $L$ is lower-triagular and $U$ is upper-triangular, if $A \in M_n$ and $\mathrm{det}[A]_{1 \dots i} \ne 0, i = 1, \cdots, n$;
• Cholesky decomposition: $A = L L^∗$ where $L$ is lower-triangular, if $A \succeq 0$;
• Jordan decomposition ~ eigenvalue problem: $A = P J P^{-1}$ where $A \in M_n$, $J$ is a Jordan matrix, and $P$ is invertible; if $A$ is real with only real eigenvalues, then $P$ can be real;
• Eigendecomposition (spectral decomposition): $A = P \Lambda P^{-1}$ where $\Lambda$ is diagonal, if $A$ has $n$ linearly independent eigenvectors;
• $A = U \Lambda U^∗$ where $U$ is orthogonal/unitary, if $A$ is a normal matrix (e.g. orthogonal/unitary, (skew-)symmetric/Hermitian);
• QZ decomposition ~ generalized eigenvalue problem;
• Singular value decomposition (SVD) ~ principal components analysis, latent factor analysis: $A = V \Sigma W^∗$, where $A \in M_{m,n}$, $V \in M_m, W \in M_n$ are orthogonal/unitary, and $\Sigma \in M_{m,n}$ is non-negative and has diagonal entries only;

## Eigenvalue Problem and Similarity

An eigenvector of a square matrix is a vector that remains in its linear span after the linear transformation: $A x = \lambda x$. The eigenvalue $\lambda$ that corresponds to an eigenvector is the equivalent scaling when the eigenvector is applied to the linear transformation. The eigenspace that corresponds to an eigenvalue is the subspace where the linear transformation is equivalent to scaling by the eigenvalue: $\{x \mid A x = \lambda x\}$.

Jordan Canonical Form: Any square matrix is similar to a (complex) Jordan matrix, unique up to permutation of Jordan blocks: $A = P J P^{-1}$. In this norm, $\{ p_{1 + n_i} \}_{i = 0}^{k-1}$ is a basis for the direct sum of all the eigenspaces of $A$, which correspond to eigenvalues $\{\lambda_i\}_{i = 1}^k$ respectively. A family of similar matrices, i.e. an equivalence class under similarity, have the same Jordan canonical form.

A Jordan block $J_k(\lambda_i)$ is a $k \times k$ matrix with $\lambda_i$ on the diagonal and $1$ on the super diagonal: $J_k(\lambda_i) = \lambda_i I_k + U_k$, where $U_k$ is the upper/backward shift matrix $U_k = [0, I_{k-1}; 0, 0]$; specially, $J_1(\lambda_i) = [\lambda_i]$. A Jordan matrix is a direct sum of Jordan blocks, i.e. a block diagonal matrix consisting of Jordan blocks: $J = \mathrm{diag}\{ J_{n_i}(\lambda_i) \}_{i=1}^k$, $\sum_{i=1}^k n_i = n$.

The (algebraic) multiplicity of eigenvalue $\lambda_i$ is the multiplicity of it as a zero of the characteristic polynomial $p_A(t)$, i.e. the number of diagonal entries in the Jordan matrix that equal to $\lambda_i$. The geometric multiplicity of eigenvalue $\lambda_i$ is the maximum number of linearly independent eigenvectors associated with it, i.e. the number of Jordan blocks in the Jordan matrix that correspond to $\lambda_i$.

Location and perturbation of eigenvalues.

Two square matrices $A, B \in M_n$ are similar if there is a nonsingular matrix $S \in GL(n)$ s.t. $A = S B S^{-1}$. The transformation $B \to S B S^{-1}$ is called a similarity transformation by similarity matrix $S$. A square matrix is diagonalizable if it is similar to a diagonal matrix, i.e. all Jordan blocks are size one.

A square matrix is invertible iff its eigenvalues do no include zero. A square matrix is diagonalizable iff its eigenvectors can form a basis.