Notes on linear algebra follows [@Horn & Johnson, 1990. Matrix Analysis.]

Scanned course notes: Notes on linear algebra; Notes on advanced algebra; Notes on vector space;


  • $\bar{\mathbb{F}}$: algebraically closed field, e.g. $\bar{\mathbb{Q}}$, $\bar{\mathbb{R}}$, and $\bar{\mathbb{C}}$.
  • $\mathbb{R}^n, \mathbb{C}^n$: $n$-dementional vector spaces based on $\mathbb{R}$ and $\mathbb{C}$.
  • $\mathcal{B}$, $\{ e_i \}_{i=1}^n$: a basis of a vector space; the standard basis of $\mathbb{R}^n$ or $\mathbb{C}^n$.
  • $[v]_{\mathcal{B}}$, $_{\mathcal{B}_1}[T]_{\mathcal{B}_2}$: coordinate representation of a vector in basis $\mathcal{B}$ / a linear operator in bases $\mathcal{B}_1$ - $\mathcal{B}_2$.
  • $A$, $[A]_{I,J}$: a matrix; a submatrix with rows and columns from index sets $I$ and $J$ (default to $I = J$).
  • $M_{m,n}(\mathbb{F})$, $M_n(\mathbb{F})$: the set of all $m$-by-$n$ / $n$-by-$n$ matrices over field $\mathbb{F}$, which defaults to $\mathbb{C}$.
  • $GL(n, \mathbb{F})$: the general linear group formed by nonsingular matrices in $M_n(\mathbb{F})$.
  • $\mathcal{F}$, $\mathcal{U}$: a family of matrices; the set of all unitary matrices, which is a compact subgroup of $GL(n, \mathbb{C})$.
  • $A \succeq 0$, $A \ge 0$: positive semi-definite matrix; nonnegative matrix.
  • $|A|$: matrix of absolute values of entries of a matrix.
  • $A^T, \bar{A}, A^∗, A^\dagger$: transpose / conjugate / Hermitian adjoint (conjugate transpose) / Moore-Penrose pesudoinverse of a matrix.
  • $A^{-1}$: inverse of a nonsingular matrix.
  • $A^{1/2}$: the unique positive semidefinite square root of a positive semidefinite matrix.
  • $\sigma(A)$, $\sigma_i(A)$: the set of singular values of a matrix; the $i$-th largest singular value.
  • $\lambda(A)$, $\lambda_i(A)$: spectrum, the set of eigenvalues of a square matrix; the $i$-th eigenvalue, in increasing order if $A$ is Hermitian.
  • $\rho(A)$: spectral radius of a square matrix, the largest modulus of its eigenvalues.
  • $\mathrm{tr} A$, $\mathrm{det} A$, $\mathrm{adj} A$: trace / determinant / classical adjoint (adjugate) of a square matrix.
  • $p_A(t), q_A(t)$: the characteristic / minimal polynomial of a square matrix.

Vector Space

Vector space (or linear space) $(V, (+, \cdot_{\mathbb{F}}))$ is a set $V$ closed in linear combination over a (scalar) field $(\mathbb{F}, (+, \cdot))$ (vector addition $+$ and scalar multiplication $\cdot_{\mathbb{F}}$). For example, $\mathbb{R}^n$ and $C^0[0,1]$ are vector spaces with the underlying field $\mathbb{R}$.

Subspace $U$ is a subset of $V$ that is also a vector space. Linear span (线性生成空间) $\mathrm{span}(S)$ of a set of vectors $S$, all vectors that can be written as (finite) linear combinations of the set.

Linear independence of a set of vectors, non-zero linear combinations of the set must be non-zero.

Basis $\mathcal{B}$, a linearly independent set such that $\mathrm{Span}(\mathcal{B}) = V$. Dimension $\mathrm{dim}(V)$ of a vector space is the size of any basis of the vector space. The basis may be finite, countably infinite, or uncountably infinite; so is the dimension of the vector space.

Finite Dimensional Vector Space and $\mathbb{F}^n$

Linear algebra deals with finite dimensional vector spaces.

Table: The intersection of linearly independent sets and span-to-space sets are bases.

Elements $1, \cdots, \mathrm{dim}(V)$ $\mathrm{dim}(V)$ $\mathrm{dim}(V), \cdots$
Sets linearly independent basis span to $V$

Isomorphism (同构) are vector spaces that can be related by invertible linear functions. Finite-dimensional vector spaces $U$ and $V$ over the same field $\mathbb{F}$ are isomorphic iff they have the same dimension: $U \cong V \iff \dim U = \dim V$; in particular, $V \cong \mathbb{F}^n$ where $n = \dim V$. For any basis $\mathcal{B}$ of an $n$-dimensional vector space $V$ over field $\mathbb{F}$, the mapping $f: v \rightarrow [v]_{\mathcal{B}}$ from a vector to its "coordinates" is an isomorphism between $V$ and $\mathbb{F}^n$. The above feature is analogous to random variables. It justifies the re-abstraction (second-order abstration) from vector space $V$ to $\mathbb{F}^n$, especially $\mathbb{R}^n$ and $\mathbb{C}^n$.

Euclidean Space

Euclidean space $(\mathbb{R}^n, (\cdot, \cdot))$ is the set $\mathbb{R}^n$ of all $n$-tuples of real numbers $(x_i)_{i=1}^n, x_i \in \mathbb{R}$ with inner product $x \cdot y = \sum_{i=1}^n x_i y_i$. Euclidean spaces are generalizations of two- and three-dimensional spaces of Euclidean geometry to arbitrary finite-dimensional spaces. It is the abstraction of a geometric object into a topological and algebraic structure. Many concepts in algebra and analysis are developed by analogy with geometry, where Euclidean space is among the first few.

Notes on Euclidean space

Length. Angle. Orthogonal.

Orthonormal basis. Gram-Schmidt orthogonalization/orthonormalization, QR decomposition. Orthogonal transformation: rotation, reflection; improper rotation.

Space decomposition. Orthogonal subspace. Direct sum. Orthogonal complement.

Isometry (等距同构), or congruent transformation, is a distance-preserving (bijective) mapping between two metric spaces: given metric spaces $(X, d_x)$ and $(Y, d_y)$, bijection $f: X \to Y$ is an isometry iff $d_x(x_1, x_2) = d_y(f(x_1), f(x_2)), \forall x_1, x_2 \in X$. Two metric spaces are isometric iff there exists an isometry between them.

Theorem: Any finite-dimensional inner product space over the real numbers is isometric with the Euclidean space of the same dimension. $V \cong \mathbb{R}^n$, where vector space $V$ over $\mathbb{R}$ has $\mathrm{dim}(V) = n$ and inner product $\langle \cdot, \cdot \rangle$.

Linear Operator and Matrix

Linear operator $A: X \to Y$ is an operator between two vector spaces $(X, (+, \cdot_{\mathbb{F}}))$ and $(Y, (+, \cdot_{\mathbb{F}}))$ over a field $\mathbb{F}$, which is compatible with their algebraic structures (vector addition, scalar multiplication): $A(x + y) = Ax + Ay$; $A(a x) = a Ax$. Linear transformation $A: X \to X$ is a linear operator from a vector space $X$ to itself. Linear functional $\alpha: X \to \mathbb{F}$ is a linear operator from a vector space $X$ of functions to its scalar field $\mathbb{F}$. The study of linear operators on finite-dimensional spaces is called Matrix Theory. Examples of linear operators include systems of linear equations, coordinate transformations, differentiation, (indefinite) integration, and the Fourier integral transform.

All linear operators between two given vector spaces $X$ and $Y$ over a scalar field $\mathbb{F}$ form a vector space $\mathcal{L}(X,Y)$. The vector space of linear transformations on $X$ is written as $\mathcal{L}(X)$.

An important question in linear operator theory is classifying endomorphisms (linear transformations) $\mathcal{L}(X)$ w.r.t. some equivalence concept, e.g. similarity, topological equivalence, unitary equivalence.

Matrix is a rectangular array of scalars in $\mathbb{F}$, which can also be seen as the coordinate representation of a linear operator between two vector spaces with the underlying field $\mathbb{F}$, given certain bases. In the second perspective, the class $M_{m,n}(\mathbb{F})$ of matrices are all the linear operators $A: \mathbb{F}^n \to \mathbb{F}^m$. Matrix addition $A+B$ is the addition of linear operators. Matrix multiplication $B A$ is the composition of linear operators, which is associative. Scalar matrices $a I$ are scalar multiple of the identity matrix. Matrix direct sum $A \oplus B$ is the block diagonal matrix $\mathrm{diag}(A, B)$. Kronecker product... Kronecker sum...

Four Fundamental Subspaces of a Linear Operator

For a linear operator $A \in M_{m,n}(\mathbb{R})$ with rank $r$ and singular value decomposition $A = U \Sigma V^T$, its four fundamental subspaces are summarized in the following table.

Table: Four fundamental subspaces of a linear operator

Image Kernel Coimage Cokernel
Alternative name Column space Null space Row space Left null space
Notation $\mathrm{im}(A)$ $\mathrm{ker}(A)$ $\mathrm{im}(A^T)$ $\mathrm{ker}(A^T)$
Alt notation $\mathrm{range}(A)$ $\mathrm{null}(A)$ $\mathrm{range}(A^T)$ $\mathrm{null}(A^T)$
Containing Space $\mathbb{R}^m$ $\mathbb{R}^n$ $\mathbb{R}^n$ $\mathbb{R}^m$
Dimension $r$ $n-r$ $r$ $m-r$
Basis $\{u_i\}_{i=1}^r$ $\{v_i\}_{i=r+1}^n$ $\{v_i\}_{i=1}^r$ $\{u_i\}_{i=r+1}^m$

The fundamental theorem of linear algebra (not the fundamental theorem of algebra):

  1. The kernel is the orthogonal complement of the coimage: $\mathrm{ker}(A)^\perp = \mathrm{im}(A^T)$
  2. The cokernel is the orthogonal complement of the image: $\mathrm{ker}(A^T) = \mathrm{im}(A)^\perp$

Matrix Decomposition/Factorization

Matrix decompositions/factorizations express a matrix as the product of smaller or simpler matrices.

  • QR decomposition ~ Gram–Schmidt orthogonalization process: $A = Q R$, where $A \in M_{m,n}$, $Q \in M_{m,n}$ is orthogonal/unitary, $R \in M_n$ is upper-triangular; $\{ a_1, \cdots, a_n \} \to \{ q_1, \cdots, q_n \}$;
  • LU (LUP) decomposition ~ Gaussian elimination for solving systems of linear equations: $A = L U$ where $L$ is lower-triagular and $U$ is upper-triangular, if $A \in M_n$ and $\mathrm{det}[A]_{1 \dots i} \ne 0, i = 1, \cdots, n$;
    • Cholesky decomposition: $A = L L^∗$ where $L$ is lower-triangular, if $A \succeq 0$;
  • Jordan decomposition ~ eigenvalue problem: $A = P J P^{-1}$ where $A \in M_n$, $J$ is a Jordan matrix, and $P$ is invertible; if $A$ is real with only real eigenvalues, then $P$ can be real;
    • Eigendecomposition (spectral decomposition): $A = P \Lambda P^{-1}$ where $\Lambda$ is diagonal, if $A$ has $n$ linearly independent eigenvectors;
      • $A = U \Lambda U^∗$ where $U$ is orthogonal/unitary, if $A$ is a normal matrix (e.g. orthogonal/unitary, (skew-)symmetric/Hermitian);
  • QZ decomposition ~ generalized eigenvalue problem;
  • Singular value decomposition (SVD) ~ principal components analysis, latent factor analysis: $A = V \Sigma W^∗$, where $A \in M_{m,n}$, $V \in M_m, W \in M_n$ are orthogonal/unitary, and $\Sigma \in M_{m,n}$ is non-negative and has diagonal entries only;

Similarity and the Eigenvalue Problem

Two square matrices $A, B \in M_n$ are similar if they are matrix representations of the same linear operator in different bases: $\exists S \in GL(n): A = S B S^{-1}$. The transformation $B \to S B S^{-1}$ is called a similarity transformation by similarity matrix $S$.

An eigenvector of a square matrix is a vector that remains in its linear span after the linear transformation: $A x = \lambda x$. The eigenvalue $\lambda$ that corresponds to an eigenvector is the equivalent scaling when the eigenvector is applied to the linear transformation. The eigenspace that corresponds to an eigenvalue is the subspace where the linear transformation is equivalent to scaling by the eigenvalue: $\{x \mid A x = \lambda x\}$.

Jordan Canonical Form: Any square matrix is similar to a (complex) Jordan matrix, unique up to permutation of Jordan blocks: $A = P J P^{-1}$. In this norm, $\{ p_{1 + n_i} \}_{i = 0}^{k-1}$ is a basis for the direct sum of all the eigenspaces of $A$, which correspond to eigenvalues $\{\lambda_i\}_{i = 1}^k$ respectively. A family of similar matrices, i.e. an equivalence class under similarity, have the same Jordan canonical form.

A Jordan block $J_k(\lambda_i)$ is a $k \times k$ matrix with $\lambda_i$ on the diagonal and $1$ on the super diagonal: $J_k(\lambda_i) = \lambda_i I_k + U_k$, where $U_k$ is the upper/backward shift matrix $U_k = [0, I_{k-1}; 0, 0]$; specially, $J_1(\lambda_i) = [\lambda_i]$. A Jordan matrix is a direct sum of Jordan blocks, i.e. a block diagonal matrix consisting of Jordan blocks: $J = \mathrm{diag}\{ J_{n_i}(\lambda_i) \}_{i=1}^k$, $\sum_{i=1}^k n_i = n$.

The (algebraic) multiplicity of eigenvalue $\lambda_i$ is the multiplicity of it as a zero of the characteristic polynomial $p_A(t)$, i.e. the number of diagonal entries in the Jordan matrix that equal to $\lambda_i$. The geometric multiplicity of eigenvalue $\lambda_i$ is the maximum number of linearly independent eigenvectors associated with it, i.e. the number of Jordan blocks in the Jordan matrix that correspond to $\lambda_i$.

A square matrix is diagonalizable if it is similar to a diagonal matrix (Jordan blocks all size one). A square matrix is invertible iff its eigenvalues do no include zero. A square matrix is diagonalizable iff its eigenvectors can form a basis.

Location and perturbation of eigenvalues.

Miscellaneous Topics

🏷 Category=Algebra