Introduction to numerical analysis
Numerical error: truncation, round off, error propagation,
Numerical linear algebra (NLA).
Classical methods:
Relative condition number for a function $f: X \mapsto Y$ between two normed spaces is its maximum relative sensitivity at a point [@Rice1966]: $\text{cond}(f): X \mapsto \mathbb{R}_{\ge 0}$, $\text{cond}(f, x) = \lim_{\epsilon \to 0+} \sup_{\|\Delta x\| \le \epsilon} \frac{\|f(x + \Delta x) - f(x)\| / \|f(x)\|}{\|\Delta x\| / \|x\|}$; it equals the induced norm of the Fréchet derivative (e.g. Jacobian matrix for vector-valued multivariate functions), normalized by input and output sizes: $\text{cond}(f, x) = \frac{\|Df(x)\|}{\|f(x)\| / \|x\|}$. A function is well-conditioned if the condition number is small and ill-conditioned if the condition number is large. For a range of functions including matrix inverse, matrix eigenvalues, and a root of a polynomial, the condition number is the reciprocal of the relative distance to the nearest singular problem. Absolute condition number is the maximum sensitivity of a function at a point, and is related to relative condition number by $\text{cond}_\text{abs}(f, x) = \text{cond}_\text{rel}(f, x) \cdot \|f(x)\| / \|x\|$.
The relative condition number for matrix inversion $\text{inv}: \text{GL}_n \mapsto \text{GL}_n$ w.r.t. a norm is denoted as $\kappa(A) := \text{cond}(\text{inv}, A)$. Properties: (1) scale-invariant: $\forall c \ne 0$, $\kappa(c A) = \kappa(A)$, which follows from the homogeneity of vector norms. (2) with a matrix norm, it is lower bounded by that of the identity matrix: $\kappa(A) \ge \kappa(I)$, which follows from the sub-multiplicativity of matrix norm. With the mixed subordinate norm, it equals the product of the {p,q}-norm of the matrix and the {q,p}-norm of its inverse: $\kappa_{p,q}(A) = \|A\|_{p,q} \cdot \|A^{-1}\|_{q,p}$. Properties: it is the reciprocal of the relative distance to the nearest singular matrix: $\kappa_{p,q}(A) = \text{dist}_{p,q}(A)^{-1}$. Spectral condition number of an invertible matrix is the condition number for matrix inversion w.r.t. the spectral norm: $\kappa_2(A) = \|A\|_2 \cdot \|A^{-1}\|_2$; it equals the ratio of the largest and the smallest singular value of the matrix: $\kappa_2(A) = \sigma_1(A) / \sigma_n(A)$.
Krylov [kri-lov] subspace methods: (symmetric Lanczos process + LDLt factorization + clever recursions)
Krylov subspace methods are successful only if they can be effectively preconditioned.
Notes: 用高斯消元法求解时,比较计算后和计算前的主对角元,比值越小、解的精度就少多少(?) (Using double precision floating point number) 系数矩阵条件数在百亿量级(1e10)时, 解的精度大概只有1e-3或1e-4。——陈璞
Efficient updates in NLA:
Merit function for solving nonlinear equations is a scalar-valued function that indicates whether a new iterate is better or worse than the current iterate, in the sense of making progress toward a root. Merit functions are often obtained by combining the components of the vector field in some way, e.g. the sum of squares; all of which have some drawbacks.
Line search and trust-region techniques play an equally important role in optimization, but one can argue that trust-region algorithms have certain theoretical advantages in solving nonlinear equations.
Types of function approximation:
Least squares solution of over-determined systems of equations. [@Golub2013, Ch. 5]
QR decomposition (for an m-by-n matrix A):
Householder matrix, Householder reflection, or Householder transformation is a matrix of the form: $P = I - \beta v v^T$, where Householder vector $v$ is non-zero and $\beta = 2 / |v|^2$. When applied to a vector, $P x$ is the reflection of $x$ in the hyperplane orthogonal to $v$.
QR decomposition with pivoting $A \Pi = Q [R; 0]$
Numerical rank of a matrix given a threshold $\tau$ is its smallest rank under perturbations of spectral norm no greater than the threshold: $k_\tau = \min_{|E|_2 \le \tau} \text{rank}(A+E)$. The numerical rank equals the number of singular values greater than the threshold.
[@Hogben2013, Sections 52.9] Rank revealing decomposition is a two-sided orthogonal decomposition of the form: $A = U [R; 0] V^T$, where $U \in O(m)$, $V \in O(n)$, $R$ is upper triangular, and small singular values of A are revealed by correspondingly small diagonal entries of $R$. Examples of rank revealing decompositions include RRQR, and UTV (URV or ULV) decompositions. Rank revealing QR (RRQR) decomposition is a pivoted QR decomposition such that for all leading principal submatrices $R_i$, the following interlacing inequalities hold: $c_i^{-1} \sigma_i(A) \le \sigma_i(R_i) \le \sigma_i(A) \le \sigma_1(R_i) \le c_i \sigma_i(A)$, where $c_i^2 = i (n - i) + \min(i, n-i)$. RRQR factorization is not unique. URV decomposition is a rank revealing decomposition $A = U [R; 0] V^T$ such that for all $k \in {1, \cdots, n}$ and for all $i \le k < j$, leading principal submatrices $R_i$ and trailing principal submatrices $R_{-k}$ (rows and columns from $k+1$ to $n$) satisfy the following sandwich inequalities: $\breve{c}_k \sigma_i(A) \le \sigma_i(R_i) \le \sigma_i(A)$ and $\sigma_j(A) \le \sigma_{j-k}(R_{-k}) \le \breve{c}_k^{-1} \sigma_j(A)$, where $\breve{c}_k^2 = 1 - \sigma_1^2(R_{-k}) / (\sigma_k^2(R_k) - \sigma_1^2(R_{-k}))$. ULV decomposition is similar to URV, but with a lower triangular matrix.