Perturbation Theory for Linear Operators

This article summarizes results on perturbation theory for linear operators in a finite-dimensional space [@Kato1980, Ch. 2]. The more general perturbation theory in Banach and Hilbert spaces is also treated in the book [@Kato1980, Ch. 7-10]. The proofs are based on writing eigen-projections as contour integrals of the resolvent.

Setup: N-dimensional vector space, X; unperturbed linear operator, T; linear perturbation, x T'; perturbed linear operator, T(x) = T + x T';

Notations: s, number of (distinct) eigenvalues; eigenvalue $\lambda_h$, h = 1, ..., s; algebraic eigenspace (or principal subspace), $M_h$; generalized eigenvector (or principal vector) for $\lambda_h$: any non-zero vector in $M_h$; algebraic multiplicity of $\lambda_h$, $m_h = \dim M_h$; eigen-projections $P_h(x)$, note that $M_h = P_h X$; eigen-nilpotents $D_h(x) = (T(x) - \lambda_h(x)) P_h(x)$; exceptional points, $x_0$, near which the number of eigenvalues is not constant; simple subdomain: simply connected subdomain containing no exceptional point; λ-group, $\lambda_1(x), ..., \lambda_r(x)$, of total multiplicity m: the eigenvalues of T(x) generated by splitting from an eigenvalue λ of T(0) of multiplicity m; total projection for a λ-group, $P(x) = P_1(x) + ... + P_r(x)$, the sum of eigenprojections for all the eigenvalues of T(x) inside a small loop around λ; total eigenspace, $M(x) = P(x) X$; cycles, $\{\lambda_1(x), ..., \lambda_p(x)\}, \{\lambda_{p+1}(x), ...\}, ..., \{...\}$

An eigenvalue is simple if it has multiplicity one: $m_h = 1$. A linear operator is simple if all its eigenvalues are simple.

An eigenvalue is semisimple if the associated eigen-nilpotent is zero: $D_h = 0$. A linear operator is diagonalizable, diagonable, nondefective, or semisimple, if it can be written as a direct sum of scalar operators: $T = \sum_h \lambda_h P_h$. A linear operator is semisimple if and only if all its eigenvalues are semisimple.
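As a numerical illustration of these definitions, the sketch below approximates an eigenprojection by the contour integral $P_h = -\frac{1}{2\pi i} \oint (T - \zeta)^{-1} d\zeta$ over a small circle around $\lambda_h$, and then forms the eigen-nilpotent $D_h = (T - \lambda_h) P_h$. The helper name `riesz_projection`, the trapezoidal quadrature, and the 3×3 test matrix are assumptions made for illustration; they are not taken from [@Kato1980].

```python
import numpy as np

def riesz_projection(T, center, radius, n_nodes=200):
    """Approximate P = -(1/(2*pi*i)) * contour integral of (T - z)^{-1} dz
    over the circle |z - center| = radius, using the trapezoidal rule."""
    n = T.shape[0]
    identity = np.eye(n)
    total = np.zeros((n, n), dtype=complex)
    for k in range(n_nodes):
        phase = np.exp(2j * np.pi * k / n_nodes)
        z = center + radius * phase                       # quadrature node on the circle
        dz = 1j * radius * phase * (2 * np.pi / n_nodes)  # dz = i r e^{i theta} d(theta)
        total += np.linalg.inv(T - z * identity) * dz
    return -total / (2j * np.pi)

# Hypothetical example: eigenvalue 2 carries a Jordan block, eigenvalue 5 is simple.
T = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

P2 = riesz_projection(T, center=2.0, radius=1.0)
P5 = riesz_projection(T, center=5.0, radius=1.0)
D2 = (T - 2.0 * np.eye(3)) @ P2   # eigen-nilpotent for eigenvalue 2
D5 = (T - 5.0 * np.eye(3)) @ P5   # eigen-nilpotent for eigenvalue 5

# The algebraic multiplicity is the rank of the projection, which equals its trace.
m2 = int(round(np.trace(P2).real))
print(m2, np.linalg.norm(D2), np.linalg.norm(D5))
```

In this example $D_2 \neq 0$, so the eigenvalue 2 is not semisimple and T is not diagonable, while the eigenvalue 5 is simple and therefore semisimple.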

Analytic perturbation of eigenvalues

Setup: analytic/holomorphic perturbation T'(x), T'(0) = 0, on a domain $D_0$ in the complex x-plane; T(x) = T + T'(x), holomorphic operator-valued function;

Main question in analytic perturbation theory: whether eigenvalues and the eigenvectors of T(x) can be expressed as power series in x, that is, whether they are holomorphic functions of x in the neighborhood of x = 0. If this is the case, the change of the eigenvalues and eigenvectors will be of the same order of magnitude as the perturbation x T' itself for small |x|.

Perturbation series: operator, $T(x) = \sum_{n \in \mathbb{N}} x^n T^{(n)}$, $T^{(0)} = T$; total projection for a λ-group, $P(x) = \sum_{n \in \mathbb{N}} x^n P^{(n)}$, $P^{(0)} = P$;

Summary of qualitative results [@Kato1980, Sec 2.1.8]: number of exceptional points, finite in each compact subset of $D_0$; eigenvalues $\lambda_h(x)$, holomorphic in each simple subdomain, continuous in $D_0$ with only algebraic singularities; eigen-projections $P_h(x)$, holomorphic in each simple subdomain with only algebraic singularities, share their branch points (with the same order) with $\lambda_h(x)$, and always have a pole at a branch point; total projection P(x) for a λ-group, holomorphic at x = 0, total multiplicity equals m; cycle, elements permuted cyclically after analytic continuation along a small circle around x = 0, eigenprojection is single-valued at x = 0 but need not be holomorphic;
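The cyclic permutation within a cycle can be observed numerically. The sketch below uses a simple 2×2 family with eigenvalues $\pm\sqrt{x}$, so x = 0 is an exceptional point and the two eigenvalues form one cycle of period 2. The family, the function names, and the nearest-neighbour tracking are illustrative assumptions, not a construction from the source.

```python
import numpy as np

def continue_eigenvalue(T_of_x, path, start):
    """Follow one eigenvalue along `path` by always picking the eigenvalue
    of T(x) closest to the previously selected one."""
    lam = start
    for x in path:
        eigenvalues = np.linalg.eigvals(T_of_x(x))
        lam = eigenvalues[np.argmin(np.abs(eigenvalues - lam))]
    return lam

def T_of_x(x):
    # Hypothetical family with eigenvalues +sqrt(x) and -sqrt(x).
    return np.array([[0.0, 1.0], [x, 0.0]], dtype=complex)

radius = 0.01
angles = np.linspace(0.0, 2.0 * np.pi, 401)
loop = radius * np.exp(1j * angles)   # small circle around the exceptional point x = 0

start = np.sqrt(radius)               # the eigenvalue +0.1 at x = radius
after_one_loop = continue_eigenvalue(T_of_x, loop, start)
after_two_loops = continue_eigenvalue(T_of_x, loop, after_one_loop)
print(start, after_one_loop, after_two_loops)
```

One circuit carries the tracked eigenvalue to the other member of the cycle, and a second circuit brings it back, which is the behaviour of a branch point of order 1.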

Every semisimple eigenvalue varies as several Taylor series at x = 0; such eigenvalues are $C^1$ at x = 0, and their total projections are holomorphic at x = 0 (Thm 2.3).
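For a semisimple eigenvalue λ, the first-order coefficients of the λ-group eigenvalues are the eigenvalues of $P T^{(1)} P$ restricted to $M = P X$. The sketch below checks this against a finite difference. The matrices `T` and `T1` are hypothetical, and the example is restricted to the real symmetric case so that `numpy.linalg.eigvalsh` returns sorted real eigenvalues.

```python
import numpy as np

# Hypothetical symmetric example: eigenvalue 1 of T is semisimple with multiplicity 2.
T = np.diag([1.0, 1.0, 3.0])
T1 = np.array([[0.0, 1.0, 2.0],
               [1.0, 0.0, 1.0],
               [2.0, 1.0, 1.0]])

# Orthonormal basis of the eigenspace M = P X for lambda = 1 (first two coordinates).
basis = np.eye(3)[:, :2]

# First-order coefficients: eigenvalues of P T1 P restricted to M.
slopes = np.linalg.eigvalsh(basis.T @ T1 @ basis)

# Finite-difference check on the two eigenvalues of T(x) = T + x T1 that stay near 1.
h = 1e-6
group = np.linalg.eigvalsh(T + h * T1)[:2]
fd_slopes = (group - 1.0) / h
print(slopes, fd_slopes)
```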

Linear perturbation gives linear eigenvalues (Thm 2.6).

Convergence radius of power series (Thm 3.2); perturbation of a normal operator on a unitary space (Thm 3.9).

Continuous perturbations

Continuous perturbation gives continuous eigenvalues and continuous total projections for all λ-groups (Thm 5.1). This result on eigenvalues can be extended to perturbation in two or more variables (Sec 5.7).

The unordered N-tuple $\mathfrak{S} = (\lambda_n)_{1 \le n \le N}$ consisting of the repeated eigenvalues of T(x) changes with x continuously w.r.t. the metric $d(\mathfrak{S}, \mathfrak{S}') = \min_{\pi \in S_N} \max_{1 \le n \le N} |\lambda_n - \lambda'_{\pi(n)}|$, where $S_N$ is the group of permutations of N elements. In general, it is impossible to define a parametrization, i.e. N single-valued continuous functions $\lambda_n(x)$ that represent the repeated eigenvalues of T(x). If the eigenvalues are always real, the eigenvalues listed in ascending order form a parametrization. If x varies over an interval of the real line, a parametrization also exists (Thm 5.2).
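The metric on unordered N-tuples can be evaluated directly from its definition for small N. The following is a brute-force sketch over all permutations; the function name `tuple_distance` is an assumption, and the factorial cost makes it suitable only for illustration.

```python
from itertools import permutations

def tuple_distance(a, b):
    """Distance between two unordered N-tuples of complex numbers:
    min over permutations of the largest pairwise difference."""
    if len(a) != len(b):
        raise ValueError("tuples must have the same length")
    indices = range(len(b))
    return min(
        max(abs(a[n] - b[perm[n]]) for n in indices)
        for perm in permutations(indices)
    )

# The order of the entries does not matter.
same = tuple_distance([1, 2, 3], [3, 1, 2])

# The best matching pairs 0 with 0.1 and 1 with 1.5.
shifted = tuple_distance([0.0, 1.0], [1.5, 0.1])
print(same, shifted)
```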

If T(x) has N distinct eigenvalues $\lambda_h(x)$ in a simply connected domain of the complex plane or in an interval of the real line, the associated eigenprojections $P_h(x)$ are continuous. In general, $P_h(x)$ cannot be continued beyond a value x where $\lambda_h(x)$ coincides with some other $\lambda_k(x)$.

$\mathfrak{S}$ is a continuous function of T (Thm 5.14), and is partially differentiable at $T = T_0$ if and only if $T_0$ is diagonable (Thm 5.15). If $T_0$ is diagonable and has N distinct eigenvalues, then the eigenvalues of T in a neighborhood of $T_0$ can be expressed by N holomorphic functions (Thm 5.16).

Differentiable perturbations

Differentiable perturbation gives differentiable total projections for all λ-groups, differentiable λ-group eigenvalues for semisimple eigenvalues, and differentiable eigenvalues for a diagonable unperturbed operator (as an unordered N-tuple $\mathfrak{S}(x)$) (Thm 5.4). This result cannot be extended to total differentiability in two or more variables (Sec 5.7).

$P(x) = P + x P^{(1)} + o(x)$, where $P^{(1)} = - P T^{(1)} S - S T^{(1)} P$ (eq. 2.14), $T^{(1)}$ is the linearized perturbation (linear coefficient; $T(x) = T + x T^{(1)} + o(x)$) and S is the reduced resolvent of T for λ (eq. I-(5.27)), which is the inverse of T - λ in M' = (1 - P) X, that is, (T - λ) S = S (T - λ) = 1 - P and S P = P S = 0.
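The first-order formula for the total projection can be checked numerically. The sketch below builds P and S for a hypothetical real symmetric T, evaluates $P^{(1)} = - P T^{(1)} S - S T^{(1)} P$, and compares it with a finite difference of the total projection of $T + h T^{(1)}$. The helper names, the example matrices, and the restriction to symmetric matrices (so that `numpy.linalg.eigh` applies) are assumptions made for illustration.

```python
import numpy as np

def total_projection(A, lam, radius):
    """Orthogonal projection onto the eigenvectors of the symmetric matrix A
    whose eigenvalues lie within `radius` of `lam`."""
    w, V = np.linalg.eigh(A)
    inside = np.abs(w - lam) < radius
    return V[:, inside] @ V[:, inside].T

def reduced_resolvent(A, lam, radius):
    """Inverse of A - lam on (1 - P) X, extended by zero on P X
    (valid here because A is symmetric)."""
    w, V = np.linalg.eigh(A)
    outside = np.abs(w - lam) >= radius
    # Divide each outside eigenvector (column) by its shifted eigenvalue.
    return (V[:, outside] / (w[outside] - lam)) @ V[:, outside].T

# Hypothetical data: lambda = 1 is a double eigenvalue of T.
T = np.diag([1.0, 1.0, 3.0])
T1 = np.array([[0.0, 1.0, 2.0],
               [1.0, 0.0, 1.0],
               [2.0, 1.0, 1.0]])
lam, radius = 1.0, 0.5

P = total_projection(T, lam, radius)
S = reduced_resolvent(T, lam, radius)
P1 = -P @ T1 @ S - S @ T1 @ P

# Finite-difference approximation of dP/dx at x = 0.
h = 1e-6
fd_P1 = (total_projection(T + h * T1, lam, radius) - P) / h
print(np.max(np.abs(fd_P1 - P1)))
```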

If an unordered N-tuple $\mathfrak{S}(x)$ of complex numbers is differentiable in a real interval I, then it can be represented by N single-valued differentiable functions $\mu_n(x)$ in I (Thm 5.6). If the derivative $\mathfrak{S}'(x)$ is continuous, then $\mu_n \in C^1(I, \mathbb{C})$ (Thm 5.7).

If T(x) is differentiable and diagonable on I, then its eigenvalues $\mathfrak{S}(x)$ are differentiable on I. This is not true in the $C^1$ case (Remark 5.8). If T(x) is $C^1$ in a neighborhood of 0, then the total projection for the λ-group of a semisimple eigenvalue is $C^1$ (Remark 5.10).

The troubles that arose about the differentiability of the eigenvalues and eigenvectors of T(x) are solely due to the possibility that the number s(x) of distinct eigenvalues be non-constant. If s(x) is assumed to be constant, all the difficulties disappear and eigenvalues and eigenprojections behave as smoothly as the operator T(x) itself. Similar results hold when T(x) is smooth [@Nomizu1973] or analytic where x is a set of several real or complex variables. (Ch. II Supplementary notes 3, p. 568)

Perturbation of symmetric operators

Symmetric perturbation: $T(x)^* = T(\bar{x})$

Symmetric holomorphic perturbation (in one variable) gives holomorphic eigenvalues and eigenprojections; the eigennilpotents vanish identically (Thm 6.1). This result cannot be extended to two or more variables (Remark 6.3). There exists an orthonormal basis consisting of eigenvectors that are holomorphic (Sec 6.2). The analyticity is essential.
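The failure in two variables can be seen on a standard 2×2 example: the symmetric family with rows (x, y) and (y, -x) has eigenvalues $\pm\sqrt{x^2 + y^2}$, which are continuous but not totally differentiable at the origin. The sketch below compares one-sided directional slopes of the largest eigenvalue; the function names are assumptions made for illustration.

```python
import numpy as np

def T_of(x, y):
    # Real symmetric family depending analytically on two parameters.
    return np.array([[x, y], [y, -x]])

def top_eigenvalue(x, y):
    return np.linalg.eigvalsh(T_of(x, y))[-1]

def directional_slope(dx, dy, h=1e-6):
    """One-sided difference quotient of the largest eigenvalue at the origin
    in the direction (dx, dy)."""
    return (top_eigenvalue(h * dx, h * dy) - top_eigenvalue(0.0, 0.0)) / h

slope_x = directional_slope(1.0, 0.0)
slope_y = directional_slope(0.0, 1.0)
slope_diag = directional_slope(1.0, 1.0)

# A totally differentiable function would satisfy slope_diag == slope_x + slope_y.
print(slope_x, slope_y, slope_diag)
```

Along any fixed line through the origin the two eigenvalues can be relabelled as holomorphic functions of the single line parameter, which is consistent with the one-variable result; the obstruction appears only when both parameters vary.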

The reduction process preserves symmetry, and under symmetric perturbation, it gives a complete recipe for calculating explicitly the eigenvalues and eigenprojections (Remark 6.4).

If T(x) is a symmetric operator on a unitary space H that depends on x in a $C^1$ manner on a real interval I, then the repeated eigenvalues of T(x) can be represented by N functions $\lambda_n \in C^1(I, \mathbb{R})$ (Thm 6.8).

The unordered N-tuple of repeated eigenvalues $\mathfrak{S}(T)$ as a function of a symmetric operator is partially $C^1$, and is holomorphic where T has N distinct eigenvalues (Sec 6.4).

References

Franz Rellich developed the one-parameter analytic perturbation theory of linear operators.

F. Rellich. Störungstheorie der Spektralzerlegung (Perturbation theory of spectral decomposition), I-V. Mathematische Annalen. 1937-1942.
F. Rellich. Perturbation theory of eigenvalue problems. Lecture Notes, New York Univ. 1953.

For a general study of finite changes of eigenvalues and eigenvectors, see Davis and Kahan's SINUM paper [@Davis1970].

[@Kato1980] Tosio Kato, 1980. Perturbation Theory for Linear Operators. Springer. 2nd edition.

Perturbation theory of linear operators typically deals with only one parameter, because analytic perturbation in multiple parameters may give eigenvalues that are only continuous (see e.g. [@Kato1980, II-5.7 Ex 5.12]). But when the number of eigenvalues (in a cluster) does not change, the eigenvalues (TODO: needs qualification) and eigenprojections vary as smoothly as the multi-parameter perturbation: analytic, quasi-analytic, or Nash versions using blowings-up [@Parusinski2020]; analytic versions of real symmetric generalized EVD and complex EVD [@SunJG1990, Thm 3.2] and complex two-sided EVD [@ChuKW1990, Sec. 4.1]; smooth version, symmetric [@Nomizu1973]; $C^k$ version? differentiable version?

SunJG1990

A survey of the sensitivity analysis of multiple eigenvalues and the associated invariant subspaces of two kinds of matrix eigenvalue problems, in which the matrices are analytically dependent on several parameters: (1) symmetric eigenvalue problem (SEP); (2) general eigenvalue problem (GEP) (for semisimple multiple eigenvalues only).

Symmetric eigenvalue problem (SEP): generalized eigenvalue problem of real symmetric matrices (one positive definite), analytically varying with N parameters.

General eigenvalue problem (GEP): eigenvalue problem of a complex matrix, analytically varying with N complex parameters.

Thm 3.1: For any λ-group, the eigenvalues can be written as continuous real-valued functions that have directional derivatives in any direction. Gives an expression for the directional derivatives of each eigenvalue, with a direction-dependent permutation.

Thm 3.2: For any λ-group, there exists a real basis of its eigenspace that is analytic. Gives an expression for the partial derivatives of this basis (eq 3.3).

Remark 3.3: The individual eigenvectors in a λ-group are generally not differentiable.

NOTE: no mention of weakening the analyticity assumption.

ChuKW1990

Problem: Right and left eigenspaces of a complex square matrix, depending on N complex parameters.

Sec. 4.1: For any λ-group, there exists a pair of complex bases of its right and left eigenspaces that are complex analytic. The reduced operators ($A_1$ and $\hat{A}_1$) are also complex analytic. Gives expressions for the bases and the reduced operators (eq 35-39).

The main results in this paper apply to clusters of nonmultiple eigenvalues as well.

The eigensystems considered in [@SunJG1990] are assumed to be multiple but nondefective, which is not as general as in this paper.

Sec 5: Computation of derivatives.