Loading [MathJax]/jax/element/mml/optable/SpacingModLetters.js

This article is about Diffusion processes and harmonic analysis on manifolds, in continuous and finite versions.

Geometry of a set Γ is a set of rules that describe the relationship between its elements. Intrinsic geometry (本征几何) is one that does not refer to a superset or the structures therein. Extrinsic geometry (表征几何) is one that is induced from the geometry of a superset. The geometry of a set can be understood through the geometry of the space of functions on the set (and spaces of functionals and operators). This is a dual perspective in functional analysis, from inverse problems such as inverse scattering, potential theory, and spectral geometry. [@Lafon2004] uses the language of functional analysis.

Symbols

  • Γ: a set, either finite (graph) or continuous (manifold).
  • dx: the Lebesgue measure (Rn); the Riemannian volume (Riemannian manifold).
  • (Γ,dμ): a measure space with point set Γ and measure dμ.
  • f, F: a function (restricted to) a manifold Γ; a function (extended to) the ambient space Rn;
  • L2(Γ,dμ): the space of square integrable functions on Γ. fL2(Γ,dμ)Γ|f(x)|2 dμ(x)<.
  • Hs(Γ,dμ): a Sobolev space of functions on Γ. fHs(Γ,dμ)|r|s,rfL2(Γ,dμ). Here |r|=iri,riN.
  • f,gΓ: inner product of functions in L2(Γ,dμ). f,gΓ=Γ¯f(x)g(x) dμ(x).
  • fs: norm of a function in Hs(Γ,dμ). f2s=|r|sΓ|rf(x)|2 dμ(x).
  • k(x,y), pt(x,y): an integral kernel; the Neumann heat kernel on a manifold.
  • v2(x): integral transform of 1 with kernel k(x,y).
  • ˜a(x,y): a Markov kernel, transition probability of a diffusion process.
  • a(x,y): a diffusion kernel, the symmetric conjugate to ˜a by v(x).
  • K, ˜A, A: an integral operator; an averaging operator; a diffusion operator.
  • Δ: the Laplacian (Rn); the Laplace-Beltrami operator (Riemannian manifold). Signs are defined so that Δ is positive semi-definite.
  • {ϕi}, {ν2i}: eigenfunctions and eigenvalues of Δ, i.e. Δϕi=ν2iϕi.
  • Φ, Φk: diffusion map; diffusion map truncated to dimension k.

Diffusion Process on Manifolds

Integral kernel k(x,y) defines an integral transform or integral operator K for functions on measure space (Γ,dμ), such that Kf(x)=Γk(x,y)f(y) dμ(y). Here we call an integral kernel admissible, if it is symmetric, positivity-preserving, and positive semi-definite: k(x,y)=k(y,x); k0; f,Kf0. Admissible kernels can be designed to represent a local metric (e.g. degree of similarity) or exhibit a local behavior.

Markov kernel or stochastic kernel ˜a(x,y) defines a Markov process (an averaging operator) ˜A, where ˜a(x,) is a probability measure. Every positivity-preserving integral kernel can be transformed into a Markov kernel ˜a(x,y)=k(x,y)/v2(x), where v2(x)=K1. Assume the Markov kernel has eigendecomposition ˜a(x,y)=iNλiψi(x)ˆψi(y). The right eigenvectors {ψi} (and left eigenvectors {ˆψi}) need not be orthogonal.

Diffusion kernel a(x,y) is the symmetric conjugate of a Markov kernel ˜a by v, a(x,y)=v(x)˜a(x,y)/v(y). Diffusion operator A with kernel a is positive semi-definite, bounded, with the largest eigenvalue 1 and a corresponding eigenfunction v(x). Assume that A is compact (not just bounded), then its spectrum is discrete and its eigendecomposition can be written as a(x,y)=iNλiϕi(x)ϕi(y), where Aϕi(x)=λiϕi(x). The diffusion operator A and the averaging operator ˜A have the same spectrum λ, and the eigenfunctions of A and the right/left eigenfunctions of ˜A are related by conjugation by v: ψi=diag(v1)ϕi, ˆψi=diag(v)ϕi. t-step diffusion operator At aggregates the local information, and its kernel, the t-step diffusion kernel a(t), has eigendecomposition a(t)(x,y)=iNλtiϕi(x)ϕi(y).

Given Γ is a compact smooth submanifold of dimension d in Rn, i.e. a Riemannian manifold with Riemannian metric induced from the ambient space Rn. Let μ be a probability measure on Γ and dx the Riemannian measure on Γ, then probability density p(x) is given by p(x)=dμ(x)/dx. Let {ei}di=1 be an orthonormal basis of the tangent plane Tx to Γ at x, and the exponential map expx of the coordinates of the tangent plane forms a chart on Γ around x, which provides normal coordinates (si)di=1. Laplace-Beltrami operator Δ on the Riemannian manifold can be written as Δf=di=12f/s2i, where fC(Γ) is a smooth function on the manifold. Neumann heat operator on the manifold (heat diffusion on the manifold with the Neumann boundary condition) is etΔ, which corresponds to the Neumann heat kernel pt(x,y) when seen as an integral operator.

Heat Kernels on Manifolds

For the Euclidean n-space, the heat kernel is the Gaussian probability density function with scalar variance twice the diffusion time: ˉpt(d)=(4πt)n/2ed2/(4t), where distance d=|xy|; in short, ˉpt=ϕ(0,2tIn), where ϕ is the PDF of a Gaussian random vector.

For the n-sphere, the heat kernel is a function of the spherical distance θ=arccosx,y. It does not have a closed form, but can be written in series approximations. The heat kernel can be bounded as [@Nowak2019]: given T>0, θ[0,π], t(0,T], p˚, where the constants depend only on n and T; in short, \mathring{p}_t(\theta) \asymp \mathcal{O}\left((t+ \pi - \theta)^{-(n-1)/2} \bar{p}_t (\theta) \right). Note that \asymp denotes order equivalence.

For the hyperbolic n-space, the heat kernel is [@Davies1988]: (1) if n = 2m + 1, \breve p_t (d) = \frac{(-1)^m e^{-(n-1)^2 t / 4}}{\sqrt{(2\pi)^{n} (2 t)}} \left(\frac{1}{\sinh d} \frac{\partial}{\partial d}\right)^m e^{- d^2 / (4t)}; in short, \breve p_t (d) = (-2 t)^{m} e^{-m^2 t} \left(\frac{1}{\sinh d} \frac{\partial}{\partial d}\right)^m \bar p_t (d); (2) if n = 2m + 2, \breve p_t (d) = \frac{(-1)^m e^{-(n-1)^2 t / 4}}{\sqrt{(2\pi)^{n+1} (4 t^3)}} \left(\frac{1}{\sinh d} \frac{\partial}{\partial d}\right)^m \int_d^\infty \frac{s e^{- s^2 / (4t)}}{\sqrt{\cosh s - \cosh d}} \text{d}s. The heat kernels satisfy the recurrence relation: \breve p_{n+2} = \frac{-e^{-nt}}{2\pi} \left(\frac{1}{\sinh d} \frac{\partial}{\partial d}\right) \breve p_n.

Diffusion Maps

Diffusion processes (or random walks, Markov processes) can be useful for the geometric descriptions of data sets. [@Coifman2006a] approximates (anisotropic) diffusion processes \exp(-t H_\alpha) on submanifolds, where H_\alpha = \Delta - \Delta(p^{1-\alpha}) / p^{1-\alpha} is a symmetric Schrödinger operator and parameter \alpha \in \mathbb{R}, with special cases: \alpha = 1, heat diffusion; \alpha = 1/2, a Fokker–Planck diffusion; \alpha = 0, normalized graph Laplacian with isotropic weights. In comparison, "kernel eigenmaps" are diffusion processes \exp(-t Q_2^{-1} Q_1).

Diffusion distance or diffusion metric D_t(x, y) at time step t is the semi-metric defined as: D_t^2 (x, y) \equiv a^{(t)}(x, x) + a^{(t)}(y, y) - 2 a^{(t)}(x, y), which is equivalent to D_t^2 (x, y) = \sum_{i \in \mathbb{N}} \lambda_i^t [\phi_i(x) - \phi_i(y)]^2 and D_t^2(x,y) = \| a^{(t/2)}(x,\cdot) - a^{(t/2)}(y,\cdot) \|^2. The diffusion distance becomes a metric if the diffusion kernel a is positive definite. Unlike the geodesic distance, the diffusion distance is an average over all paths connecting two points, and thus robust to noise and topological short-circuits.

Diffusion map \Phi: \Gamma \to l^2(\mathbb{N}) is a mapping of the data to a Euclidean space, based on the eigenfunctions of the diffusion kernel: \Phi(x) = \left( \phi_i(x) \right)_{i \in \mathbb{N}}. Diffusion map \Phi defines an embedding of the data that preserves the diffusion distance D_t via a weighted Euclidean metric D_t^2(x,y) = \langle \Phi(x) - \Phi(y), \Lambda^t (\Phi(x) - \Phi(y)) \rangle. A truncated diffusion map \Phi_k: \Gamma \to \mathbb{R}^k with \Phi_k(x) = \left( \phi_i(x) \right)_{i=1}^k preserves the diffusion distance with certain accuracy. The dimension k of the new representation depends on the diffusion process, and is in general greater than the dimension d of the set/manifold.

(Inverse mapping.)

Heat Diffusion

Spectral graph theory analyzes the spectrum (eigenvalues and eigenvectors) of a matrix representing a graph. For example, spectral clustering or spectral partitioning: construct a matrix representation of the graph; eigendecompose the matrix and embed the vertices to a Euclidean space using one or more eigenvectors; group the points based on their coordinates. Laplacian matrix of a graph, aka graph Laplacian, is its degree matrix minus its adjacency matrix, L = D - A, where D = \mathrm{diag}\{d_i\}_{i=1}^n. The Laplacian matrix is a discrete analog of the Laplacian operator Δ in multivariable calculus. The Laplacian matrix is symmetric, so its eigenvectors can form an orthogonal basis; additionally, all eigenvalues are non-negative. Symmetric normalized graph Laplacian is L' = D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2}. Consider rotation-invariant kernels k_\varepsilon (x, y) = h(\|x-y\|^2 / \varepsilon) with scale parameter \varepsilon, where h is smooth and decays exponentially, and \|\cdot\| is the Euclidean metric. As \varepsilon goes to 0, the local geometry specified by k_\varepsilon coincide with that of the manifold. Given an averaging operator \tilde{A}_\varepsilon f(x) = K_\varepsilon f(x) / v_\varepsilon^2 (x) where v_\varepsilon^2 (x) = K_\varepsilon 1, the weighted graph Laplacian operator is \tilde{\Delta}_\varepsilon = (I - \tilde{A}_\varepsilon) / \varepsilon. On certain subspaces E_K of L^2(\Gamma), as \varepsilon goes to 0, the limit operator of \tilde{\Delta}_\varepsilon is H f = c (\Delta f + 2 \langle \nabla p / p , \nabla f \rangle), where c = m_2 / (2 m_0), m_0 = \int_{\mathbb{R}^d} h(\|u\|^2)~\mathrm{d}u, and m_2 = \int_{\mathbb{R}^d} u_i^2 h(\|u\|^2)~\mathrm{d}u. Under a conjugation by the density p, we get p H(g/p) = c (\Delta g - g \nabla p / p), where g = p f. Thus, when the density p is uniform, the weighted graph Laplacian \tilde{\Delta}_\varepsilon converges to (a multiple of) the Laplace-Beltrami operator \Delta on the manifold [@Belkin2003]; but when the density is non-uniform, such reconstruction does not hold.

To separate the distribution p on the manifold from its intrinsic geometry k_\varepsilon, use kernel \tilde{k}_\varepsilon(x, y) = k_\varepsilon(x, y)/(p_\varepsilon(x) p_\varepsilon(y)) where p_\varepsilon(x) = K_\varepsilon 1. This leads to an averaging operator \bar{A}_\varepsilon f(x) = \tilde{K}_\varepsilon f(x) / \tilde{v}_\varepsilon^2 (x) where \tilde{v}_\varepsilon^2 (x) = \tilde{K}_\varepsilon 1. Define a Laplace operator \Delta_\varepsilon = (I - \bar{A}_\varepsilon) / \varepsilon acting on the subspaces E_K of L^2(\Gamma), its limit operator \lim_{\varepsilon \to 0} \Delta_\varepsilon \equiv \Delta_0 = \Delta / c is a multiple of the Laplace-Beltrami operator. The limit operator of \bar{A}_\varepsilon^{t/\varepsilon} is the Neumann heat operator e^{−t \Delta_0}; in other words, Markov kernel \bar{a}_\varepsilon^{(t/\varepsilon)}(x, y) with small \varepsilon, close to a fine scale Gaussian kernel, approximates the (symmetric) Neumann heat kernel p_t (x, y), regardless of knowledge of the boundary. The averaging operator \bar{A}_\varepsilon is compact and its eigendecomposition can be written as \bar{A}_\varepsilon = \sum_{i \in \mathbb{N}} \lambda_{\varepsilon, i} P_{\varepsilon, i}, where P_{\varepsilon, i} is the orthogonal projector on the eigenspace corresponding to eigenvalue \lambda_{\varepsilon, i}. If the eigendecomposition of the Laplace-Beltrami operator is \Delta = \sum_{i \in \mathbb{N}} \nu_i^2 P_i, where P_i is the orthogonal projector on the eigenspace corresponding to eigenvalue \nu_i^2. Then the eigenfunctions and eigenvalues of \bar{A}_\varepsilon^{t/\varepsilon} converge to those of the heat operator e^{-t\Delta}: \lim_{\varepsilon \to 0} \lambda_{\varepsilon, i}^{t/\varepsilon} = e^{-t\nu_i^2}, \lim_{\varepsilon \to 0} P_{\varepsilon, i} = P_i. Let \{\phi_i\}_{i \in \mathbb{N}} be the orthonormal eigenfunctions of the Laplace-Beltrami operator (also of the heat kernel) on the manifold \Gamma, then the eigendecomposition of the heat kernel is p_t(x, y) = \sum_{i \in \mathbb{N}} e^{-t \nu_i^2} \phi_i(x) \phi_i(y). Although \{\phi_i\}_{i \in \mathbb{N}} is usually viewed as a Hilbert basis of L^2(Γ), it also forms a set of coordinates on the submanifold Γ via diffusion maps \Phi, which accurately preserves the heat diffusion distance with dimension k \ge d.

Numerical procedure for approximate Neumann heat diffusion: (Note that skipping steps 3-4 provides the weighted graph Laplacian.)

  1. Choose scale parameter \varepsilon to be (on the order of) the larger of: (1) the mean square distance within the set, \sum_x m(x) / N \min_{y \ne x} \|x - y\|^2, where m(x) is the number of observations identical to x in the data; and (2) the square of the perturbation size \|\eta\|^2, i.e. trace of the covariance matrix of the perturbation. Note that small \varepsilon may result in a disconnected graph.
  2. Construct operator [K_\varepsilon] with entries [K_\varepsilon]_{xy} = k_\varepsilon(x, y), where k_\varepsilon(x, y) = e^{-\|x-y\|^2/\varepsilon} is a Gaussian kernel.
  3. Compute approximate density p_\varepsilon = [K_\varepsilon] \mathbf{1} by substituting true density p(x) with the empirical measure.
  4. Compute density-neutralizing operator [\tilde{K}_\varepsilon] = \mathrm{diag}(p_\varepsilon^{-1}) [K_\varepsilon] \mathrm{diag}(p_\varepsilon^{-1}). It can be implemented as [K_\varepsilon] / [p_\varepsilon p_\varepsilon^T], where / is element-wise division.
  5. Compute normalizing vector \tilde{v}_\varepsilon^2 = [\tilde{K}_\varepsilon] \mathbf{1}. Take a square root to get \tilde{v}_\varepsilon.
  6. Compute diffusion operator [A_\varepsilon] = \mathrm{diag}(\tilde{v}_\varepsilon^{-1}) [\tilde{K}_\varepsilon] \mathrm{diag}(\tilde{v}_\varepsilon^{-1}). It can be implemented as [\tilde{K}_\varepsilon] / [\tilde{v}_\varepsilon \tilde{v}_\varepsilon^T].
  7. Eigendecompose [A_\varepsilon] = \Phi \Lambda_\varepsilon \Phi^T. The Neumann heat operator has approximate spectrum \lambda_\varepsilon and eigenvectors \{\phi_i / \phi_1\}_{i=1}^N. Note that A_\varepsilon \tilde{v}_\varepsilon = \tilde{v}_\varepsilon, so \phi_1 and \tilde{v}_\varepsilon are proportional.

Anisotropic Diffusion

For differential and dynamical systems.

Geometric Harmonics

Diffusion processes are also useful for harmonic analysis of functions on data sets. Geometric harmonics is a set of functions that allows out-of-sample extension of empirical functions on the data set.

Bi-Lipschitz Parametrization

Geometric harmonics provide a simple solution to the relaxed distortion problem, an embedding \Psi that is bi-Lipschitz with a small distortion.

Atlas Computation:

  1. ...

Eigenfunction selection:

  1. ...

Extension and Restriction

Intrinsic geometry, Fourier analysis of the manifold (eigenfunctions of the Laplace-Beltrami operator). Extrinsic geometry, Fourier analysis of the ambient space.

Restriction.

Extension (a version of the Heisenberg uncertainty principle): extend the eigenfunctions to a band-limited function of band \mathcal{O}(\nu_i), localized in a tube of radius \mathcal{O}(1/\nu_i) around the manifold.

(Extension of f = 1 for discription of the manifold?)

Multiscale extension: empirical functions are decomposed into frequency bands, and each band is extended to a certain distance so that it satisfies some version of the Heisenberg principle.

(Intrinsic and extrinsic diffusion.)

Multiscale extension scheme:

  1. ...

🏷 Category=Manifold