Probabilistic learning on manifolds.
Setting: the data follow a probability distribution concentrated on a subset of Euclidean space, which is viewed as a manifold.
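A synthesis of methods: principal component analysis (PCA), kernel density estimation (KDE), diffusion maps, and sampling with a dissipative Hamiltonian stochastic differential equation.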
Procedure:
[x_0], the centered data matrix of n attributes by N observations;
[η] = [μ]^{−1/2} [φ]^T [x_0],
where μ are the v positive eigenvalues of the covariance matrix [c] = 1/(N−1) [x_0] [x_0]^T,
and φ the corresponding orthonormal eigenvectors;
p_H(η) = 1/N \sum_{i=1}^N π_{s'}(η − (s'/s) η^i),
where π_{s'} is the v-dimensional Gaussian kernel with covariance s'^2 [I_v],
s = (4 / ((v + 2) N))^{1 / (v + 4)} is the optimal Silverman bandwidth,
and s' = s / \sqrt{s^2 + (N − 1) / N}, which gives p_H zero mean and identity covariance (the mixture covariance is s'^2 [I_v] + (s'/s)^2 (N − 1)/N [I_v]);
[η] = [z] [g]^T,
where [z] = [η] [a], [a] = [g] ([g]^T [g])^{-1} (so that [z] [g]^T = [η] P_{[g]}),
{g} = [P]^κ {ψ} is the "diffusion maps basis",
[P] = \diag([K] 1)^{−1} [K] is a transition matrix with right eigenvectors {ψ},
[K]_{ij} = k_ε(η^i, η^j) are transition likelihoods,
k_ε(x, y) is a kernel (symmetric, non-negative function) with a smoothing parameter ε,
such as the Gaussian kernel k_ε(x, y) = \exp(− \|x − y\|^2 / (4ε)),
κ is the analysis scale of the local geometric structure of the dataset;
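The construction of the diffusion maps basis can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `diffusion_maps_basis` and its arguments are hypothetical, the Gaussian kernel is assumed, and κ is assumed to be a non-negative integer. It uses the fact that [P] is similar to the symmetric matrix diag([K] 1)^{−1/2} [K] diag([K] 1)^{−1/2}, so a symmetric eigensolver applies, and that [P]^κ {ψ} = λ^κ {ψ} for an eigenpair (λ, {ψ}).

```python
import numpy as np

def diffusion_maps_basis(eta, eps, kappa=1):
    """Sketch: eta is (v, N); returns eigenvalues (N,) and basis g (N, N).

    Columns of g are sorted by decreasing eigenvalue of [P].
    """
    # Pairwise squared distances between the columns eta^i.
    sq = np.sum(eta**2, axis=0)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * eta.T @ eta, 0.0)
    K = np.exp(-d2 / (4.0 * eps))          # Gaussian kernel matrix [K]
    d = K.sum(axis=1)                       # row sums, i.e. [K] 1
    # Symmetric matrix similar to [P] = diag(d)^{-1} [K].
    S = K / np.sqrt(np.outer(d, d))
    lam, phi = np.linalg.eigh(S)
    order = np.argsort(lam)[::-1]
    lam, phi = lam[order], phi[:, order]
    lam = np.clip(lam, 0.0, None)           # remove tiny negative round-off
    psi = phi / np.sqrt(d)[:, None]         # right eigenvectors of [P]
    g = psi * lam**kappa                    # [P]^kappa psi = lam^kappa psi
    return lam, g

# Small demonstration on synthetic data (values are arbitrary).
rng = np.random.default_rng(0)
eta = rng.standard_normal((2, 50))
lam, g = diffusion_maps_basis(eta, eps=1.0, kappa=1)
print(lam[:3])
```

The first eigenvalue is 1 and its basis vector is constant, as expected for a row-stochastic matrix with positive entries.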
if only the first m diffusion maps basis vectors [g](m) are retained,
then [η](m) = [η] P_{[g](m)} is a reduced-order representation that projects [η] onto the span of [g](m);
[η^l] = [Z(lρ)] [g](m)^T, l = 1, 2, ... and ρ = M_0 Δt,
where m satisfies the mean-square convergence criterion \|[c](m) − [c]\|_F < ε_0 \|[c]\|_F for a given tolerance ε_0;
[Z] satisfies the reduced-order ISDE (so that [Z] [g](m)^T admits p_{[H](m)}(η)):
d[Z] = [Y] dr; d[Y] = [L]([Z] [g](m)^T) [a](m) dr − f_0/2 [Y] dr + \sqrt{f_0} d[W] [a](m),
with initial condition [Z](0) = [H] [a](m); [Y](0) = [N] [a](m);
Δt = 2 \pi s' / Fac is the sampling step of the integration scheme (over-sampled if Fac > 1),
M_0 is a multiplier such that ρ \gg 4 / f_0, the relaxation time of the dynamical system;
The Störmer–Verlet discretization scheme, which is symplectic and keeps the energy error bounded for non-dissipative Hamiltonian dynamical systems:
[Z_{l+1/2}] = [Z_l] + Δt/2 [Y_l],
[Y_{l+1}] = (1-b)/(1+b) [Y_l] + Δt/(1+b) [L_{l+1/2}] [a](m) + \sqrt{f_0}/(1+b) [ΔW_{l+1}] [a](m),
[Z_{l+1}] = [Z_{l+1/2}] + Δt/2 [Y_{l+1}],
where [L_{l+1/2}] = [L]([Z_{l+1/2}] [g](m)^T) and b = f_0 Δt/4.
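The update above can be sketched in NumPy as follows. This is an illustration under stated assumptions: the function name `stormer_verlet_step` and the `drift` callable standing in for [L] are hypothetical, and the Wiener increment [ΔW_{l+1}] is drawn as \sqrt{Δt} times a standard Gaussian matrix. The demonstration uses a harmonic drift L(u) = −u with f_0 = 0 and a trivial basis [g] = [I_N], chosen only because the energy is then easy to monitor; it is not the drift of the procedure.

```python
import numpy as np

def stormer_verlet_step(Z, Y, g, a, drift, f0, dt, rng):
    """One step of the scheme. Z, Y: (v, m); g, a: (N, m); drift: (v, N) -> (v, N)."""
    b = f0 * dt / 4.0
    Z_half = Z + 0.5 * dt * Y
    L_half = drift(Z_half @ g.T)                           # [L_{l+1/2}]
    dW = np.sqrt(dt) * rng.standard_normal(L_half.shape)   # Wiener increment
    Y_new = ((1.0 - b) * Y + dt * (L_half @ a)
             + np.sqrt(f0) * (dW @ a)) / (1.0 + b)
    Z_new = Z_half + 0.5 * dt * Y_new
    return Z_new, Y_new

# Demonstration (assumed toy setting): no dissipation, harmonic drift.
rng = np.random.default_rng(0)
v, N = 2, 5
g = np.eye(N)                                # trivial basis, for illustration only
a = g @ np.linalg.inv(g.T @ g)               # [a] = [g] ([g]^T [g])^{-1}
Z = rng.standard_normal((v, N)) @ a
Y = rng.standard_normal((v, N)) @ a
energy0 = 0.5 * (np.sum(Z**2) + np.sum(Y**2))
for _ in range(2000):
    Z, Y = stormer_verlet_step(Z, Y, g, a, lambda U: -U, f0=0.0, dt=0.05, rng=rng)
energy = 0.5 * (np.sum(Z**2) + np.sum(Y**2))
rel_energy_error = abs(energy - energy0) / energy0
print(rel_energy_error)
```

With f_0 > 0 the same function applies unchanged; the damping factor (1 − b)/(1 + b) and the noise term then become active.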
The ISDE (Itô stochastic differential equation) defines a Markov stochastic process of a nonlinear second-order dynamical system (dissipative Hamiltonian system):
d[U] = [V] dr; d[V] = [L]([U]) dr − f_0/2 [V] dr + \sqrt{f_0} d[W],
with initial condition [U](0) = [H]; [V](0) = [N],
where [L]([u]) = ( −∇ν(u^j) )_j and ν(u) = − \text{LogSumExp}_i \{ − \|u − (s'/s) η^i\|^2 / (2 s'^2) \},
[W] consists of N independent v-dimensional normalized Wiener processes (increments over a step Δr are Gaussian with zero mean and covariance Δr [I_v]),
[N] consists of N independent v-dimensional normalized Gaussian vectors,
[H] is a random matrix with a realization [η],
f_0 is a dissipation parameter chosen so that the transient response of the ISDE is rapidly killed.
The ISDE has a unique invariant measure and a unique solution
that is a second-order diffusion stochastic process, which is stationary and ergodic,
and such that the probability distribution of random matrix [U] is p_{[H]}(η);
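The potential ν and the drift [L] of the ISDE can be sketched in NumPy as follows. This is a minimal sketch: the function names `potential` and `drift` are hypothetical, and the bandwidths s and s' are passed in as plain numbers (`s`, `s_hat`) rather than computed. Differentiating the LogSumExp gives −∇ν(u) = (\sum_i w_i (s'/s) η^i − u) / s'^2, where w_i are the softmax weights of the exponents; the demonstration compares this against a central finite difference.

```python
import numpy as np

def _exponents(U, eta, s, s_hat):
    """Exponents -||u^j - (s'/s) eta^i||^2 / (2 s'^2), shape (M, N)."""
    centers = (s_hat / s) * eta                       # (v, N)
    diff = U[:, :, None] - centers[:, None, :]        # (v, M, N)
    return -np.sum(diff**2, axis=0) / (2.0 * s_hat**2), centers

def potential(U, eta, s, s_hat):
    """nu(u^j) for each column u^j of U, computed with a stable LogSumExp."""
    e, _ = _exponents(U, eta, s, s_hat)
    e_max = e.max(axis=1, keepdims=True)
    return -(e_max[:, 0] + np.log(np.exp(e - e_max).sum(axis=1)))

def drift(U, eta, s, s_hat):
    """[L]([u]) = (-grad nu(u^j))_j, shape (v, M)."""
    e, centers = _exponents(U, eta, s, s_hat)
    w = np.exp(e - e.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                 # softmax weights (M, N)
    return (centers @ w.T - U) / s_hat**2

# Finite-difference check on arbitrary synthetic values.
rng = np.random.default_rng(0)
eta = rng.standard_normal((3, 10))
U = rng.standard_normal((3, 4))
s, s_hat, h = 0.6, 0.5, 1e-5
grad = np.empty_like(U)
for k in range(U.shape[0]):
    Up, Um = U.copy(), U.copy()
    Up[k, :] += h
    Um[k, :] -= h
    grad[k, :] = (potential(Up, eta, s, s_hat) - potential(Um, eta, s, s_hat)) / (2 * h)
max_abs_error = np.max(np.abs(drift(U, eta, s, s_hat) + grad))
print(max_abs_error)
```

Subtracting the row maximum before exponentiating avoids underflow when all exponents are large and negative, which is common for small s'.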
Parameters: (ε, κ, m, f_0, Δt, M_0) (or replace m with ε_0);
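Replacing m by ε_0 amounts to searching for the smallest m that meets the convergence criterion. The sketch below is one possible reading and rests on an assumption that should be checked against the intended definition: [c] is taken to be the empirical covariance of [η], and [c](m) that of the projected matrix [η](m) = [η] [a](m) [g](m)^T. The function names `reduction_error` and `choose_m` are hypothetical, and the orthonormal basis in the demonstration is random, used only to exercise the code.

```python
import numpy as np

def reduction_error(eta, g, m):
    """Relative Frobenius error ||c(m) - c||_F / ||c||_F (assumed definition)."""
    N = eta.shape[1]
    gm = g[:, :m]
    # pinv(gm).T equals gm (gm^T gm)^{-1} when gm has full column rank.
    a = np.linalg.pinv(gm).T
    eta_m = (eta @ a) @ gm.T                 # projection of eta onto span(gm)
    c = eta @ eta.T / (N - 1)
    c_m = eta_m @ eta_m.T / (N - 1)
    return np.linalg.norm(c_m - c) / np.linalg.norm(c)

def choose_m(eta, g, eps0):
    """Smallest m with reduction_error < eps0; falls back to all columns."""
    for m in range(1, g.shape[1] + 1):
        if reduction_error(eta, g, m) < eps0:
            return m
    return g.shape[1]

# Demonstration with a random orthonormal basis (illustration only).
rng = np.random.default_rng(0)
eta = rng.standard_normal((2, 40))
eta -= eta.mean(axis=1, keepdims=True)
g, _ = np.linalg.qr(rng.standard_normal((40, 40)))
eps0 = 0.1
m = choose_m(eta, g, eps0)
print(m, reduction_error(eta, g, m))
```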
(TODO: for discrete distributions)