Probabilistic learning on manifold.
A probability distribution concentrated on a subset of the Euclidean space, which is viewed as a manifold.
A synthesis of methods:
Procedure:
, a matrix of n
attributes by N
observations;[η] = [μ]^(−1/2) [φ]^T [x_0]
,
where μ
are the v
positive eigenvalues of covariance matrix [c] = 1/(N-1) [x_0] [x_0]^T
,
and φ
the corresponding orthonormal eigenvectors;p_H(η) = 1/n \sum_i π_{s'}(η - s'/s η^i)
,
where π
is the v
-dimensional Gaussian kernel,
s = (4 / ((v + 2)N)^{1 / (v + 4)}
is the optimal Silverman bandwidth,
and s' = s / \sqrt{s^2 + N / (N − 1)}
;[η] = [z] [g]^T
,
where [z] = [η] [a]
, [a] = [g] ([g]^T [g])^{-1}
(so that [z] [g]^T = [η] P_{[g]}
),
{g} = [P]^κ {ψ}
is the "diffusion maps basis",
[P] = \diag([K] 1)^{−1} [K]
is a transition matrix with right eigenvectors {ψ}
,
[K]_{ij} = k_ε(η^i, η^j)
are transition likelihood,
k_ε(x, y)
is a kernel (symmetric, non-negative function) with a smoothing parameter ε
,
such as the Gaussian kernel k_ε(x, y) = \exp(− (x − y)^2 / (4ε))
,
κ
is the analysis scale of the local geometric structure of the dataset;
if only the first m
diffusion maps basis vectors [g](m)
are retained,
then [η](m) = [η] P_{[g](m)}
is a reduced-order representation that projects [η]
onto [g](m)
;[η^l] = [Z(lρ)] [g](m)^T
, l = 1, 2, ...
and ρ = M_0 Δt
,
where m
satisfies mean-square convergence criterion \|[c](m) - [c]\|_F < ε_0 \|[c]\|_F
given ε_0
;
[Z]
satisfies the reduced-order ISDE (so that [Z] [g](m)^T
admits p_{[H](m)}(η)
):
d[Z] = [Y] dr; d[Y] = [L]([Z] [g](m)^T) [a](m) dr − f_0/2 [Y] dr + \sqrt{f_0} d[W] [a](m)
,
with initial condition [Z](0) = [H] [a](m); [Y](0) = [N] [a](m)
;
Δt = 2 \pi s' / {Fac}
is the sampling step of the integration scheme (over-sampled if Fac > 1
),
M_0
is a multiplier such that ρ \gg 4 / f_0
, the relaxation time of the dynamical system;The Störmer–Verlet discretization scheme preserves energy for non-dissipative Hamiltonian dynamical systems:
[Z_{l+1/2}] = [Z_l] + Δt/2 [Y_l]
,
[Y_{l+1}] = (1-b)/(1+b) [Y_l] + Δt/(1+b) [L_{l+1/2}] [a](m) + \sqrt{f_0}/(1+b) [ΔW_{l+1}] [a](m)
,
[Z_{l+1}] = [Z_{l+1/2}] + Δt/2 [Y_{l+1}]
,
where [L_{l+1/2}] = [L]([Z_{l+1/2}] [g](m)^T)
and b = f_0 Δt/4
.
Markov stochastic process of a nonlinear second-order dynamical system (dissipative Hamiltonian system)
d[U] = [V] dr; d[V] = [L]([U]) dr − f_0/2 [V] dr + \sqrt{f_0} d[W]
,
with initial condition [U](0) = [H]; [V](0) = [N]
,
where [L]([u]) = ( -∇ν(u^j) )_j
and ν(u) = - \text{LogSumExp}\{ - (u - s'/s η^i)^2 / (2 s'^2) \}
,
[W]
are N
independent v
-dimensional normalized Wiener process (increments are standard Gaussian),
[N]
are N
independent v
-dimensional normalized Gaussian vector,
[H]
is a random matrix with a realization [η]
,
f_0
is a dissipation parameter such that the transient response of the ISDE are rapidly killed.
The ISDE has a unique invariant measure and a unique solution
that is a second-order diffusion stochastic process, which is stationary and ergodic,
and such that the probability distribution of random matrix [U]
is p_{[H]}(η)
;
Parameters: (ε, κ, m, f_0, Δt, M_0)
(or replace m
with ε_0
);
(TODO: for discrete distributions)