Heavy Tailed Distribution

Tail distribution function:

\[ \overline{F}(x) \equiv \Pr[X>x] = 1 - F(x) \]

Heavy-tailed distributions are probability distributions whose tails are not exponentially bounded:

\[ \forall \lambda>0,\quad \limsup_{x \to \infty} e^{\lambda x} \overline{F}(x) = \infty \]
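
A minimal numerical sketch of this definition, assuming an exponential tail with rate 1 and a Pareto tail with \( \alpha = 2 \), \( x_0 = 1 \) (both chosen only for illustration; the function names are hypothetical). A finite grid can only suggest the limit, but it shows the scaled Pareto tail growing while the scaled exponential tail vanishes for \( \lambda = 0.1 \):

```python
import math


def exponential_tail(x, rate=1.0):
    """Tail Pr[X > x] of an exponential distribution (light-tailed)."""
    return math.exp(-rate * x)


def pareto_tail(x, alpha=2.0, x0=1.0):
    """Tail Pr[X > x] of a Pareto distribution (heavy-tailed)."""
    return 1.0 if x < x0 else (x0 / x) ** alpha


def scaled_tail(tail, x, lam):
    """The quantity e^{lam * x} * tail(x) from the definition."""
    return math.exp(lam * x) * tail(x)


lam = 0.1  # assumed value; the definition requires every lam > 0
for x in (10.0, 100.0, 500.0):
    print(x, scaled_tail(exponential_tail, x, lam), scaled_tail(pareto_tail, x, lam))
```

Because the exponential tail is killed by one particular \( \lambda \), it fails the "for all \( \lambda \)" requirement and is therefore not heavy-tailed.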

There are three important subclasses of heavy-tailed distributions:

Fat-tailed distribution: \[ \exists \alpha, c > 0,\quad \lim_{x \to \infty} x^{\alpha} \overline{F}(x) = c \]
Subexponential distribution ("catastrophe principle"): \[ \forall n \ge 2, \lim_{x \to \infty} \overline{F}_{X_{(n)}}(x) / \overline{F}_{\sum X_i}(x) = 1 \] where \( X_1, \dots, X_n \) are i.i.d. with distribution \( F \) and \( X_{(n)} = \max_{i \le n} X_i \).
Long-tailed distribution: \[ \forall t>0, \lim_{x \to \infty} \overline{F}(x+t) / \overline{F}(x) = 1 \]

All subexponential distributions are long-tailed.

Examples of heavy-tailed distributions:

Fat-tailed: Pareto, log-logistic, Zipf, Cauchy, Student's t, Fréchet;
Subexponential: log-normal, Weibull (shape parameter below 1);
Long-tailed: all of the above.

Properties

Scale invariance

Distribution \(F\) is scale invariant if:

\[ \exists x_0 > 0, g: \forall \lambda, x \text{ with } x \ge x_0 \text{ and } \lambda x \ge x_0,\quad \overline{F}(\lambda x) = g(\lambda) \overline{F}(x) \]

Theorem: A distribution is scale invariant if and only if it is Pareto.
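
As a quick check of the "if" direction, take a Pareto tail \( \overline{F}(x) = (x_0 / x)^{\alpha} \) for \( x \ge x_0 \), with shape \( \alpha > 0 \). Rescaling the argument only multiplies the tail by a factor that depends on \( \lambda \):

\[ \overline{F}(\lambda x) = \left( \dfrac{x_0}{\lambda x} \right)^{\alpha} = \lambda^{-\alpha} \overline{F}(x), \quad x \ge x_0,\ \lambda x \ge x_0 \]

so the definition holds with \( g(\lambda) = \lambda^{-\alpha} \).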

Distribution \(F\) is asymptotically scale invariant if:

\[ \exists g \in C^0: \forall \lambda > 0, \lim_{x \to +\infty} \overline{F}(\lambda x) / \overline{F}(x) = g(\lambda) \]

Function \(L\) is slowly varying if:

\[ \forall y > 0, \lim_{x \to +\infty} L(xy) / L(x) = 1 \]

Distribution \(F\) is regularly varying if there exist a slowly varying function \(L\) and an index \( \rho \ge 0 \) such that:

\[ \overline{F}(x) = x^{-\rho} L(x) \]

Theorem: A distribution is asymptotically scale invariant if and only if it is regularly varying.

Regularly varying distributions basically behave like Pareto distributions with respect to the tail.
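
A small sketch of asymptotic scale invariance, assuming the illustrative tail \( \overline{F}(x) = x^{-2} (1 + \log x) \) for \( x \ge 1 \), where \( L(x) = 1 + \log x \) is slowly varying. This tail and the function names are assumptions made for the example, not taken from the reference. The ratio \( \overline{F}(\lambda x) / \overline{F}(x) \) should approach \( \lambda^{-2} \), although slowly because of the logarithm:

```python
import math

RHO = 2.0  # assumed index of regular variation


def tail(x):
    """Illustrative regularly varying tail x^{-RHO} * (1 + log x), valid for x >= 1."""
    return x ** (-RHO) * (1.0 + math.log(x))


def scale_ratio(x, lam):
    """The ratio tail(lam * x) / tail(x) from the asymptotic scale invariance definition."""
    return tail(lam * x) / tail(x)


lam = 2.0
for x in (1e2, 1e10, 1e100):
    # Compare the finite-x ratio with the Pareto-like limit lam^{-RHO}.
    print(x, scale_ratio(x, lam), lam ** (-RHO))
```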

The "catastrophe principle"

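The "catastrophe principle", also called the principle of a single big jump: when a sum of i.i.d. heavy-tailed variables is large, it is most likely because a single summand is large.

The "conspiracy principle" is the opposite behavior, where a large sum arises from many summands being moderately large together:

\[ \forall n \ge 2, \lim_{x \to \infty} \overline{F}_{X_{(n)}}(x) / \overline{F}_{\sum X_i}(x) = 0 \]

Subexponential distributions satisfy the "catastrophe principle" by definition.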
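
The two principles can be contrasted with a rough Monte Carlo sketch. The choices below (two summands, a Pareto distribution with shape 1.5, an exponential distribution with rate 1, the thresholds, and the helper name `max_vs_sum_ratio`) are assumptions made for illustration, and the estimates are noisy at finite sample sizes:

```python
import random


def max_vs_sum_ratio(sampler, x, n=2, trials=200_000):
    """Estimate Pr[max > x] / Pr[sum > x] for n i.i.d. draws from sampler."""
    max_hits = 0
    sum_hits = 0
    for _ in range(trials):
        draws = [sampler() for _ in range(n)]
        max_hits += max(draws) > x
        sum_hits += sum(draws) > x
    return max_hits / sum_hits if sum_hits else float("nan")


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Heavy-tailed case: the ratio should be close to 1 (single big jump).
pareto_ratio = max_vs_sum_ratio(lambda: rng.paretovariate(1.5), x=50.0)
# Light-tailed case: the ratio should be well below 1 and shrink as x grows.
exponential_ratio = max_vs_sum_ratio(lambda: rng.expovariate(1.0), x=8.0)
print(pareto_ratio, exponential_ratio)
```

For the exponential case with two summands the ratio is known in closed form, \( (2 e^{-x} - e^{-2x}) / ((1 + x) e^{-x}) \), which tends to 0 as \( x \to \infty \).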

The residual life "blows up"

The distribution of residual life given the current life \( x \) is:

\[ \overline{R}_x(t) = \overline{F}(x + t) / \overline{F}(x) \]

For an exponential distribution, the distribution of residual life does not depend on the current life, a property known as memorylessness.

For a Pareto distribution, the residual life tail increases with the current life:

\[ \overline{R}_x(t) = \dfrac{(x_0 / (x+t))^{\alpha}}{(x_0 / x)^{\alpha}} = (1+t/x)^{-\alpha},\quad \alpha > 0 \]
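
A short sketch comparing the two residual life tails, assuming an exponential rate of 1 and Pareto parameters \( \alpha = 2 \), \( x_0 = 1 \) (illustrative values; the function names are hypothetical):

```python
import math


def residual_tail_exponential(x, t, rate=1.0):
    """Residual life tail F̄(x + t) / F̄(x) for an exponential distribution."""
    return math.exp(-rate * (x + t)) / math.exp(-rate * x)


def residual_tail_pareto(x, t, alpha=2.0, x0=1.0):
    """Residual life tail F̄(x + t) / F̄(x) for a Pareto distribution, x >= x0."""
    def tail(y):
        return (x0 / y) ** alpha
    return tail(x + t) / tail(x)


t = 5.0
for x in (1.0, 10.0, 100.0):
    # The exponential column stays constant; the Pareto column grows with x.
    print(x, residual_tail_exponential(x, t), residual_tail_pareto(x, t))
```

The longer a Pareto-distributed lifetime has already lasted, the more likely it is to last another \( t \) units.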

Mean residual life \( m(x) = - \int_{\mathbb{R}_+} t~\mathrm{d} \overline{R}_x(t) \).

Hazard rate \( q(x) \equiv f(x) / \overline{F}(x) = - \overline{R}'_x(0) \)

Long-tailed distributions typically have hazard rates that decay to zero, and their mean residual lives grow without bound.
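
For the Pareto tail \( \overline{F}(x) = (x_0 / x)^{\alpha} \), \( x \ge x_0 \), both quantities have closed forms, which makes this behavior concrete. The mean residual life formula assumes \( \alpha > 1 \) so that the integral is finite:

\[ q(x) = \dfrac{\alpha x_0^{\alpha} x^{-\alpha - 1}}{x_0^{\alpha} x^{-\alpha}} = \dfrac{\alpha}{x}, \qquad m(x) = \int_0^{\infty} (1 + t/x)^{-\alpha}~\mathrm{d}t = \dfrac{x}{\alpha - 1} \]

The hazard rate decreases to zero and the mean residual life grows linearly in \( x \).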

Emergence

Additive Processes

When the summands in the Generalized Central Limit Theorem do not have finite variance, the normalized sum converges to a stable distribution with stability parameter \( 0 < \alpha < 2 \); all such distributions are heavy-tailed. See Limit Theorems.

Ruin time \( T \equiv \inf \{t > 0 \mid x + c t - \sum_{i=1}^t X_i < 0 \} \) is typically heavy-tailed when the walk has zero drift. For the symmetric 1D random walk, \( X \sim 2 \cdot \text{Bernoulli}(0.5) - 1 \), started at \( x = 0 \) with \( c = 0 \), the tail of the ruin time satisfies \( \Pr[T > t] \sim \sqrt{2/(\pi t)} \).
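
For the symmetric \( \pm 1 \) walk, the tail of the ruin time can be computed exactly by dynamic programming and compared with the power-law asymptotic \( \sqrt{2/(\pi t)} \). The sketch below assumes \( c = 0 \) and initial capital \( x = 0 \); the function name `survival_probability` is hypothetical:

```python
import math


def survival_probability(t, x=0):
    """Exact Pr[T > t] for claims X = ±1 with probability 1/2 each and c = 0.

    probs[k] is the probability that the capital equals k and no ruin
    (capital below zero) has occurred so far.
    """
    probs = {x: 1.0}
    for _ in range(t):
        nxt = {}
        for k, p in probs.items():
            for claim in (-1, 1):
                capital = k - claim
                if capital >= 0:  # paths with negative capital are ruined
                    nxt[capital] = nxt.get(capital, 0.0) + 0.5 * p
        probs = nxt
    return sum(probs.values())


for t in (10, 100, 400):
    # Exact tail next to the asymptotic sqrt(2 / (pi * t)).
    print(t, survival_probability(t), math.sqrt(2.0 / (math.pi * t)))
```

The tail decays like \( t^{-1/2} \), so the ruin time in this setting does not even have a finite mean.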

Multiplicative Processes

Multiplicative processes almost always lead to heavy tails. Examples include wealth, Twitter follower counts, and hyperlink counts.

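For i.i.d. positive factors \( Y_1, \dots, Y_n \) drawn from a population \( Y \) with \( \mu = \mathbb{E} \log Y \) and \( \text{Var} (\log Y) = \sigma^2 < \infty \), the classical central limit theorem applied to \( \log Y_i \) gives, as \( n \to \infty \):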

\[ \left( \prod_i \frac{Y_i}{e^\mu} \right)^{1/\sqrt{n}} \to \text{LogNormal}(0, \sigma^2) \]
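
A simulation sketch of this limit, assuming \( \log Y_i \sim \text{Uniform}(-1, 1) \), so that \( \mu = 0 \) and \( \sigma^2 = 1/3 \). This distribution, the sample sizes, and the function name are illustrative assumptions. The logarithm of the normalized product should have mean near 0 and variance near \( \sigma^2 \):

```python
import math
import random
import statistics


def normalized_log_product(rng, n):
    """Return log of (prod_i Y_i / e^mu)^(1/sqrt(n)) with log Y_i ~ Uniform(-1, 1)."""
    total = sum(rng.uniform(-1.0, 1.0) for _ in range(n))  # sum of log Y_i, mu = 0
    return total / math.sqrt(n)


rng = random.Random(0)  # fixed seed for reproducibility
samples = [normalized_log_product(rng, n=200) for _ in range(2000)]
sigma2 = 1.0 / 3.0  # variance of Uniform(-1, 1)
print(statistics.mean(samples), statistics.variance(samples), sigma2)
```

Exponentiating the samples gives draws that are approximately log-normal even though no individual factor is.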

For a multiplicative process with a lower barrier, \( P_n = \max(P_{n-1} Y_n, \varepsilon) \), under minor technical conditions \( P_n \) converges in distribution to some \( F \) which is "nearly" regularly varying:

\[ \lim_{x \to \infty} \dfrac{\log \overline{F}(x)}{\log x} = - \sup\{ s \ge 0 \mid \mathbb{E}Y_1^s \le 1 \} \]
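
As a worked example of the supremum, assume \( Y_1 \sim \text{LogNormal}(\mu, \sigma^2) \) with \( \mu < 0 \), so that the process drifts downward toward the barrier. This distributional choice is an assumption made for illustration. The moments are available in closed form:

\[ \mathbb{E} Y_1^s = \exp\left( s \mu + \tfrac{1}{2} s^2 \sigma^2 \right) \le 1 \iff s \le -\dfrac{2 \mu}{\sigma^2}, \qquad \text{so} \quad \sup\{ s \ge 0 \mid \mathbb{E}Y_1^s \le 1 \} = -\dfrac{2 \mu}{\sigma^2} \]

For instance, \( \mu = -0.1 \) and \( \sigma^2 = 0.1 \) give a supremum of 2, so the tail of \( F \) decays roughly like \( x^{-2} \).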

A multiplicative process with additive noise, \( P_n = P_{n-1} Y_n + Q_n \), also leads to distributions that are approximately power-law.

Extremal Processes

See Statistics of Extremes

The time until a record

Identification

Resources

Jayakrishnan Nair, Adam Wierman, Bert Zwart. The Fundamentals of Heavy-Tails: Properties, Emergence, and Identification.
