Tail distribution function:

\[ \overline{F}(x) \equiv \Pr[X>x] = 1 - F(x) \]

Heavy-tailed distributions are probability distributions whose tails are not exponentially bounded:

\[ \forall \lambda>0,\quad \lim_{x \to \infty} e^{\lambda x} \overline{F}(x) = \infty \]

There are three important subclasses of heavy-tailed distributions:

  1. Fat-tailed distribution: \[ \exists \alpha, c > 0,\quad \lim_{x \to \infty} x^{\alpha} \overline{F}(x) = c \]
  2. Subexponential distribution ("catastrophe principle"): \[ \forall n \in \mathbb{Z}^+,\quad \lim_{x \to \infty} \overline{F}_{X_{(n)}}(x) / \overline{F}_{\sum X_i}(x) = 1 \] where \( X_{(n)} \equiv \max_i X_i \) is the maximum of \( n \) i.i.d. samples;
  3. Long-tailed distribution: \[ \forall t>0, \lim_{x \to \infty} \overline{F}(x+t) / \overline{F}(x) = 1 \]

All subexponential distributions are long-tailed.
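As a quick numerical illustration (a sketch, assuming a Pareto tail \( \overline{F}(x) = x^{-\alpha} \) with \( x_0 = 1 \)): for any rate \( \lambda > 0 \), however small, \( e^{\lambda x} \overline{F}(x) \) eventually grows without bound, matching the heavy-tail definition above.

```python
import math

ALPHA = 1.5  # Pareto shape; tail F̄(x) = x^(-ALPHA) for x >= 1 (x0 = 1 assumed)

def pareto_tail(x):
    return x ** -ALPHA

# Heavy-tail criterion: e^(lambda * x) * F̄(x) diverges for EVERY lambda > 0.
lam = 0.01  # even a tiny exponential rate eventually dominates the power-law decay
values = [math.exp(lam * x) * pareto_tail(x) for x in (10, 100, 1000, 2000)]
print(values)  # eventually grows without bound
```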

Examples of heavy-tailed distributions:

  • Fat-tailed: Pareto, log-logistic, Zipf, Cauchy, Student's t, Fréchet;
  • Subexponential (but not fat-tailed): log-normal, Weibull (shape parameter less than 1);
  • Long-tailed: all of the above, since every subexponential distribution is long-tailed.

Properties

Scale invariance

Distribution \(F\) is scale invariant if:

\[ \exists x_0, g: \forall \lambda, x \text{ such that } \lambda x \ge x_0,\quad \overline{F}(\lambda x) = g(\lambda) \overline{F}(x) \]

Theorem: A distribution is scale invariant if and only if it is Pareto.

Distribution \(F\) is asymptotically scale invariant if:

\[ \exists g \in C^0: \forall \lambda > 0, \lim_{x \to +\infty} \overline{F}(\lambda x) / \overline{F}(x) = g(\lambda) \]

Function \(L\) is slowly varying if:

\[ \forall y > 0,\quad \lim_{x \to +\infty} L(xy) / L(x) = 1 \]

Distribution \(F\) is regularly varying (with index \(\rho\)) if there exists a slowly varying function \(L\) such that:

\[ \overline{F}(x) = x^{-\rho} L(x) \]

Theorem: A distribution is asymptotically scale invariant if and only if it is regularly varying.

Regularly varying distributions behave like Pareto distributions in the tail, up to a slowly varying correction.

The "catastrophe principle"

The "catastrophe principle", a.k.a. the principle of a single big jump: the most likely way for a sum to exceed a large value is for a single summand to exceed it.

The "conspiracy principle", characteristic of light-tailed distributions, where a large sum most likely results from many moderately large summands:

\[ \forall n \in \mathbb{Z}^+, \lim_{x \to \infty} \overline{F}_{X_{(n)}}(x) / \overline{F}_{\sum X_i}(x) = 0 \]

Subexponential distributions by definition satisfy the "catastrophe principle".
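The catastrophe principle can be checked by Monte Carlo (a sketch, assuming Pareto samples with shape 1.5 and an arbitrary threshold): the probability that the maximum exceeds a large threshold approaches the probability that the sum does.

```python
import random
random.seed(0)

ALPHA = 1.5  # Pareto shape: heavy-tailed, finite mean

def pareto_sample():
    # inverse-CDF sampling from F̄(x) = x^(-ALPHA), x >= 1
    return random.random() ** (-1.0 / ALPHA)

n, x, trials = 10, 100.0, 100_000
max_exceeds = sum_exceeds = 0
for _ in range(trials):
    xs = [pareto_sample() for _ in range(n)]
    if max(xs) > x:
        max_exceeds += 1
    if sum(xs) > x:
        sum_exceeds += 1
ratio = max_exceeds / sum_exceeds
print(ratio)  # tends to 1 as x grows: one big jump dominates the sum
```

Since the samples are positive, \( \max_i X_i > x \) implies \( \sum_i X_i > x \), so the ratio is at most 1; at a finite threshold it falls somewhat short of 1 but approaches it as \(x\) grows.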

The residual life "blows up"

The distribution of residual life given the current life \( x \) is:

\[ \overline{R}_x(t) = \overline{F}(x + t) / \overline{F}(x) \]

The residual life distribution of an exponential distribution does not depend on the current life, a property known as memorylessness.

For Pareto distribution, the distribution of residual life increases with the current life:

\[ \overline{R}_x(t) = \dfrac{(x_0 / (x+t))^{\alpha}}{(x_0 / x)^{\alpha}} = (1+t/x)^{-\alpha},\quad \alpha > 0 \]

Mean residual life \( m(x) = - \int_{\mathbb{R}_+} t~\mathrm{d} \overline{R}_x(t) = \int_{\mathbb{R}_+} \overline{R}_x(t)~\mathrm{d} t \).

Hazard rate \( q(x) \equiv f(x) / \overline{F}(x) = - \overline{R}'_x(0) \)

Long-tailed distributions have hazard rates that tend to zero and mean residual lives that grow without bound.
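For the Pareto residual life above, integrating \( \overline{R}_x(t) = (1+t/x)^{-\alpha} \) gives \( m(x) = x/(\alpha-1) \) for \( \alpha > 1 \), so the mean residual life grows linearly in the current life. A numerical sketch (trapezoidal integration, truncation horizon chosen arbitrarily):

```python
alpha = 2.5  # Pareto shape; mean residual life is finite only for alpha > 1

def mrl(x, T=1e5, steps=100_000):
    # trapezoidal integration of m(x) = ∫ (1 + t/x)^(-alpha) dt over [0, T]
    h = T / steps
    total = 0.5 * (1.0 + (1.0 + T / x) ** -alpha)
    for i in range(1, steps):
        total += (1.0 + i * h / x) ** -alpha
    return total * h

for x in (5.0, 10.0, 20.0):
    print(x, mrl(x), x / (alpha - 1))  # numeric value vs closed form x/(alpha-1)
```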

Emergence

Additive Processes

A distribution is stable if any linear combination of two independent random samples from it has the same distribution, up to location and scale parameters:

A non-degenerate random variable \(X\) has a stable distribution if, for independent copies \( X_1, X_2 \sim X \), \[ \forall a, b > 0, \exists c > 0, d \in \mathbb{R}: a X_1 + b X_2 \sim c X + d \] The distribution is strictly stable if \( d=0 \).

Stable distributions have characteristic function:

\[ \begin{aligned} \varphi(t; \alpha, \beta, c, \mu) &= \exp \{ i t \mu - |c t|^\alpha (1 - i \beta \text{sgn}(t) \Phi(\alpha, t) ) \} \\ \Phi(\alpha, t) &= \begin{cases} \tan(\pi \alpha / 2) & (\alpha \neq 1) \\ -2/\pi \log|t| & (\alpha = 1) \end{cases} \end{aligned} \]

Stable distributions form a four-parameter family, with stability parameter \(\alpha\), skewness parameter \( \beta\) (not the standardized 3rd moment), scale parameter \(c\), and location parameter \(\mu\).
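The characteristic function above can be evaluated directly. As a sanity check (a plain-Python sketch): \( \alpha = 2 \) reduces to the normal CF \( \exp(it\mu - c^2 t^2) \) (the skewness term vanishes because \( \tan \pi = 0 \)), and \( \alpha = 1, \beta = 0 \) gives the Cauchy CF \( \exp(it\mu - c|t|) \).

```python
import cmath, math

def stable_cf(t, alpha, beta, c, mu):
    # characteristic function of the stable family, as parametrized above
    if alpha == 1:
        Phi = -(2.0 / math.pi) * math.log(abs(t)) if t != 0 else 0.0
    else:
        Phi = math.tan(math.pi * alpha / 2.0)
    sgn = (t > 0) - (t < 0)
    return cmath.exp(1j * t * mu - abs(c * t) ** alpha * (1 - 1j * beta * sgn * Phi))

t = 0.7
print(stable_cf(t, 2, 0.5, 1.0, 0.0), cmath.exp(-t * t))   # normal case, beta drops out
print(stable_cf(t, 1, 0.0, 1.0, 0.0), cmath.exp(-abs(t)))  # Cauchy case
```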

The stability parameter takes value in \( (0, 2] \), and roughly corresponds to concentration:

  1. \( \alpha = 2 \): normal distribution;
  2. \( 0 < \alpha < 2 \): infinite variance;
  3. \( 0 < \alpha \le 1 \): undefined expectation;

The skewness parameter takes value in \( [-1, 1] \), and roughly corresponds to symmetry:

  1. \( \beta = 0 \): the distribution is symmetric about \(\mu\);
  2. \( \beta = 1 \) and \( \alpha < 1 \): the distribution has support \( [\mu, +\infty) \);

Special cases:

  1. \( \alpha = 1, \beta = 0 \): Cauchy distribution;
  2. \( \alpha = 0.5, \beta = 1 \): Lévy distribution;

Generalized Central Limit Theorem (Gnedenko and Kolmogorov, 1954):

The appropriately normalized sum of independent random variables with symmetric probability density decreasing as a power law \( |x|^{-\alpha - 1} \), where \( 0 < \alpha < 2 \) (and therefore having infinite variance), tends in distribution to a stable distribution \( f(x; \alpha, 0, c, 0) \) as the number of summands grows.
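A quick simulation consistent with stability (a sketch of the \( \alpha = 1 \) case, not the general theorem): the Cauchy distribution is stable with \( \alpha = 1 \), so the sample mean of \( n \) Cauchy variables has the same distribution as a single one, and averaging does not concentrate.

```python
import math, random
random.seed(1)

def cauchy():
    # standard Cauchy via the inverse CDF: tan(pi * (U - 1/2))
    return math.tan(math.pi * (random.random() - 0.5))

def iqr_of_means(n, trials=10_000):
    # interquartile range of the sample mean of n i.i.d. Cauchy variables
    means = sorted(sum(cauchy() for _ in range(n)) / n for _ in range(trials))
    return means[3 * trials // 4] - means[trials // 4]

iqrs = [iqr_of_means(n) for n in (1, 10, 100)]
print(iqrs)  # all close to 2, the IQR of a single standard Cauchy
```

Contrast with the classical CLT: for a finite-variance distribution the IQR of the sample mean would shrink like \( 1/\sqrt{n} \).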

Ruin time \( T \equiv \inf \{t > 0 : x + c t - \sum_{i=1}^t X_i < 0 \} \) is always heavy-tailed. For the simple symmetric 1D random walk, \( X_i \sim 2\,\text{Bernoulli}(0.5) - 1 \) with \( c = 0 \), the tail of the ruin time satisfies \( \overline{F}_T(t) \sim \sqrt{2/(\pi t)} \).
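This tail can be checked by simulation (a sketch: walk started at \( x = 0 \), ruin when it first goes below 0; horizon and trial count chosen arbitrarily).

```python
import math, random
random.seed(2)

def ruin_time(horizon=5_000):
    # first t at which a simple symmetric random walk started at 0 goes below 0
    s = 0
    for t in range(1, horizon + 1):
        s += random.choice((1, -1))
        if s < 0:
            return t
    return None  # no ruin within the horizon

trials, t0 = 20_000, 100
survived = 0
for _ in range(trials):
    rt = ruin_time()
    if rt is None or rt > t0:
        survived += 1
est = survived / trials
print(est, math.sqrt(2 / (math.pi * t0)))  # empirical tail vs sqrt(2/(pi*t))
```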

Multiplicative Processes

Extremal Processes

The time until a record

Identification

Resources

Jayakrishnan Nair, Adam Wierman, Bert Zwart. The Fundamentals of Heavy-Tails: Properties, Emergence, and Identification.


🏷 Category=Probability