Inequalities

Jensen's Inequality

For any r.v. X with finite expectation and any convex function \( \varphi(\cdot) \),

\[ \mathbb{E} \varphi(X) \geq \varphi( \mathbb{E} X ) \]

Note:

  1. Equality holds iff \( \varphi(\cdot) \) agrees with a linear function on the support of X.
  2. If \( \varphi(\cdot) \) is strictly convex, the inequality is strict unless X is almost surely constant.
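As a quick numerical illustration of the two notes, the sketch below evaluates both sides of Jensen's inequality exactly for a small discrete distribution. The values, probabilities, and the helper `expect` are hypothetical choices made only for this example.

```python
import math

# Hypothetical discrete distribution, chosen only for illustration.
values = [0.5, 1.0, 2.0, 4.0]
probs = [0.1, 0.4, 0.3, 0.2]

def expect(f, values, probs):
    """Return E f(X) for a discrete r.v. X."""
    return sum(p * f(v) for v, p in zip(values, probs))

mean = expect(lambda x: x, values, probs)

# Strictly convex phi (exp): E phi(X) > phi(E X) because X is not constant.
lhs_exp = expect(math.exp, values, probs)
rhs_exp = math.exp(mean)

# Linear (affine) phi: the two sides coincide.
def affine(x):
    return 3.0 * x - 1.0

lhs_aff = expect(affine, values, probs)
rhs_aff = affine(mean)

print("convex:", lhs_exp, ">=", rhs_exp)
print("affine:", lhs_aff, "==", rhs_aff)
```

The exponential case shows a strict gap, while the affine case agrees up to floating-point rounding.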

Corollary 1: Arithmetic-geometric Mean Inequality

If \( a_1, \cdots, a_n \geq 0 \), \( p_1, \cdots, p_n \geq 0 \) and \( p_1 + \cdots + p_n = 1 \), then

\[ \sum_{i=1}^n p_i a_i \geq \prod_{i=1}^n a_i^{p_i} \]

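A special case of the arithmetic-geometric mean inequality, with \( n = 2 \), \( a_1 = x^p \), \( a_2 = y^q \) and weights \( \frac{1}{p}, \frac{1}{q} \), is Young's inequality:

\[ x,y \geq 0, \; p,q>1, \; \frac{1}{p} + \frac{1}{q} = 1 \Rightarrow xy \leq \frac{x^p}{p} + \frac{y^q}{q} \]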
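The reduction of Young's inequality to the two-term weighted arithmetic-geometric mean inequality can be checked numerically. In this sketch the helper names `weighted_am` and `weighted_gm` and all numeric inputs are assumptions made for illustration.

```python
import math

def weighted_am(a, p):
    """Weighted arithmetic mean: sum_i p_i * a_i."""
    return sum(pi * ai for ai, pi in zip(a, p))

def weighted_gm(a, p):
    """Weighted geometric mean: prod_i a_i ** p_i."""
    return math.prod(ai ** pi for ai, pi in zip(a, p))

# General case with hypothetical nonnegative a_i and weights summing to 1.
a = [1.0, 4.0, 9.0]
w = [0.2, 0.3, 0.5]
am, gm = weighted_am(a, w), weighted_gm(a, w)

# Young's inequality as the two-term case:
# a = (x^p, y^q) with weights (1/p, 1/q), where 1/p + 1/q = 1.
x, y, p = 1.5, 2.0, 3.0
q = p / (p - 1.0)
young_lhs = weighted_gm([x ** p, y ** q], [1.0 / p, 1.0 / q])  # equals x * y
young_rhs = weighted_am([x ** p, y ** q], [1.0 / p, 1.0 / q])  # x^p/p + y^q/q

print(am, ">=", gm)
print(young_lhs, "<=", young_rhs)
```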

Corollary 2: Likelihood Inequality

If \( X \sim p(x) \) and \( q(x) \) is another density function, then

\[ \mathbb{E} \log p(X) \geq \mathbb{E} \log q(X) \]

Equality holds iff \( p(x) = q(x) \) almost everywhere.
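A small check of the likelihood inequality, using probability mass functions on three points in place of densities (an assumption made so the expectations can be computed exactly). The gap between the two sides is the Kullback-Leibler divergence from q to p; the vectors `p`, `q` and the helper `expected_log` are hypothetical.

```python
import math

# Hypothetical pmfs on the same three-point support.
p = [0.5, 0.3, 0.2]   # true distribution of X
q = [0.2, 0.5, 0.3]   # competing distribution

def expected_log(density, p):
    """E log density(X) when X is distributed according to p."""
    return sum(pi * math.log(di) for pi, di in zip(p, density))

e_log_p = expected_log(p, p)
e_log_q = expected_log(q, p)
gap = e_log_p - e_log_q   # KL divergence; zero only when q equals p

print(e_log_p, ">=", e_log_q, "gap:", gap)
```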

Covariance Inequality

For all \( X, g(\cdot), h(\cdot) \) s.t. \( \mathbb{E} g(X), \mathbb{E} h(X), \mathbb{E} g(X)h(X) \) exist:

  1. If \( g(\cdot) \) is nondecreasing and \( h(\cdot) \) is nonincreasing, then \( \text{Cov}[g(X), h(X)] \leq 0 \)
  2. If \( g(\cdot) \) and \( h(\cdot) \) are both nondecreasing or both nonincreasing, then \( \text{Cov}[g(X), h(X)] \geq 0 \)
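Both cases of the covariance inequality can be illustrated with an exact computation on a discrete X. The distribution and the particular monotone functions below are hypothetical choices, not part of the statement.

```python
import math

# Hypothetical uniform distribution on four points.
values = [-1.0, 0.0, 1.0, 2.0]
probs = [0.25, 0.25, 0.25, 0.25]

def expect(f):
    """Return E f(X)."""
    return sum(p * f(v) for v, p in zip(values, probs))

def cov(g, h):
    """Cov[g(X), h(X)] = E g(X)h(X) - E g(X) E h(X)."""
    return expect(lambda x: g(x) * h(x)) - expect(g) * expect(h)

def cube(x):
    return x ** 3            # nondecreasing

def decay(x):
    return math.exp(-x)      # nonincreasing

cov_opposite = cov(cube, decay)      # opposite monotonicity: expected <= 0
cov_same = cov(cube, math.atan)      # both nondecreasing: expected >= 0

print(cov_opposite, cov_same)
```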

Holder's Inequality

If \( p,q>1, \frac{1}{p} + \frac{1}{q} = 1 \), then for all r.v.'s X and Y for which the expectations exist,

\[ \lvert \mathbb{E} XY \rvert \leq \left( \mathbb{E} \lvert X \rvert^p \right)^{\frac{1}{p}} \left( \mathbb{E} \lvert Y \rvert^q \right)^{\frac{1}{q}} \]

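Derivation: Applying Young's inequality to the normalized variables, \( \frac{ \lvert X \rvert }{ \left( \mathbb{E} \lvert X \rvert^p \right)^{\frac{1}{p}} } \frac{ \lvert Y \rvert }{ \left( \mathbb{E} \lvert Y \rvert^q \right)^{\frac{1}{q}} } \leq \frac{ \lvert X \rvert^p }{ p \mathbb{E} \lvert X \rvert^p } + \frac{ \lvert Y \rvert^q }{ q \mathbb{E} \lvert Y \rvert^q } \). Taking expectations on both sides, the right-hand side becomes \( \frac{1}{p} + \frac{1}{q} = 1 \), so \( \mathbb{E} \lvert XY \rvert \leq \left( \mathbb{E} \lvert X \rvert^p \right)^{\frac{1}{p}} \left( \mathbb{E} \lvert Y \rvert^q \right)^{\frac{1}{q}} \). Holder's inequality then follows from \( \lvert \mathbb{E} XY \rvert \leq \mathbb{E} \lvert XY \rvert \).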

Corollary: Cauchy-Schwarz Inequality

Taking \( p = q = 2 \) in Holder's inequality: for all r.v.'s X and Y for which the expectations exist,

\[ \lvert \mathbb{E} XY \rvert \leq \left( \mathbb{E} \lvert X \rvert^2 \right)^{\frac{1}{2}} \left( \mathbb{E} \lvert Y \rvert^2 \right)^{\frac{1}{2}} \]
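The following sketch evaluates both sides of Holder's inequality for several exponents on a small joint distribution; p = 2 gives the Cauchy-Schwarz case. The joint outcomes, their probabilities, and the helper names are hypothetical.

```python
# Hypothetical joint distribution of (X, Y): four outcomes with probabilities.
outcomes = [(-1.0, 2.0), (0.5, -1.0), (2.0, 1.5), (3.0, -0.5)]
probs = [0.1, 0.4, 0.3, 0.2]

def expect(f):
    """Return E f(X, Y)."""
    return sum(pr * f(x, y) for (x, y), pr in zip(outcomes, probs))

def holder_rhs(p):
    """(E|X|^p)^(1/p) * (E|Y|^q)^(1/q) with q the conjugate exponent of p."""
    q = p / (p - 1.0)
    norm_x = expect(lambda x, y: abs(x) ** p) ** (1.0 / p)
    norm_y = expect(lambda x, y: abs(y) ** q) ** (1.0 / q)
    return norm_x * norm_y

lhs = abs(expect(lambda x, y: x * y))          # |E XY|
lhs_abs = expect(lambda x, y: abs(x * y))      # E|XY|, the intermediate bound
bounds = {p: holder_rhs(p) for p in (1.5, 2.0, 3.0)}

for p, bound in bounds.items():
    print(f"p={p}: {lhs} <= {lhs_abs} <= {bound}")
```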

Chebyshev's Inequality

For any nonnegative r.v. X and \( r>0 \) (this form is also known as Markov's inequality),

\[ P(X \geq r) \leq \frac{\mathbb{E} X}{r} \]

Corollary:

  1. For any positive, nondecreasing function \( g(\cdot) \), \( P(X \geq r) \leq \frac{\mathbb{E} g(X)}{g(r)} \).
  2. In particular, for any r.v. X and \( p>0 \), \( P( \lvert X \rvert \geq r) \leq \frac{\mathbb{E} \lvert X \rvert^p }{r^p} \).
  3. A commonly cited version: for a r.v. X with mean \( \mu \) and variance \( \sigma^2 \), \( P(|X-\mu| \ge r) \leq \frac{\sigma^2}{r^2} \). Taking \( r = k\sigma \) shows that no more than \(1/k^2\) of a distribution's values can be k or more standard deviations away from the mean.
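To see how loose or tight these bounds can be, the sketch below compares exact tail probabilities with the first-moment bound and with the variance-based version for a nonnegative discrete X. The distribution and the thresholds `r` and `k` are hypothetical.

```python
import math

# Hypothetical nonnegative discrete distribution.
values = [0.0, 1.0, 2.0, 5.0, 10.0]
probs = [0.3, 0.3, 0.2, 0.15, 0.05]

def prob(event):
    """Return P(event(X)) by summing the mass of matching values."""
    return sum(p for v, p in zip(values, probs) if event(v))

mean = sum(p * v for v, p in zip(values, probs))
var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
sigma = math.sqrt(var)

# First-moment bound: P(X >= r) <= E X / r.
r = 5.0
markov_tail = prob(lambda v: v >= r)
markov_bound = mean / r

# Variance version: P(|X - mu| >= k sigma) <= 1 / k^2.
k = 2.0
cheb_tail = prob(lambda v: abs(v - mean) >= k * sigma)
cheb_bound = 1.0 / k ** 2

print(markov_tail, "<=", markov_bound)
print(cheb_tail, "<=", cheb_bound)
```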

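Bernstein inequality (exponential version for bounded r.v.'s): Suppose that \(X_1, \cdots, X_n\) are i.i.d. with mean \(\mu\) and variance \(\sigma^2\), and that \(|X_i - \mu| \le M\) almost surely. Then for the sample mean \(\bar{X}\) and all \(\epsilon > 0\),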

\[ P(|\bar{X} - \mu| \ge \epsilon ) \le 2\exp \left( -\frac{n\epsilon^2/2}{\sigma^2 + M\epsilon/3} \right) \]
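As a sanity check, the Bernstein bound can be compared with an exact tail probability when the observations are i.i.d. Bernoulli, since the sample mean is then a scaled binomial. This is only a sketch under that assumption: the Bernoulli model, the parameter values, and the bound \( M = \max(\theta, 1-\theta) \) on the centered variables are choices made for the example.

```python
import math
from fractions import Fraction

def exact_tail(n, theta, eps):
    """P(|Xbar - theta| >= eps) for n i.i.d. Bernoulli(theta) draws.

    theta and eps are Fractions so the boundary comparison is exact.
    """
    t = float(theta)
    total = 0.0
    for k in range(n + 1):
        if abs(Fraction(k, n) - theta) >= eps:
            total += math.comb(n, k) * t ** k * (1.0 - t) ** (n - k)
    return total

def bernstein_bound(n, sigma2, M, eps):
    """2 * exp(-(n eps^2 / 2) / (sigma^2 + M eps / 3))."""
    return 2.0 * math.exp(-(n * eps ** 2 / 2.0) / (sigma2 + M * eps / 3.0))

n = 100
theta = Fraction(3, 10)
eps = Fraction(1, 10)

sigma2 = float(theta * (1 - theta))   # Bernoulli variance
M = float(max(theta, 1 - theta))      # bound on |X_i - mu|

tail = exact_tail(n, theta, eps)
bound = bernstein_bound(n, sigma2, M, float(eps))
print(tail, "<=", bound)
```

For these values the exact tail sits well below the bound, which is expected since the bound holds for every bounded distribution with the same variance.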

Identities

Some properties of the Gamma, Chi-squared, Poisson and negative binomial distributions.