Jensen's Inequality

For any r.v. X with finite expectation and any convex function $\varphi(\cdot)$,

$$\mathbb{E} \varphi(X) \geq \varphi( \mathbb{E} X )$$


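  1. Equality holds iff $\varphi(X) = a + bX$ almost surely for a line $a + bx$ that is tangent to $\varphi(\cdot)$ at $x = \mathbb{E} X$; agreeing with a linear function only at the points of the support of X is not enough.
  2. If $\varphi(\cdot)$ is strictly convex, the inequality is strict unless X is almost surely constant.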
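As a quick numerical illustration, the sketch below evaluates both sides of Jensen's inequality for a small discrete distribution. The distribution (`values`, `probs`) and the helper `expectation` are made up for this example; the strictly convex function is $e^x$, and an affine function is included to show the equality case.

```python
import math

# Hypothetical discrete distribution for X (illustration only).
values = [-1.0, 0.5, 2.0]
probs = [0.2, 0.5, 0.3]

def expectation(f):
    """Return E[f(X)] for the discrete distribution above."""
    return sum(p * f(x) for x, p in zip(values, probs))

def affine(x):
    """An affine function, for which Jensen's inequality is an equality."""
    return 3.0 * x - 1.0

mean_x = expectation(lambda x: x)

# Strictly convex phi(x) = exp(x): E[phi(X)] - phi(E[X]) should be > 0.
jensen_gap_exp = expectation(math.exp) - math.exp(mean_x)

# Affine phi: the gap should vanish up to floating-point rounding.
jensen_gap_affine = expectation(affine) - affine(mean_x)

print("gap for exp:   ", jensen_gap_exp)
print("gap for affine:", jensen_gap_affine)
```

The gap for $e^x$ is strictly positive because this X is not constant, while the affine gap is zero up to rounding.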

Corollary 1: Arithmetic-geometric Mean Inequality

If $a_1, \cdots, a_n \geq 0$, $p_1, \cdots, p_n \geq 0$ and $p_1 + \cdots + p_n = 1$, then

$$\sum_{i=1}^n p_i a_i \geq \prod_{i=1}^n a_i^{p_i}$$

A special case of the arithmetic-geometric mean inequality is Young's inequality:

$$x,y \geq 0, p,q>0, \frac{1}{p} + \frac{1}{q} = 1 \Rightarrow xy \leq \frac{x^p}{p} + \frac{y^q}{q}$$
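As a short worked step, Young's inequality can be read off from the arithmetic-geometric mean inequality by taking $n = 2$, weights $p_1 = \frac{1}{p}$, $p_2 = \frac{1}{q}$, and terms $a_1 = x^p$, $a_2 = y^q$:

$$\frac{x^p}{p} + \frac{y^q}{q} = p_1 a_1 + p_2 a_2 \geq a_1^{p_1} a_2^{p_2} = (x^p)^{\frac{1}{p}} (y^q)^{\frac{1}{q}} = xy$$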

Corollary 2: Likelihood Inequality

If $X \sim p(x)$ and $q(x)$ is another density function, then

$$\mathbb{E} \log p(X) \geq \mathbb{E} \log q(X)$$

Equality holds iff $p(x) = q(x)$ almost everywhere.
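The likelihood inequality can be checked numerically in the discrete case, where the densities become probability mass functions. The sketch below uses two made-up mass functions `p` and `q` on three points; the gap it computes is the Kullback-Leibler divergence of `q` from `p`, which should be nonnegative and zero when the two coincide.

```python
import math

# Hypothetical probability mass functions on the same three points.
p = [0.5, 0.3, 0.2]   # the true distribution of X
q = [0.2, 0.2, 0.6]   # a competing distribution

def expected_log(density, weights):
    """Return E[log density(X)] when X is distributed according to weights."""
    return sum(w * math.log(d) for w, d in zip(weights, density))

# E log p(X) - E log q(X), with the expectation taken under p.
gap_pq = expected_log(p, p) - expected_log(q, p)

# Replacing q by p itself gives the equality case.
gap_pp = expected_log(p, p) - expected_log(p, p)

print("E log p(X) - E log q(X):", gap_pq)
print("same with q = p:        ", gap_pp)
```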

Covariance Inequality

For all $X, g(\cdot), h(\cdot)$ s.t. $\mathbb{E} g(X), \mathbb{E} h(X), \mathbb{E} g(X)h(X)$ exist:

  1. If $g(\cdot)$ is nondecreasing and $h(\cdot)$ is nonincreasing, then $\text{Cov}[g(X), h(X)] \leq 0$
  2. If $g(\cdot)$ and $h(\cdot)$ are both nondecreasing or both nonincreasing, then $\text{Cov}[g(X), h(X)] \geq 0$
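The two cases can be illustrated with an exact computation on a small discrete distribution. Everything in the sketch below (the distribution, the functions `g`, `h_down`, `h_up`, and the helper names) is chosen for illustration only.

```python
import math

# Hypothetical discrete distribution for X on nonnegative points.
values = [0.0, 1.0, 2.0, 3.0]
probs = [0.1, 0.4, 0.3, 0.2]

def expectation(f):
    """Return E[f(X)] for the discrete distribution above."""
    return sum(p * f(x) for x, p in zip(values, probs))

def covariance(g, h):
    """Return Cov[g(X), h(X)] = E[g(X)h(X)] - E[g(X)] E[h(X)]."""
    return expectation(lambda x: g(x) * h(x)) - expectation(g) * expectation(h)

def g(x):
    return x ** 3          # nondecreasing

def h_down(x):
    return math.exp(-x)    # nonincreasing

def h_up(x):
    return math.sqrt(x)    # nondecreasing on [0, inf)

cov_opposite = covariance(g, h_down)   # case 1: expected to be <= 0
cov_same = covariance(g, h_up)         # case 2: expected to be >= 0

print("opposite monotonicity:", cov_opposite)
print("same monotonicity:    ", cov_same)
```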

Holder's Inequality

If $p,q>0, \frac{1}{p} + \frac{1}{q} = 1$, then for all r.v. X and Y, given the expectations exist,

$$\lvert \mathbb{E} XY \rvert \leq \left( \mathbb{E} \lvert X \rvert^p \right)^{\frac{1}{p}} \left( \mathbb{E} \lvert Y \rvert^q \right)^{\frac{1}{q}}$$

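Derivation: By Young's inequality, $\frac{ \lvert X \rvert }{ \left( \mathbb{E} \lvert X \rvert^p \right)^{\frac{1}{p}} } \frac{ \lvert Y \rvert }{ \left( \mathbb{E} \lvert Y \rvert^q \right)^{\frac{1}{q}} } \leq \frac{ \lvert X \rvert^p }{ p \mathbb{E} \lvert X \rvert^p } + \frac{ \lvert Y \rvert^q }{ q \mathbb{E} \lvert Y \rvert^q }$. Taking expectations on both sides, the right-hand side becomes $\frac{1}{p} + \frac{1}{q} = 1$, so $\mathbb{E} \lvert XY \rvert \leq \left( \mathbb{E} \lvert X \rvert^p \right)^{\frac{1}{p}} \left( \mathbb{E} \lvert Y \rvert^q \right)^{\frac{1}{q}}$. Holder's inequality then follows from $\lvert \mathbb{E} XY \rvert \leq \mathbb{E} \lvert XY \rvert$.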

Corollary: Cauchy-Schwarz Inequality

Setting $p = q = 2$ in Holder's inequality, for all r.v. X and Y, given the expectations exist,

$$\lvert \mathbb{E} XY \rvert \leq \left( \mathbb{E} \lvert X \rvert^2 \right)^{\frac{1}{2}} \left( \mathbb{E} \lvert Y \rvert^2 \right)^{\frac{1}{2}}$$
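A small numerical check of Holder's inequality, with Cauchy-Schwarz as the case $p = 2$, is sketched below. The joint distribution `joint` and the helper names are invented for the example; the conjugate exponent is computed as $q = p/(p-1)$.

```python
# Hypothetical joint distribution of (X, Y): tuples of (x, y, probability).
joint = [
    (-1.0, 2.0, 0.25),
    (0.5, -1.0, 0.25),
    (2.0, 1.0, 0.30),
    (3.0, -0.5, 0.20),
]

def expectation(f):
    """Return E[f(X, Y)] for the joint distribution above."""
    return sum(w * f(x, y) for x, y, w in joint)

def holder_sides(p):
    """Return (|E XY|, (E|X|^p)^(1/p) * (E|Y|^q)^(1/q)) with 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    lhs = abs(expectation(lambda x, y: x * y))
    rhs = (expectation(lambda x, y: abs(x) ** p) ** (1.0 / p)
           * expectation(lambda x, y: abs(y) ** q) ** (1.0 / q))
    return lhs, rhs

# p = 2.0 is the Cauchy-Schwarz case.
results = [(p, *holder_sides(p)) for p in (1.5, 2.0, 3.0)]
for p, lhs, rhs in results:
    print(f"p = {p}: |E XY| = {lhs:.4f} <= {rhs:.4f}")
```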

Chebyshev's Inequality

For any nonnegative r.v. X and $r>0$ (this basic form is also known as Markov's inequality),

$$P(X \geq r) \leq \frac{\mathbb{E} X}{r}$$


  1. More generally, for every positive, increasing function $g(\cdot)$, $P(X \geq r) \leq \frac{\mathbb{E} g(X)}{g(r)}$.
  2. In particular, for any r.v. X and $p>0$, $P( \lvert X \rvert \geq r) \leq \frac{\mathbb{E} \lvert X \rvert^p }{r^p}$.
  3. A commonly cited version: for any r.v. X with mean $\mu$ and variance $\sigma^2$, $P(|X-\mu| \ge r) \leq \frac{\sigma^2}{r^2}$. Taking $r = k\sigma$ shows that no more than $1/k^2$ of a distribution's values can be more than k standard deviations away from the mean.
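To see how loose or tight these bounds can be, the sketch below compares exact tail probabilities with the bounds for a fair six-sided die, using exact rational arithmetic. The die, the thresholds `r` and `dev`, and the helper `tail` are assumptions made for this illustration.

```python
from fractions import Fraction

# X is the outcome of a fair six-sided die (illustrative choice).
faces = range(1, 7)
prob = Fraction(1, 6)

mean = sum(prob * x for x in faces)                # 7/2
var = sum(prob * (x - mean) ** 2 for x in faces)   # 35/12

def tail(event):
    """Return the exact probability that the die outcome satisfies event."""
    return sum(prob for x in faces if event(x))

# Basic bound: P(X >= r) <= E[X] / r.
r = 5
markov_lhs = tail(lambda x: x >= r)                # 1/3
markov_rhs = mean / r                              # 7/10

# Variance version: P(|X - mu| >= dev) <= sigma^2 / dev^2.
dev = 2
cheb_lhs = tail(lambda x: abs(x - mean) >= dev)    # 1/3 (faces 1 and 6)
cheb_rhs = var / dev ** 2                          # 35/48

print(markov_lhs, "<=", markov_rhs)
print(cheb_lhs, "<=", cheb_rhs)
```

Both bounds hold but are far from tight here, which is typical: they use only the mean or the variance of the distribution.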

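Bernstein inequality (exponential version for bounded r.v.'s): Suppose that $X_1, \cdots, X_n$ are i.i.d. with mean $\mu$ and variance $\sigma^2$, that $\bar{X}$ is their sample mean, and that $|X_i - \mu| \le M$ almost surely. Then for all $\epsilon > 0$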

$$P(|\bar{X} - \mu| \ge \epsilon ) \le 2\exp \left( -\frac{n\epsilon^2/2}{\sigma^2 + M\epsilon/3} \right)$$
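The sketch below evaluates the Bernstein bound for the sample mean of $n = 100$ fair coin flips and compares it with the exact binomial tail and with the variance-based Chebyshev bound $\sigma^2 / (n\epsilon^2)$. The coin-flip setting, the function name `bernstein_bound`, and the choice $M = 1$ (which bounds both $|X_i|$ and $|X_i - \mu|$ for a 0/1 variable) are assumptions made for this example.

```python
import math

def bernstein_bound(n, eps, sigma2, M):
    """Return 2 * exp(-(n * eps^2 / 2) / (sigma^2 + M * eps / 3))."""
    return 2.0 * math.exp(-(n * eps ** 2 / 2.0) / (sigma2 + M * eps / 3.0))

n = 100          # number of fair coin flips X_1, ..., X_n in {0, 1}
k = 20           # deviation measured in counts, so eps = k / n
eps = k / n
sigma2 = 0.25    # variance of a single fair coin flip (mean is 1/2)
M = 1.0          # bounds both |X_i| and |X_i - 1/2|

# Exact P(|X_bar - 1/2| >= eps): the event |S - n/2| >= k for S ~ Binomial(n, 1/2),
# written with integers as |2S - n| >= 2k to avoid floating-point comparisons.
exact_tail = sum(
    math.comb(n, s) for s in range(n + 1) if abs(2 * s - n) >= 2 * k
) / 2 ** n

bernstein = bernstein_bound(n, eps, sigma2, M)
chebyshev = sigma2 / (n * eps ** 2)

print("exact tail:     ", exact_tail)
print("Bernstein bound:", bernstein)
print("Chebyshev bound:", chebyshev)
```

For this deviation the exponential bound is much smaller than the Chebyshev bound, although for small $\epsilon$ or small $n$ the ordering of the two bounds can reverse.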

Concentration inequalities:

  • Markov's inequality;
  • Chebyshev's inequality;
  • Chernoff bounds;
  • Bounds on sums of independent variables: Hoeffding, Azuma, McDiarmid, Bennett, and Bernstein inequalities;
  • Efron–Stein inequality;
  • Dvoretzky–Kiefer–Wolfowitz inequality.


Some properties of Gamma, Chi-squared, Poisson and negative binomial distribution.

🏷 Category=Probability