Idea: the sample mean converges to the population mean as the sample size goes to infinity.
Version 1: For a sequence of pairwise uncorrelated random variables with a common expectation, if the variances do not grow too fast, then their average converges in probability to that expectation. Symbolically, if \( \mathrm{Cov}(X_i, X_j) = 0 \) for all \( i \neq j \), \( \mathbb{E} X_i = \mu \) for all \( i \), and \( \sup_{i \le n} \mathrm{Var}\, X_i = o(n) \), then \[ \frac{1}{n} \sum_{i=1}^{n} X_i \overset{p}{\to} \mu \]
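A quick check via Chebyshev's inequality: by uncorrelatedness, \[ \mathrm{Var}\!\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}\, X_i \le \frac{\sup_{i \le n} \mathrm{Var}\, X_i}{n} \to 0, \] so \( P\!\left( \left| \frac{1}{n} \sum_i X_i - \mu \right| > \varepsilon \right) \le \mathrm{Var}\!\left( \frac{1}{n} \sum_i X_i \right) / \varepsilon^2 \to 0 \) for every \( \varepsilon > 0 \).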
Version 2: For a random sample from a population with finite population mean, the sample mean converges in probability to the population mean. Symbolically, if \( \{ X_i \} \) are i.i.d. copies of \( X \) with \( \mathbb{E}|X| < \infty \), then \[ \frac{1}{n} \sum_{i=1}^{n} X_i \overset{p}{\to} \mathbb{E}X \]
Version 3: For a random sample from a population with finite population mean, the sample mean converges almost surely to the population mean. Symbolically, if \( \{ X_i \} \) are i.i.d. copies of \( X \) with \( \mathbb{E}|X| < \infty \), then \[ \frac{1}{n} \sum_{i=1}^{n} X_i \overset{a.s.}{\to} \mathbb{E}X \]
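A minimal simulation sketch of the law of large numbers; the exponential population, sample sizes, and seed are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 2.0  # population mean of Exponential(scale=2)

# Track the running sample mean along one long sample path.
x = rng.exponential(scale=mu, size=100_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: sample mean = {running_mean[n - 1]:.4f} (population mean = {mu})")
```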
Idea 1: the distribution of the centered, \( \sqrt{n} \)-scaled sample mean is asymptotically Gaussian, with variance equal to the population variance.
Idea 2: In many cases, the sum of (centered) independent random variables is approximately Gaussian in distribution. A sufficient condition is the Lindeberg condition stated below; under Feller's condition (no single variance dominates \( s_n^2 \)), it is also necessary.
Because of the CLT, Gaussian random variables are often used to approximate a finite sum of random variables. With current computing capacity, however, the importance of approximations like the CLT is somewhat lessened, since the distribution of a sum can often be simulated directly.
\( \{ X_i \} \) is a random sample from a population with finite population mean \( \mathbb{E}X \) and variance \( \mathrm{Var}\,X \); then \[ \sqrt{n} \left( \frac{1}{n} \sum_{i=1}^{n} X_i - \mathbb{E}X \right) \Rightarrow N(0, \mathrm{Var}\,X ) \]
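A simulation sketch of the classical CLT, using an exponential population purely for illustration: the replications of \( \sqrt{n}(\bar{X}_n - \mathbb{E}X) \) should be distributed close to \( N(0, \mathrm{Var}\,X) \).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, n_rep = 200, 50_000
scale = 2.0                      # Exponential(scale=2): mean = 2, variance = 4
mu, var = scale, scale**2

# n_rep replications of sqrt(n) * (sample mean - population mean)
samples = rng.exponential(scale=scale, size=(n_rep, n))
z = np.sqrt(n) * (samples.mean(axis=1) - mu)

# Compare a few quantiles with those of N(0, var)
for q in (0.05, 0.25, 0.5, 0.75, 0.95):
    print(f"q={q:.2f}: simulated {np.quantile(z, q):+.3f}  "
          f"normal {stats.norm.ppf(q, scale=np.sqrt(var)):+.3f}")
```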
\( \{ X_i \} \) is a sequence of independent random variables in \( L^2 (\Omega, \Sigma, P) \), with \( \mu_i = \mathbb{E}X_i \), \( \sigma_i^2 = \mathrm{Var}\,X_i \), and \( s_n^2 = \sum_{i=1}^{n} \sigma_i^2 \). If the Lindeberg condition holds for every \( \varepsilon > 0 \): \[ \lim_{n \to \infty} \frac{1}{s_n^2}\sum_{i = 1}^{n} \mathbb{E}\big[(X_i - \mu_i)^2 \cdot \mathbf{1}_{\{ | X_i - \mu_i | > \varepsilon s_n \}} \big] = 0, \] then \[ \frac{1}{s_n} \sum_{i=1}^{n} ( X_i - \mu_i ) \Rightarrow N(0, 1) \]
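A small Monte Carlo sketch of the Lindeberg ratio for a concrete independent but non-identically-distributed sequence; the uniform distributions, \( \varepsilon \), and sample sizes are arbitrary illustrative choices. Because the summands here are uniformly bounded while \( s_n \to \infty \), the indicator eventually vanishes and the ratio drops to exactly zero.

```python
import numpy as np

rng = np.random.default_rng(2)

def lindeberg_ratio(n, eps=0.5, n_mc=50_000):
    """Monte Carlo estimate of (1/s_n^2) * sum_i E[(X_i - mu_i)^2 * 1{|X_i - mu_i| > eps*s_n}]
    for X_i ~ Uniform(-a_i, a_i) with a_i = 1 + (i mod 3), so mu_i = 0 and Var X_i = a_i^2 / 3."""
    a = 1 + (np.arange(1, n + 1) % 3)        # bounded widths cycling through 2, 3, 1, ...
    s_n = np.sqrt(np.sum(a**2 / 3))          # s_n^2 = sum of the variances
    total = 0.0
    for a_i in a:
        x = rng.uniform(-a_i, a_i, size=n_mc)
        total += np.mean(x**2 * (np.abs(x) > eps * s_n))
    return total / s_n**2

for n in (5, 20, 100, 500):
    print(f"n = {n:>3}: estimated Lindeberg ratio = {lindeberg_ratio(n):.4f}")
```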
\( \{ X_i \} \) is a random sample from a population with finite \( L^3 \)-norm; then \[ \sup_{z\in\mathbb{R}} \lvert F_{Z_n}(z) - F_{Z}(z) \rvert \leq \frac{c}{\sqrt{n}} \frac{ \lVert X-\mu \rVert_3^3 }{\sigma^3}, \] where \( Z_n = \sqrt{n}\, \frac{\bar{X}_n-\mu}{\sigma} \), \( Z \sim N(0,1) \), and \( c \) is an absolute constant with \( c \in [\tfrac{1}{\sqrt{2\pi}}, 0.8) \).
The Berry–Esseen theorem quantifies the accuracy of the normal approximation in the CLT.
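A sketch that estimates the left-hand side by simulation (the one-sample Kolmogorov–Smirnov statistic of many simulated copies of \( Z_n \) against \( N(0,1) \)) and compares it with the bound; the exponential population, \( c = 0.8 \), and the Monte Carlo estimate of \( \mathbb{E}|X-\mu|^3 \) are illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, n_rep = 50, 100_000
mu, sigma = 1.0, 1.0                         # Exponential(1) population

# Monte Carlo estimate of the third absolute central moment E|X - mu|^3
rho = np.mean(np.abs(rng.exponential(1.0, size=1_000_000) - mu) ** 3)

# Simulate n_rep copies of the standardized sample mean Z_n
x = rng.exponential(1.0, size=(n_rep, n))
z_n = np.sqrt(n) * (x.mean(axis=1) - mu) / sigma

# Kolmogorov distance between the empirical law of Z_n and N(0, 1),
# approximated by the one-sample Kolmogorov-Smirnov statistic
ks_stat = stats.kstest(z_n, "norm").statistic

berry_esseen_bound = 0.8 * rho / (sigma**3 * np.sqrt(n))
print(f"estimated sup-distance = {ks_stat:.4f}, Berry-Esseen bound = {berry_esseen_bound:.4f}")
```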
Idea: A mapping that is continuous on the support of the stochastic limit preserves convergence of random variables and of distributions.
Thm: \( h(\cdot) \) is a continuous function on the support of \( X \). If \( X_i \overset{p}{\to} X \), then \( h(X_i) \overset{p}{\to} h(X) \)
The same theorem also holds for almost sure convergence and for convergence in distribution; for convergence in \( L^2 \), one additionally needs \( h \) to be bounded (or some uniform-integrability condition).
Slutsky's Theorem is an important corollary of CMT:
If \( X_i \Rightarrow X \) and \( Y_i \overset{p}{\to} a \) for a constant \( a \), then \( X_i + Y_i \Rightarrow X + a \), \( Y_i X_i \Rightarrow aX \), and \( X_i / Y_i \Rightarrow X / a \) provided \( a \neq 0 \) (apply the CMT to the joint convergence \( (X_i, Y_i) \Rightarrow (X, a) \)).
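A simulation sketch of a standard Slutsky application, the studentized mean: \( \sqrt{n}(\bar{X}_n - \mu)/S_n \Rightarrow N(0,1) \) because \( \sqrt{n}(\bar{X}_n - \mu) \Rightarrow N(0,\sigma^2) \) and \( S_n \overset{p}{\to} \sigma \). The gamma population and sample sizes are arbitrary illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, n_rep = 500, 20_000
shape, scale = 2.0, 1.5                      # Gamma(2, 1.5): mean = 3, variance = 4.5
mu = shape * scale

x = rng.gamma(shape, scale, size=(n_rep, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)                    # sample standard deviation S_n ->p sigma

t = np.sqrt(n) * (xbar - mu) / s             # studentized mean
print("quantiles of the studentized mean vs N(0, 1):")
for q in (0.05, 0.5, 0.95):
    print(f"  q={q:.2f}: simulated {np.quantile(t, q):+.3f}  normal {stats.norm.ppf(q):+.3f}")
```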
Idea: A continuously differentiable function preserves the asymptotic distribution, transformed by its derivative at the limit point.
Thm: Given \( \mathbf{X}_n \overset{p}{\to} \mathbf{b} \) and \( a_n ( \mathbf{X}_n - \mathbf{b} ) \Rightarrow \mathbf{X} \). If \( g: \mathbb{R}^d \to \mathbb{R}^r \) is continuously differentiable at \( \mathbf{b} \), then \[ a_n [ g(\mathbf{X}_n) - g(\mathbf{b}) ] \Rightarrow \nabla g(\mathbf{b})\, \mathbf{X}, \] where \( \nabla g(\mathbf{b}) \) is the \( r \times d \) Jacobian matrix of \( g \) at \( \mathbf{b} \).
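A simulation sketch of the univariate case with \( g(x) = \log x \): if \( \sqrt{n}(\bar{X}_n - \mu) \Rightarrow N(0, \sigma^2) \), then \( \sqrt{n}(\log \bar{X}_n - \log \mu) \Rightarrow N(0, \sigma^2/\mu^2) \). The Poisson population and sample sizes here are arbitrary illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, n_rep = 1_000, 20_000
lam = 4.0                                    # Poisson(4): mean = variance = 4
mu, var = lam, lam

# Delta method with g(x) = log(x): sqrt(n) * (log(xbar) - log(mu)) => N(0, var / mu^2)
x = rng.poisson(lam, size=(n_rep, n))
xbar = x.mean(axis=1)
z = np.sqrt(n) * (np.log(xbar) - np.log(mu))

target_sd = np.sqrt(var) / mu                # |g'(mu)| * sigma = sigma / mu
print(f"simulated sd = {z.std():.4f}, delta-method sd = {target_sd:.4f}")
print(f"simulated 95% quantile = {np.quantile(z, 0.95):+.4f}, "
      f"normal = {stats.norm.ppf(0.95, scale=target_sd):+.4f}")
```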