A sequence of estimators of a model parameter, based on a random sample, is consistent if the estimators converge in probability to the parameter, pointwise in the parameter space.
Symbolically, \( W_n = W_n (X_1, \cdots, X_n) \) is consistent for \( \boldsymbol{\theta} \) if
\[ W_n \overset{p}{\to} \boldsymbol{\theta}, \quad \forall \boldsymbol{\theta} \in \boldsymbol{\Theta} \]
Thm: For a sequence of estimators of a model parameter, if its expectation converges to the parameter and its variance converges to zero, both pointwise in the parameter space, then it is consistent.
Symbolically, \( W_n \) is consistent for \( \boldsymbol{\theta} \) if
\[ \mathbb{E} W_n \to \boldsymbol{\theta}, \quad \mathrm{Var}\, W_n \to 0, \quad \forall \boldsymbol{\theta} \in \boldsymbol{\Theta} \]
Example: for i.i.d. random variables with mean \( \mu \) and variance \( \sigma^2 < \infty \), the sample mean satisfies \( \mathbb{E} \bar{X}_n = \mu \) and \( \mathrm{Var}\, \bar{X}_n = \sigma^2 / n \to 0 \), so \( \bar{X}_n \) is consistent for \( \mu \).
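The theorem above can be checked numerically. A minimal sketch (my own illustration, not part of the notes): drawing repeated samples from a normal distribution with an assumed mean of 3, the empirical probability that the sample mean lands outside a fixed band around the true mean shrinks toward zero as \( n \) grows.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 3.0  # assumed true mean for this illustration

# Empirical check of consistency: P(|X̄_n - mu| > 0.1) shrinks as n grows.
for n in (10, 100, 10_000):
    xbar = rng.normal(mu, 1.0, size=(2000, n)).mean(axis=1)
    print(n, np.mean(np.abs(xbar - mu) > 0.1))
```

The printed probabilities decrease with \( n \), as convergence in probability requires.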
Thm: Given a consistent sequence of estimators of model parameter, any sequence of affine transformations that approaches identity also induces a consistent sequence of estimators.
Symbolically, if \( W_n \) is consistent for \( \boldsymbol{\theta} \) and \( a_n \to 1, b_n \to 0 \), then \( U_n = a_n W_n + b_n \) is also consistent for \( \boldsymbol{\theta} \).
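A numerical sketch of this theorem (my own illustration; the choices \( a_n = n/(n+1) \) and \( b_n = 1/n \) are assumed, not from the notes): applying such an affine tweak to the sample mean leaves consistency intact.

```python
import numpy as np

rng = np.random.default_rng(1)
mu = 2.0  # assumed true mean for this illustration
for n in (10, 100, 10_000):
    w = rng.normal(mu, 1.0, size=(2000, n)).mean(axis=1)  # W_n: the sample mean
    u = (n / (n + 1)) * w + 1.0 / n                       # a_n -> 1, b_n -> 0
    print(n, np.mean(np.abs(u - mu) > 0.1))               # shrinks toward 0
```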
The variance of the asymptotic distribution of an estimator is called the asymptotic variance of the estimator.
Note:
A sequence of estimators is called asymptotically efficient for an induced parameter if it is asymptotically normal with asymptotic variance achieving the Cramér-Rao Lower Bound, pointwise in the parameter space.
Symbolically, an estimator \( W_n \) of induced parameter \( g(\boldsymbol{\theta}) \) is asymptotically efficient if
\[ \sqrt{n}\, [W_n - g(\boldsymbol{\theta})] \Rightarrow N \left( 0, (\nabla g)' \left[ \mathbb{E}(\nabla \log f)(\nabla \log f)' \right]^{-1} (\nabla g) \right) , \quad \forall \boldsymbol{\theta} \in \boldsymbol{\Theta} \]
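In the scalar case \( g(\theta) = \theta \), the limiting variance reduces to the reciprocal of the Fisher information. A quick numerical sketch (my own illustration; the \( N(\theta, \sigma^2) \) model with \( \sigma \) known is an assumed example): here the information per observation is \( 1/\sigma^2 \), so the limiting variance of \( \sqrt{n}(\bar{X}_n - \theta) \) should be \( \sigma^2 \), which the sample mean attains.

```python
import numpy as np

rng = np.random.default_rng(2)
theta, sigma = 0.0, 2.0
n, reps = 2_000, 2_000
# For N(theta, sigma^2) with sigma known, the score is (x - theta) / sigma^2,
# so the Fisher information per observation is 1 / sigma^2 and the CRLB for
# the limiting variance of sqrt(n) * (X̄_n - theta) is sigma^2.
xbar = rng.normal(theta, sigma, size=(reps, n)).mean(axis=1)
z = np.sqrt(n) * (xbar - theta)
print(z.var())  # close to sigma**2 = 4
```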
Estimators that cannot achieve asymptotic efficiency may still have other desirable properties, such as ease of calculation or robustness to violations of underlying assumptions. An index of relative efficiency then becomes useful to quantify the trade-off.
Given two asymptotically normal estimators of an induced parameter, the asymptotic relative efficiency (ARE) of the former with respect to the latter is the ratio of the latter's asymptotic variance to the former's.
Symbolically, given \( \sqrt{n} [V_n - g(\boldsymbol{\theta})] \Rightarrow N ( 0, \sigma^2_V ) \), and \( \sqrt{n} [W_n - g(\boldsymbol{\theta})] \Rightarrow N ( 0, \sigma^2_W ) \), the asymptotic relative efficiency of \( V_n \) with respect to \( W_n \) is
\[ \text{ARE}(V_n,W_n) = \frac{\sigma^2_W}{\sigma^2_V} \]
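A classical illustration (assumed here, not stated in the notes): for i.i.d. \( N(\mu, \sigma^2) \) data, \( \sqrt{n}(\bar{X}_n - \mu) \Rightarrow N(0, \sigma^2) \) while for the sample median \( \sqrt{n}(M_n - \mu) \Rightarrow N(0, \pi \sigma^2 / 2) \), so \( \text{ARE}(M_n, \bar{X}_n) = 2/\pi \approx 0.64 \). A simulation sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 1_001, 3_000
x = rng.normal(0.0, 1.0, size=(reps, n))
v_mean = n * x.mean(axis=1).var()       # limiting variance of the mean, ~ sigma^2 = 1
v_med = n * np.median(x, axis=1).var()  # limiting variance of the median, ~ pi/2
print(v_med / v_mean * 0 + v_mean / v_med)  # ARE(median, mean), close to 2/pi ~ 0.64
```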
The term robustness can have many interpretations, but perhaps it is best summarized by Huber (1981, Sec. 1.2):
A statistic based on a random sample has breakdown value \( b \) if the statistic stays bounded whenever at most a fraction \( b \) of the sample diverges, but diverges whenever any larger fraction of the sample diverges.
Symbolically, a statistic \( T_n \) of a sample of size \( n \) has breakdown value \( b \in [0,1] \) if
\[ T_n < \infty \text{ as } X_{ ( \{ (1-b)n \} ) } \to \infty, \quad \text{but } \forall \varepsilon > 0, \; T_n \to \infty \text{ as } X_{ ( \{ (1-b-\varepsilon)n \} ) } \to \infty \]
Here, \( \{ \cdot \} \) denotes the nearest integer function.
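As an assumed illustration (not from the notes): the sample mean has breakdown value 0, since a single diverging observation drags it to infinity, while the sample median has breakdown value \( 1/2 \). A sketch with a small contaminated sample:

```python
import numpy as np

x = np.arange(1.0, 22.0)           # 21 clean observations: 1, 2, ..., 21
for big in (1e3, 1e6, 1e9):
    y = x.copy()
    y[-10:] = big                  # let 10 of 21 points (just under half) diverge
    # The mean is dragged off with the contamination; the median stays at 11.0.
    print(y.mean(), np.median(y))
```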
Note: