Method of Moments

Let $Pf = \mathbb{E}f = \int f \, dP$ denote the expectation (integral) of a random variable $f$ under the probability measure $P$.

Empirical Distribution

The empirical measure (empirical probability density function, EPDF) is the discrete uniform measure on a random sample $X_1, \cdots, X_n$; that is, each observation of the sample is assigned the same probability $1/n$. The empirical measure is a random measure because it depends on a random sample. Symbolically, the empirical measure $P_n$ of a measurable set $A$ in the measurable space $(Ω,Σ)$ is defined as \[ P_n(A) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}_A(X_i) \]

The empirical distribution function (ECDF) of a real-valued random variable is the empirical measure indexed by a class of one-sided intervals: \[ P_n(x) = P_n (-\infty,x] = \frac{1}{n} \sum_{i=1}^n \mathbf{1}(X_i \leq x) \]
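The definition translates directly into code. The following is a minimal sketch using only the Python standard library; the function name `ecdf` and the toy sample are illustrative choices, not part of the original text.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical distribution function x -> P_n(-inf, x] of a sample."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def F_n(x):
        # bisect_right counts the observations with X_i <= x.
        return bisect_right(xs, x) / n

    return F_n


F_n = ecdf([3.0, 1.0, 2.0, 2.0])
print(F_n(0.5), F_n(2.0), F_n(3.0))  # 0.0 0.75 1.0
```

Sorting once and using binary search makes each evaluation cost $O(\log n)$ rather than the $O(n)$ of a literal sum of indicators.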

For an i.i.d. sample with population distribution function $F$, the empirical distribution function has the following properties (pointwise, for each fixed $x$):

Unbiased: $ P P_n(x) = F(x) $
Consistent: $ P_n(x) \overset{p}{\to} F(x) $
Asymptotically normal: $ \sqrt{n} ( P_n(x) - F(x) ) \Rightarrow N(0, F(x)(1-F(x)) ) $

Theorem: (Glivenko-Cantelli) The empirical distribution function converges uniformly, almost surely, to the population distribution function: \[ \| P_n - F \|_{\infty} \overset{a.s.}{\to} 0 \]
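The uniform convergence can be observed numerically. The sketch below computes $\| P_n - F \|_{\infty}$ for samples from $U(0,1)$; the choice of distribution, the sample sizes, the seed, and the helper names `sup_distance` and `uniform_cdf` are assumptions made for illustration. It relies on $F$ being continuous, so that the supremum is attained at a jump of the empirical distribution function.

```python
import random


def sup_distance(sample, cdf):
    """Compute sup_x |P_n(x) - F(x)| for a continuous distribution function F."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # At the i-th order statistic the ECDF jumps from i/n to (i+1)/n,
        # so both one-sided gaps have to be checked.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def uniform_cdf(x):
    """Distribution function of U(0, 1)."""
    return min(max(x, 0.0), 1.0)


rng = random.Random(0)
distances = {}
for n in (100, 10_000):
    sample = [rng.random() for _ in range(n)]
    distances[n] = sup_distance(sample, uniform_cdf)
    print(n, distances[n])
```

The printed distance should shrink roughly like $1/\sqrt{n}$ as $n$ grows, which is consistent with the pointwise asymptotic normality stated above.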

Method-of-Moments Estimators

For a parameter $\mathbf{h}(P \mathbf{g}(X))$ that is a function of expectations, its method-of-moments estimator (substitution estimator) [@Pearson1894] replaces the population measure with the empirical measure: $ \mathbf{h}(P_n \mathbf{g}(X)) $. If the parameter to be estimated is a function of moments, $h(PX, \cdots, PX^k)$, the method-of-moments estimator is $ h( P_n X, \cdots, P_n X^k) $.

Empirical moments are consistent and asymptotically normal: if $PX^{2k} < \infty$, the multivariate central limit theorem gives \[ \sqrt{n} [ (P_n X, \cdots, P_n X^k) - (P X, \cdots, P X^k) ] \Rightarrow N(0,Σ) \] where $ Σ_{ij} = P X^{i+j} - P X^i P X^j $ for $i, j = 1, \cdots, k$. If $h(\cdot)$ is continuously differentiable at $(PX, \cdots, PX^k)$, then by the Delta method \[ \sqrt{n} [ h( P_n X, \cdots, P_n X^k) - h(PX, \cdots, PX^k) ] \Rightarrow N(0, (∇h)'Σ(∇h)) \] where $∇h$ is evaluated at $(PX, \cdots, PX^k)$.
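In practice the asymptotic variance is itself estimated by substitution. The sketch below is one way to do this, assuming a scalar-valued $h$: it replaces $Σ$ with its empirical counterpart and evaluates $∇h$ at the empirical moments. The function names and the toy sample are hypothetical; the example uses the variance, $h(x, y) = y - x^2$, whose gradient is $(-2x, 1)$.

```python
import math


def empirical_moment(sample, q):
    """P_n X^q: the q-th empirical (raw) moment."""
    return sum(x ** q for x in sample) / len(sample)


def delta_method_se(sample, grad, k=2):
    """Plug-in standard error of h(P_n X, ..., P_n X^k) via the Delta method.

    `grad` maps the list of the first k empirical moments to the gradient of h
    at that point (a sequence of length k).
    """
    n = len(sample)
    # m[q - 1] holds P_n X^q for q = 1, ..., 2k.
    m = [empirical_moment(sample, q) for q in range(1, 2 * k + 1)]
    # Sigma_ij = P_n X^{i+j} - P_n X^i * P_n X^j for i, j = 1, ..., k.
    sigma = [[m[i + j - 1] - m[i - 1] * m[j - 1] for j in range(1, k + 1)]
             for i in range(1, k + 1)]
    g = grad(m[:k])
    # Quadratic form (grad h)' Sigma (grad h).
    avar = sum(g[i] * sigma[i][j] * g[j] for i in range(k) for j in range(k))
    return math.sqrt(avar / n)


sample = [1.0, 2.0, 3.0, 4.0]
se = delta_method_se(sample, lambda m: (-2.0 * m[0], 1.0))
print(se)  # 0.5
```

For this choice of $h$ the quadratic form reduces to the fourth central moment minus the squared variance, which gives a quick sanity check on the implementation.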

For the Gamma distribution $Γ(α, β)$ with shape $α$ and scale $β$:

The parameters $(α, β) = g(PX, PX^2) $, where $g(x,y) = ( x^2/(y-x^2), (y-x^2)/x )$.
The $q$-th order moment is $ PX^q = α (α+1) \cdots (α+q-1) β^q $.
The asymptotic covariance matrix of the empirical 1st and 2nd moments is $ Σ = \begin{pmatrix} α β^2 & 2 α(α+1)β^3 \\ 2 α(α+1)β^3 & 2 α(α+1)(2α+3)β^4 \end{pmatrix} $
The gradient of $g(x,y)$ at $(PX, PX^2)$, with one column per component of $g$, is $ ∇ g(PX, PX^2) = \begin{pmatrix} \frac{2(α+1)}{β} & -\frac{2α+1}{α} \\ -\frac{1}{β^2} & \frac{1}{αβ} \end{pmatrix} $
Then, by the Delta method, the asymptotic covariance matrix of the method-of-moments estimator $ g(P_n X, P_n X^2) $ is $ Σ' = (∇g)'Σ(∇g) = \begin{pmatrix} 2α(α+1) & -2(α+1)β \\ -2(α+1)β & \frac{(2α+3)β^2}{α} \end{pmatrix} $
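The Gamma example can be sketched in a few lines of standard-library Python. `random.gammavariate(alpha, beta)` takes shape and scale, so that the mean is $αβ$ as in the formulas above; the function names, the true parameter values, the sample size, and the seed are illustrative assumptions.

```python
import math
import random


def gamma_mom(sample):
    """Method-of-moments estimate (alpha_hat, beta_hat) = g(P_n X, P_n X^2)."""
    n = len(sample)
    m1 = sum(sample) / n
    m2 = sum(x * x for x in sample) / n
    v = m2 - m1 * m1  # empirical variance, y - x^2
    return m1 * m1 / v, v / m1


def gamma_mom_avar(alpha, beta):
    """Asymptotic covariance matrix of sqrt(n) * (estimate - truth)."""
    return [[2 * alpha * (alpha + 1), -2 * (alpha + 1) * beta],
            [-2 * (alpha + 1) * beta, (2 * alpha + 3) * beta ** 2 / alpha]]


rng = random.Random(0)
alpha, beta, n = 3.0, 2.0, 20_000
sample = [rng.gammavariate(alpha, beta) for _ in range(n)]
alpha_hat, beta_hat = gamma_mom(sample)

# Plug-in standard errors from the diagonal of the asymptotic covariance.
avar = gamma_mom_avar(alpha_hat, beta_hat)
se_alpha = math.sqrt(avar[0][0] / n)
se_beta = math.sqrt(avar[1][1] / n)
print(alpha_hat, se_alpha)
print(beta_hat, se_beta)
```

With a sample of this size the estimates should land within a few standard errors of the true values $(3, 2)$.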

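For the uniform distribution $U(0, θ)$, because $θ = 2 PX$, the method-of-moments estimator is $2 P_n X$. This estimator is unbiased but not very efficient: its variance is $θ^2/(3n)$, of order $1/n$, whereas the estimator $\frac{n+1}{n} \max_i X_i$, based on the sample maximum, is also unbiased and has variance $θ^2/(n(n+2))$, of order $1/n^2$.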
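The inefficiency for $U(0, θ)$ shows up clearly in a small simulation. The sketch below compares the mean squared error of $2 P_n X$ with that of a bias-corrected sample maximum, $\frac{n+1}{n} \max_i X_i$; the comparison estimator, the function name, and all simulation settings are assumptions chosen for illustration.

```python
import random


def compare_uniform_estimators(theta=1.0, n=50, reps=2000, seed=0):
    """Monte Carlo mean squared error of two estimators of theta in U(0, theta).

    Returns (mse_mom, mse_max) for the method-of-moments estimator 2 * mean
    and the bias-corrected maximum (n + 1) / n * max.
    """
    rng = random.Random(seed)
    sq_err_mom = sq_err_max = 0.0
    for _ in range(reps):
        sample = [rng.uniform(0.0, theta) for _ in range(n)]
        mom = 2.0 * sum(sample) / n
        corrected_max = (n + 1) / n * max(sample)
        sq_err_mom += (mom - theta) ** 2
        sq_err_max += (corrected_max - theta) ** 2
    return sq_err_mom / reps, sq_err_max / reps


mse_mom, mse_max = compare_uniform_estimators()
print(mse_mom, mse_max)
```

For $θ = 1$ and $n = 50$ the theoretical variances are $1/150$ and $1/2600$, so the first number should come out more than ten times larger than the second.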

🏷 Category=Statistics