Asymptotic analysis applies when the sample size is large, and its results are limited to statistics that are analytically tractable. {Efron1979} introduced the bootstrap as an empirical alternative: it estimates the sampling distribution of a statistic by resampling the original sample. Reference books include {Efron and Tibshirani 1993} and {Davison and Hinkley 1997}.

Sample size, \(N\); Number of bootstrap replications, \(B\); Coefficient of interest, \(\beta_1\); Null hypothesis, \(H_0: \beta_1 = \beta_1^0 \); Estimator, \(\hat{\beta}\); Bootstrap estimator in replication \(b\), \( \hat{\beta}_{1b}^* \), which may be different from the standard estimator; Restricted estimator, \(\hat{\beta}^R\); Residuals, \(\mathbf{u}\); Estimator of standard error, \(s\); Wald test statistic, \(w = (\hat{\beta}_1 - \beta_1^0) / s_{\hat{\beta}_1}\); Significance level, \(\alpha\).

Implementation

A bootstrap is implemented by choosing a resampling method and an inference procedure.

Bootstrap resampling methods

  1. Pairs bootstrap (case bootstrap, non-parametric bootstrap): observations are resampled whole, so the regressors and the regressand of an observation always stay paired;
  2. Residual bootstrap: the regressand is constructed from fitted values plus sample residuals resampled with replacement;
  3. Wild bootstrap: the regressand is constructed from fitted values plus each observation's own sample residual, with its sign flipped with probability 1/2 (Rademacher weights);

The residual bootstrap assumes i.i.d. errors; the wild bootstrap remains applicable to heteroskedastic models.
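The three resampling methods can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the helper names (`ols`, `pairs_resample`, `residual_resample`, `wild_resample`) and the simulated data are hypothetical, and the residuals come from the unrestricted OLS fit.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def pairs_resample(X, y, rng):
    # Draw N row indices with replacement; each y stays with its own row of X.
    idx = rng.integers(0, len(y), size=len(y))
    return X[idx], y[idx]

def residual_resample(X, y, rng):
    # Keep X fixed; rebuild y from fitted values plus residuals drawn with replacement.
    beta, u = ols(X, y)
    u_star = rng.choice(u, size=len(u), replace=True)
    return X, X @ beta + u_star

def wild_resample(X, y, rng):
    # Keep X fixed; each residual stays with its observation, sign flipped w.p. 1/2.
    beta, u = ols(X, y)
    signs = rng.choice([-1.0, 1.0], size=len(u))
    return X, X @ beta + signs * u

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
N = 50
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=N)

Xp, yp = pairs_resample(X, y, rng)
Xr, yr = residual_resample(X, y, rng)
Xw, yw = wild_resample(X, y, rng)
```

Only the pairs bootstrap changes the regressor matrix; the other two hold \(X\) fixed and perturb the regressand through the residuals.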

Residual and wild bootstraps can impose the null hypothesis in resampling {Davidson and MacKinnon 1999}: the bootstrap Wald statistics are then centered on \(\beta_1^0\), and the residuals bootstrapped are those from the restricted OLS estimator that imposes \(H_0\).
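As a sketch of how the null can be imposed in a wild bootstrap: fit the restricted model by moving the constrained term \(\beta_1^0 x_1\) to the left-hand side, then rebuild the regressand from the restricted fit and sign-flipped restricted residuals. The function name `wild_null_sample` and the convention that column 1 of `X` holds the regressor of interest are assumptions made for this example.

```python
import numpy as np

def wild_null_sample(X, y, beta1_null, rng):
    """One wild-bootstrap sample with H0: beta_1 = beta1_null imposed.

    Assumes (for illustration) that column 1 of X is the regressor of interest.
    Returns the bootstrap regressand, the restricted fit, and restricted residuals.
    """
    x1 = X[:, 1]
    X_rest = np.delete(X, 1, axis=1)
    # Restricted OLS: regress y - beta1_null * x1 on the remaining regressors.
    gamma, *_ = np.linalg.lstsq(X_rest, y - beta1_null * x1, rcond=None)
    fitted_r = X_rest @ gamma + beta1_null * x1
    u_r = y - fitted_r
    # Rademacher weights applied to the restricted residuals.
    signs = rng.choice([-1.0, 1.0], size=len(y))
    return fitted_r + signs * u_r, fitted_r, u_r

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
N = 40
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=N)

y_star, fitted_r, u_r = wild_null_sample(X, y, beta1_null=0.0, rng=rng)
```

Each bootstrap Wald statistic would then be computed on `(X, y_star)` as \( (\hat{\beta}_{1b}^* - \beta_1^0) / s_{\hat{\beta}_{1b}^*} \).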

Bootstrap inference procedures

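  1. Bootstrap-t (percentile-t) {Efron1981}: compute the Wald statistic on the sample and on each resample using OLS estimates of the standard error, and reject \(H_0\) using critical values from the bootstrap distribution of the statistic;
  2. Bootstrap-se (standard error): use the bootstrap estimate of the standard error, \(\hat{\sigma}_{\hat{\beta}_1} = s_{\hat{\beta}_{1B}^* }\), the standard deviation of the \(B\) bootstrap estimates \(\hat{\beta}_{1b}^*\), in the Wald statistic, and reject \(H_0\) using critical values from the normal distribution;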

Asymptotic refinement refers to a convergence rate faster than that obtained from first-order asymptotic theory. To have asymptotic refinement, a bootstrap needs to be applied to an asymptotically pivotal statistic, that is, a statistic whose asymptotic distribution does not contain unknown parameters. Bootstrap-t procedures provide asymptotic refinement, while bootstrap-se procedures do not.
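The two inference procedures can be contrasted in a short sketch. The assumptions here are illustrative choices, not part of the definitions above: a pairs bootstrap without the null imposed (so bootstrap Wald statistics are centered on \(\hat{\beta}_1\)), classical homoskedastic OLS standard errors, a symmetric two-sided test, and the hypothetical names `ols_fit` and `bootstrap_tests`.

```python
from statistics import NormalDist
import numpy as np

def ols_fit(X, y):
    """OLS coefficients and classical (homoskedastic) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    sigma2 = u @ u / (len(y) - X.shape[1])
    return beta, np.sqrt(sigma2 * np.diag(XtX_inv))

def bootstrap_tests(X, y, beta1_null, B=999, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    beta, se = ols_fit(X, y)
    w = (beta[1] - beta1_null) / se[1]          # sample Wald statistic

    beta_star = np.empty(B)
    w_star = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, len(y), size=len(y))   # pairs resample
        bb, sb = ols_fit(X[idx], y[idx])
        beta_star[b] = bb[1]
        # Null not imposed, so the bootstrap statistic is centered on beta-hat.
        w_star[b] = (bb[1] - beta[1]) / sb[1]

    # Bootstrap-t: critical value from the bootstrap distribution of |w*|.
    crit_t = np.quantile(np.abs(w_star), 1 - alpha)
    reject_t = bool(abs(w) > crit_t)

    # Bootstrap-se: bootstrap standard error, normal critical value.
    se_boot = beta_star.std(ddof=1)
    crit_z = NormalDist().inv_cdf(1 - alpha / 2)
    reject_se = bool(abs((beta[1] - beta1_null) / se_boot) > crit_z)

    return {"w": w, "crit_t": crit_t, "reject_t": reject_t,
            "se_boot": se_boot, "reject_se": reject_se}

# Simulated data with a strong true effect, for illustration only.
rng = np.random.default_rng(1)
N = 100
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=N)
result = bootstrap_tests(X, y, beta1_null=0.0)
```

The bootstrap-t branch resamples the studentized statistic, which is asymptotically pivotal; the bootstrap-se branch resamples only the coefficient and then relies on the normal approximation.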

Clustered Data

A sample may contain clusters of observational units such that the regression errors are independent across clusters but correlated within them. Standard statistical inference assumes errors that are independent across observations; with such correlation structures, the effective sample size is reduced toward the number of clusters. See {Cameron2015} for a good review of inference with clustered data.

Number of clusters in sample, \(G\); Number of observations in cluster \(g\), \(N_g\); Subsample of cluster \(g\), \( (y_g, X_g) \); Covariance matrix of regression errors within cluster \(g\), \(\Sigma_g\); Individual \(i\) in cluster \(g\) has subscript \(ig\).

Covariance matrix of the OLS estimator on clustered data is: \[ \text{Var}(\hat{\boldsymbol{\beta}} \mid \textbf{X}) = (X' X)^{-1} \left( \sum_{g = 1}^G X_g' \Sigma_g X_g \right) (X' X)^{-1}\]

The cluster-robust variance estimator (CRVE), \( \widehat{\text{Var}}_\text{CR}(\hat{\beta} ) \), replaces \(\Sigma_g\) with the sample estimate \( \tilde{u}_g \tilde{u}_g' \). Here \(\tilde{u}_g\) is the vector of corrected residuals for cluster \(g\), and the standard CRVE simply uses the OLS residuals \(u_g\). A common finite-sample correction scales the residuals by a constant, \( \tilde{u}_g = u_g \sqrt{G / (G-1)} \). {Bell and McCaffrey 2002} proposed the correction \( \tilde{u}_g = \sqrt{(G-1)/G} \, (I_{N_g} - H_{gg})^{-1} u_g \), with \( H_{gg} = X_g (X' X)^{-1} X_g' \), which generalizes the HC3 measure in {MacKinnon and White 1985} and is equivalent to the jackknife estimator of \(\text{Var}(\hat{\boldsymbol{\beta}} \mid \textbf{X})\). {Cameron2008} referred to this correction as CR3.
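A minimal sketch of the standard CRVE, using uncorrected OLS residuals in the sandwich formula, might look as follows. The function name `crve` and the simulated data (ten balanced clusters with a cluster-level random effect) are assumptions for illustration.

```python
import numpy as np

def crve(X, y, cluster_ids):
    """OLS coefficients and the standard cluster-robust variance estimate."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    K = X.shape[1]
    meat = np.zeros((K, K))
    for g in np.unique(cluster_ids):
        mask = cluster_ids == g
        score_g = X[mask].T @ u[mask]          # X_g' u_g, a K-vector
        meat += np.outer(score_g, score_g)     # X_g' u_g u_g' X_g
    return beta, XtX_inv @ meat @ XtX_inv

# Simulated clustered data, for illustration only.
rng = np.random.default_rng(0)
G, N_g = 10, 5
cluster_ids = np.repeat(np.arange(G), N_g)
N = G * N_g
X = np.column_stack([np.ones(N), rng.normal(size=N)])
cluster_effect = rng.normal(size=G)[cluster_ids]   # shared within each cluster
y = X @ np.array([1.0, 2.0]) + cluster_effect + rng.normal(size=N)

beta_hat, V_cr = crve(X, y, cluster_ids)
se_cr = np.sqrt(np.diag(V_cr))
```

When every observation is its own cluster, this expression reduces to the heteroskedasticity-robust (White) estimator without finite-sample correction, which gives a convenient sanity check.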

Resampling methods:

  1. Pairs cluster bootstrap
  2. Residual cluster bootstrap
  3. Wild cluster bootstrap

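The residual cluster bootstrap requires balanced clusters, because whole-cluster residual vectors are reassigned across clusters and must therefore all have the same length.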
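The cluster versions resample at the cluster level rather than the observation level. A sketch of the pairs cluster and wild cluster resampling steps, assuming unrestricted OLS residuals and Rademacher weights drawn once per cluster; the function names and simulated data are hypothetical.

```python
import numpy as np

def pairs_cluster_resample(X, y, cluster_ids, rng):
    # Draw G clusters with replacement and stack their observations whole.
    groups = np.unique(cluster_ids)
    drawn = rng.choice(groups, size=len(groups), replace=True)
    idx = np.concatenate([np.flatnonzero(cluster_ids == g) for g in drawn])
    return X[idx], y[idx]

def wild_cluster_resample(X, y, cluster_ids, rng):
    # Keep X fixed; flip the sign of all residuals in a cluster together.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta
    groups, inverse = np.unique(cluster_ids, return_inverse=True)
    signs = rng.choice([-1.0, 1.0], size=len(groups))   # one weight per cluster
    return X, X @ beta + signs[inverse] * u

# Simulated balanced clustered data, for illustration only.
rng = np.random.default_rng(0)
G, N_g = 8, 6
cluster_ids = np.repeat(np.arange(G), N_g)
N = G * N_g
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = (X @ np.array([1.0, 2.0])
     + rng.normal(size=G)[cluster_ids]      # cluster-level shock
     + rng.normal(size=N))

Xp, yp = pairs_cluster_resample(X, y, cluster_ids, rng)
Xw, yw = wild_cluster_resample(X, y, cluster_ids, rng)
```

Flipping signs per cluster preserves the within-cluster correlation pattern of the residuals, and the wild cluster version places no restriction on cluster sizes. In the pairs cluster version, a cluster drawn more than once would be treated as separate clusters when re-estimating.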