Asymptotic analysis applies when the sample size is large, and its results are limited to statistics that are analytically tractable. {Efron1979} introduced the bootstrap as an empirical alternative to asymptotic analysis: it estimates the sampling distribution of a statistic by resampling the original sample with replacement. Reference books include {Efron and Tibshirani, 1993} and {Davison and Hinkley, 1997}.
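
As an illustration of resampling with replacement, the following is a minimal sketch that approximates the sampling distribution of a statistic and reads a standard error off it. The function name `bootstrap_distribution`, the choice of the median as the statistic, and the simulated exponential data are assumptions made for this example only.

```python
import numpy as np

def bootstrap_distribution(sample, statistic, B=999, seed=0):
    """Return B bootstrap replicates of `statistic` (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = sample.shape[0]
    reps = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=n)   # N indices drawn with replacement
        reps[b] = statistic(sample[idx])   # statistic on the resample
    return reps

# Simulated data: the median has no simple exact sampling distribution here.
data_rng = np.random.default_rng(1)
x = data_rng.exponential(scale=2.0, size=200)

reps = bootstrap_distribution(x, np.median)
se_median = reps.std(ddof=1)               # bootstrap standard error of the median
```

The spread of `reps` stands in for the unknown sampling distribution, so no analytical formula for the variance of the median is needed.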

Bootstrapping may also be used for constructing hypothesis tests and for inference in regression models.

Notations: Sample size, \(N\); Number of bootstrap repetitions, \(B\); Coefficient of interest, \(\beta_1\); Null hypothesis, \(H_0: \beta_1 = \beta_1^0 \); Estimator, \(\hat{\beta}\); Bootstrap estimator in resample \(b\), \( \hat{\beta}_{1b}^* \), which may be different from the standard estimator; Restricted estimator, \(\hat{\beta}^R\); Residuals, \(\mathbf{u}\); Estimator of standard error, \(s\); Wald test statistic, \(w = (\hat{\beta}_1 - \beta_1^0) / s_{\hat{\beta}_1}\); Significance level, \(\alpha\).

Implementation

Bootstrap can be implemented with specific choices of resampling method and inference procedure.

Bootstrap resampling methods in regression

  1. Pairs bootstrap (case bootstrap, non-parametric bootstrap): regressors and regressand are resampled together as pairs;
  2. Residual bootstrap: the regressand is constructed from fitted values plus sample residuals resampled with replacement (assumes IID residuals);
  3. Wild bootstrap: the regressand is constructed from fitted values plus sample residuals whose signs are flipped with equal probability (Rademacher weights); applicable to heteroskedastic models.
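
The three resampling schemes listed above can be sketched as functions that each return one bootstrap sample. This is a minimal illustration without the null imposed; the helper names (`ols`, `pairs_resample`, `residual_resample`, `wild_resample`) and the simulated data are assumptions of this example.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def pairs_resample(X, y, rng):
    """Pairs bootstrap: draw rows (x_i, y_i) together with replacement."""
    idx = rng.integers(0, len(y), size=len(y))
    return X[idx], y[idx]

def residual_resample(X, y, rng):
    """Residual bootstrap: fitted values plus residuals drawn with replacement."""
    beta = ols(X, y)
    u = y - X @ beta
    u_star = rng.choice(u, size=len(y), replace=True)
    return X, X @ beta + u_star

def wild_resample(X, y, rng):
    """Wild bootstrap: each residual keeps its place but gets a random sign."""
    beta = ols(X, y)
    u = y - X @ beta
    signs = rng.choice([-1.0, 1.0], size=len(y))   # Rademacher weights
    return X, X @ beta + signs * u

# Simulated regression with an intercept and one regressor.
rng = np.random.default_rng(0)
N = 100
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=N)

B = 499
slopes = np.array([ols(*pairs_resample(X, y, rng))[1] for _ in range(B)])
```

Only the pairs bootstrap changes the regressors; the residual and wild bootstraps hold `X` fixed and rebuild the regressand.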

Residual and wild bootstraps can impose the null hypothesis in resampling {Davidson and MacKinnon, 1999}, where the bootstrap Wald statistics are centered on \(\beta_1^0\) and the residuals bootstrapped are those from the restricted OLS estimator that imposes \(H_0\).
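
A minimal sketch of a wild bootstrap that imposes the null follows. It assumes a heteroskedasticity-robust (HC0) standard error in the Wald statistic and a symmetric two-sided p-value; both are choices made for this illustration, and all function names are hypothetical.

```python
import numpy as np

def restricted_fit(X, y, j, beta0):
    """OLS imposing beta_j = beta0; returns restricted coefficients and residuals."""
    others = [k for k in range(X.shape[1]) if k != j]
    y_adj = y - X[:, j] * beta0                      # move the restricted term to the left side
    coef, *_ = np.linalg.lstsq(X[:, others], y_adj, rcond=None)
    beta_r = np.empty(X.shape[1])
    beta_r[others] = coef
    beta_r[j] = beta0
    return beta_r, y - X @ beta_r

def wald(X, y, j, beta0):
    """Wald statistic for beta_j = beta0 with an HC0 standard error (assumed choice)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    V = XtX_inv @ (X.T * u**2) @ X @ XtX_inv
    return (beta[j] - beta0) / np.sqrt(V[j, j])

def wild_restricted_pvalue(X, y, j, beta0, B=999, seed=0):
    rng = np.random.default_rng(seed)
    w = wald(X, y, j, beta0)
    beta_r, u_r = restricted_fit(X, y, j, beta0)     # residuals from the restricted estimator
    w_star = np.empty(B)
    for b in range(B):
        signs = rng.choice([-1.0, 1.0], size=len(y))
        y_star = X @ beta_r + signs * u_r            # resample generated under H0
        w_star[b] = wald(X, y_star, j, beta0)        # bootstrap Wald centered on beta0
    return np.mean(np.abs(w_star) >= abs(w))

# Simulated heteroskedastic data with a true slope of 1.
rng = np.random.default_rng(0)
N = 200
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
y = 1.0 + 1.0 * x + rng.normal(size=N) * (0.5 + np.abs(x))

p_value = wild_restricted_pvalue(X, y, j=1, beta0=0.0)
```

Because the resamples are generated from the restricted fit, the bootstrap statistics are centered on \(\beta_1^0\) rather than on \(\hat{\beta}_1\).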

Bootstrap inference procedures in hypothesis testing

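  1. Bootstrap-t (percentile-t) {Efron1981}: compute the Wald statistic with the OLS estimate of the standard error in the sample and in each resample, and reject using critical values from the bootstrap distribution of the Wald statistic;
  2. Bootstrap-se (standard error): use the bootstrap estimate of the standard error \(\hat{\sigma}_{\hat{\beta}_1} = s_{\hat{\beta}_{1B}^* }\), that is, the standard deviation of the \(B\) bootstrap estimates, and reject using standard normal critical values.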

Asymptotic refinement refers to a convergence rate faster than that obtained from first-order asymptotic theory. To have asymptotic refinement, a bootstrap needs to be applied to an asymptotically pivotal statistic, that is, a statistic whose asymptotic distribution does not contain unknown parameters. Bootstrap-t procedures provide asymptotic refinement, while bootstrap-se procedures do not.
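
The two inference procedures can be compared side by side on the same set of resamples. The sketch below assumes a pairs bootstrap, the classical homoskedastic OLS standard error, and an equal-tailed rejection rule for bootstrap-t; the function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def fit(X, y, j):
    """OLS estimate of coefficient j and its classical standard error."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    s2 = u @ u / (n - k)
    return beta[j], np.sqrt(s2 * XtX_inv[j, j])

def pairs_bootstrap_tests(X, y, j, beta0, B=999, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    b_hat, se_hat = fit(X, y, j)
    w = (b_hat - beta0) / se_hat                     # sample Wald statistic
    b_star = np.empty(B)
    w_star = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=n)
        b_star[b], se_b = fit(X[idx], y[idx], j)
        w_star[b] = (b_star[b] - b_hat) / se_b       # centered on the sample estimate
    # Bootstrap-t: critical values come from the bootstrap Wald statistics.
    lo, hi = np.quantile(w_star, [alpha / 2, 1 - alpha / 2])
    reject_t = bool(w < lo or w > hi)
    # Bootstrap-se: bootstrap standard error, standard normal critical value.
    se_boot = b_star.std(ddof=1)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    reject_se = bool(abs((b_hat - beta0) / se_boot) > z_crit)
    return reject_t, reject_se

# Simulated data with a true slope of 0.5, testing H0: slope = 0.
rng = np.random.default_rng(0)
N = 200
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
y = 1.0 + 0.5 * x + rng.normal(size=N)

reject_t, reject_se = pairs_bootstrap_tests(X, y, j=1, beta0=0.0)
```

Bootstrap-t resamples the Wald statistic itself, which is asymptotically pivotal, whereas bootstrap-se resamples only the coefficient estimate and then falls back on the normal approximation.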

Clustered Data

A sample may contain clusters of observational units such that regression errors are independent across clusters but correlated within clusters. Such correlation effectively reduces the sample size toward the number of clusters, so statistical inference that assumes errors are independent across observations can overstate precision. See {Cameron2015} for a good review on inference with clustered data.

Notations: Number of clusters in sample, \(G\); Number of observations in cluster \(g\), \(N_g\); Subsample of cluster \(g\), \( (y_g, X_g) \); Covariance matrix of regression errors within cluster \(g\), \(\Sigma_g\); Individual \(i\) in cluster \(g\) has subscript \(ig\).

Covariance matrix of the OLS estimator on clustered data is: \[ \text{Var}(\hat{\boldsymbol{\beta}} \mid \textbf{X}) = (X' X)^{-1} \left( \sum_{g = 1}^G X_g' \Sigma_g X_g \right) (X' X)^{-1}\]

Cluster-robust variance estimator (CRVE), \( \widehat{\text{Var}}_\text{CR}(\hat{\beta} ) \), replaces \(\Sigma_g\) with the sample estimate \( \tilde{u}_g \tilde{u}_g' \). Here \(\tilde{u}_g\) is the vector of corrected residuals for cluster \(g\), and the standard CRVE simply uses the OLS residuals \(u_g\). {Bell and McCaffrey, 2002} proposed a correction \( \tilde{u}_g = \sqrt{(G-1)/G} \, (I_{N_g} - H_{gg})^{-1} u_g \), where \( H_{gg} = X_g (X' X)^{-1} X_g' \), which generalizes the HC3 measure in {MacKinnon and White, 1985} and is equivalent to the jackknife estimator of \(\text{Var}(\hat{\boldsymbol{\beta}} \mid \textbf{X})\). {Cameron2008} referred to this correction as CR3.
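
A minimal sketch of the standard CRVE, using uncorrected OLS residuals in the sandwich formula, is given below. The function name `crve` and the simulated data with a common within-cluster shock are assumptions of this example; corrected residuals could be substituted for `u[mask]` inside the loop.

```python
import numpy as np

def crve(X, y, cluster):
    """OLS coefficients and the standard cluster-robust variance estimate."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta                       # OLS residuals
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(cluster):
        mask = cluster == g
        score = X[mask].T @ u[mask]        # X_g' u_g, a k-vector
        meat += np.outer(score, score)     # X_g' u_g u_g' X_g
    return beta, XtX_inv @ meat @ XtX_inv

# Simulated data: G clusters of equal size sharing a cluster-level shock.
rng = np.random.default_rng(0)
G, n_g = 20, 10
cluster = np.repeat(np.arange(G), n_g)
x = rng.normal(size=G * n_g)
X = np.column_stack([np.ones(G * n_g), x])
shock = rng.normal(size=G)[cluster]        # induces within-cluster correlation
y = 1.0 + 0.5 * x + shock + rng.normal(size=G * n_g)

beta_hat, V_cr = crve(X, y, cluster)
se_cr = np.sqrt(np.diag(V_cr))             # cluster-robust standard errors
```

When every observation forms its own cluster, this estimator reduces to the heteroskedasticity-robust HC0 estimator.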

Resampling methods:

  1. Pairs cluster bootstrap
  2. Residual cluster bootstrap
  3. Wild cluster bootstrap

Residual cluster bootstrap requires balanced clusters.
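
The pairs cluster and wild cluster schemes can be sketched as follows; the residual cluster bootstrap is omitted here. The function names and the simulated data are assumptions of this example, and the wild cluster version uses unrestricted OLS residuals with Rademacher weights for simplicity.

```python
import numpy as np

def pairs_cluster_resample(X, y, cluster, rng):
    """Draw G whole clusters with replacement, keeping each (y_g, X_g) intact."""
    ids = np.unique(cluster)
    drawn = rng.choice(ids, size=len(ids), replace=True)
    rows = np.concatenate([np.flatnonzero(cluster == g) for g in drawn])
    sizes = [int(np.sum(cluster == g)) for g in drawn]
    # Relabel so a cluster drawn twice counts as two distinct clusters.
    new_cluster = np.repeat(np.arange(len(drawn)), sizes)
    return X[rows], y[rows], new_cluster

def wild_cluster_resample(X, y, cluster, rng):
    """Flip the sign of all residuals in a cluster together (one weight per cluster)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta
    ids, inverse = np.unique(cluster, return_inverse=True)
    signs = rng.choice([-1.0, 1.0], size=len(ids))   # one Rademacher draw per cluster
    return X, X @ beta + signs[inverse] * u

# Simulated clustered data with unbalanced cluster sizes.
rng = np.random.default_rng(0)
sizes = rng.integers(5, 15, size=12)
cluster = np.repeat(np.arange(len(sizes)), sizes)
n = len(cluster)
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 0.5 * x + rng.normal(size=len(sizes))[cluster] + rng.normal(size=n)

X_p, y_p, cluster_p = pairs_cluster_resample(X, y, cluster, rng)
X_w, y_w = wild_cluster_resample(X, y, cluster, rng)
```

Neither scheme needs balanced clusters: the pairs version moves whole clusters, so the resample size can vary, and the wild version leaves every residual in its own cluster.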