Bootstrap

Asymptotic analysis is valid only when the sample size is large, and its results are limited to statistics that are analytically tractable. {Efron1979} introduced the bootstrap as an empirical alternative: it draws resamples from a single sample to estimate the sampling distribution of a statistic, in particular its standard error. Reference books include {Efron and Tibshirani 1993} and {Davison and Hinkley 1997}.

Sample size, \(N\); Number of bootstrap replications, \(B\); Coefficient of interest, \(\beta_1\); Null hypothesis, \(H_0: \beta_1 = \beta_1^0\); Estimator, \(\hat{\beta}\); Bootstrap estimator in replication \(b\), \(\hat{\beta}_{1b}^*\), which may differ from the standard estimator; Restricted estimator, \(\hat{\beta}^R\); Residuals, \(\mathbf{u}\); Standard error, \(s\); Wald statistic, \(w = (\hat{\beta}_1 - \beta_1^0) / s_{\hat{\beta}_1}\); Significance level, \(\alpha\);

Implementation

A bootstrap procedure is defined by two choices: a resampling method and an inference procedure.

Bootstrap resampling methods

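Pairs bootstrap (case bootstrap, non-parametric bootstrap): observations are drawn with replacement, so the regressors and the regressand of each observation always stay paired;
Residual bootstrap: regressors are held fixed, and the regressand is rebuilt from fitted values plus sample residuals drawn with replacement;
Wild bootstrap: regressors are held fixed, and the regressand is rebuilt from fitted values plus each observation's own residual with its sign flipped with probability 1/2 (Rademacher weights);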

The residual bootstrap assumes i.i.d. errors; the wild bootstrap remains applicable to heteroskedastic models.
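The three resampling methods above can be sketched for a simple regression of \(y\) on an intercept and one regressor \(x\). This is a minimal illustration using only the Python standard library; the function names (`ols`, `pairs_resample`, `residual_resample`, `wild_resample`) and the simulated data are assumptions made for the example, not part of the source.

```python
import random

def ols(x, y):
    """OLS of y on an intercept and x; returns (b0, b1)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b1 * xbar, b1

def fitted_and_residuals(x, y):
    """Fitted values and residuals from the sample OLS fit."""
    b0, b1 = ols(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, resid

def pairs_resample(x, y, rng):
    """Pairs bootstrap: draw N (x_i, y_i) pairs with replacement."""
    n = len(x)
    idx = [rng.randrange(n) for _ in range(n)]
    return [x[i] for i in idx], [y[i] for i in idx]

def residual_resample(x, y, rng):
    """Residual bootstrap: x fixed, y* = fitted + residual drawn with replacement."""
    fitted, resid = fitted_and_residuals(x, y)
    return list(x), [fi + rng.choice(resid) for fi in fitted]

def wild_resample(x, y, rng):
    """Wild bootstrap: x fixed, y*_i = fitted_i + (+1 or -1) * own residual_i."""
    fitted, resid = fitted_and_residuals(x, y)
    return list(x), [fi + rng.choice((-1.0, 1.0)) * ui
                     for fi, ui in zip(fitted, resid)]

# Simulated sample (illustrative only): y = 1 + 2x + noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(50)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]

for method in (pairs_resample, residual_resample, wild_resample):
    x_star, y_star = method(x, y, rng)
    print(method.__name__, "slope on one resample:", ols(x_star, y_star)[1])
```

Only the pairs bootstrap changes the regressors; the other two keep \(x\) fixed and differ in whether a residual can move to another observation (residual bootstrap) or stays attached to its own observation (wild bootstrap), which is why the latter preserves heteroskedasticity.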

Bootstrap inference procedures

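Bootstrap-t (percentile-t) {Efron1981}: compute the Wald statistic in the sample and in every resample, each with its own OLS standard error; reject \(H_0\) if the sample statistic falls in the tails of the bootstrap distribution of the statistic;
Bootstrap-se: use the bootstrap estimate of the standard error, \(s_{\hat{\beta}_{1,B}} = \text{se}(\hat{\beta}_{1b}^* )\), in the Wald statistic; reject \(H_0\) using standard normal critical values;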

Imposing the null hypothesis: resamples are generated from the restricted OLS estimator \(\hat{\beta}^R\) and its residuals, so that the bootstrap data satisfy \(H_0\);
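The two inference procedures can be compared on the same set of resamples. The sketch below uses the pairs bootstrap without imposing the null, so each bootstrap Wald statistic is centred at the sample estimate \(\hat{\beta}_1\) rather than at \(\beta_1^0\). The function names, the classical (homoskedastic) OLS standard error, the symmetric two-sided rejection rule, and the simulated data are all assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist, stdev

def ols_slope(x, y):
    """OLS of y on an intercept and x; returns (b1, classical se of b1)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b1, math.sqrt(rss / (n - 2) / sxx)

def bootstrap_tests(x, y, beta1_null, B=999, alpha=0.05, seed=0):
    """Test H0: beta1 = beta1_null by bootstrap-se and by bootstrap-t."""
    rng = random.Random(seed)
    n = len(x)
    b1, se = ols_slope(x, y)
    w = (b1 - beta1_null) / se              # sample Wald statistic
    b1_star, w_star = [], []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]   # pairs resample
        b1_b, se_b = ols_slope([x[i] for i in idx], [y[i] for i in idx])
        b1_star.append(b1_b)
        w_star.append((b1_b - b1) / se_b)   # centred at the sample estimate
    # Bootstrap-se: bootstrap standard error, normal critical value.
    se_boot = stdev(b1_star)
    reject_se = abs((b1 - beta1_null) / se_boot) > NormalDist().inv_cdf(1 - alpha / 2)
    # Bootstrap-t: critical value from the bootstrap distribution of |w*|.
    # B is chosen so that (1 - alpha) * (B + 1) is an integer.
    k = round((1 - alpha) * (B + 1))
    crit_t = sorted(abs(t) for t in w_star)[k - 1]
    reject_t = abs(w) > crit_t
    return {"w": w, "se_boot": se_boot, "crit_t": crit_t,
            "reject_se": reject_se, "reject_t": reject_t}

# Simulated sample (illustrative only): true slope is 2.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(100)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]
print(bootstrap_tests(x, y, beta1_null=0.0))
```

Bootstrap-se only replaces the standard error and still relies on the normal approximation, whereas bootstrap-t replaces the critical value itself with a quantile of the resampled statistic.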

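Asymptotic refinement: bootstrapping an asymptotically pivotal statistic (one whose limiting distribution does not depend on unknown parameters, such as the Wald statistic \(w\)) gives rejection probabilities that converge to the nominal level faster than those based on first-order asymptotic theory.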

Clustered Data

Observational units are grouped into clusters such that errors are independent across clusters but may be correlated within a cluster.

Number of clusters in sample, \(G\); Number of observations in cluster \(g\), \(N_g\); Subsample of cluster \(g\), \( (y_g, X_g) \); Covariance matrix of errors within cluster \(g\), \(\Sigma_g\); Individual \(i\) in cluster \(g\) has subscript \(ig\);

The covariance matrix of the OLS estimator on clustered data is: \[ \text{Var}(\hat{\boldsymbol{\beta}} \mid X) = (X' X)^{-1} \left( \sum_{g = 1}^G X_g' \Sigma_g X_g \right) (X' X)^{-1}\]

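Cluster-robust variance estimator (CRVE): the unknown \(\Sigma_g\) is replaced by \(\tilde{\mathbf{u}}_g \tilde{\mathbf{u}}_g'\), where \(\tilde{\mathbf{u}}_g\) is the vector of corrected residuals for cluster \(g\) (for example, OLS residuals scaled by \(\sqrt{G/(G-1)}\)): \[ \hat{V}(\hat{\boldsymbol{\beta}}) = (X' X)^{-1} \left( \sum_{g = 1}^G X_g' \tilde{\mathbf{u}}_g \tilde{\mathbf{u}}_g' X_g \right) (X' X)^{-1}\]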
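A minimal sketch of a CRVE for the slope of a simple regression, obtained by substituting outer products of cluster residual vectors for \(\Sigma_g\) in the sandwich formula above. The function name `ols_crve`, the small data set, and the finite-sample correction \(\frac{G}{G-1}\frac{N-1}{N-K}\) with \(K = 2\) are assumptions for illustration; other residual corrections exist.

```python
import math
from collections import defaultdict

def ols_crve(x, y, cluster):
    """OLS of y on [1, x]; returns (b0, b1, cluster-robust se of b1)."""
    n = len(x)
    sx, sxx = sum(x), sum(xi * xi for xi in x)
    sy, sxy = sum(y), sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx
    # Entries of the symmetric 2x2 matrix (X'X)^{-1}.
    a00, a01, a11 = sxx / det, -sx / det, n / det
    b0 = a00 * sy + a01 * sxy
    b1 = a01 * sy + a11 * sxy
    # Cluster scores X_g' u_g, accumulated one observation at a time.
    score = defaultdict(lambda: [0.0, 0.0])
    for xi, yi, g in zip(x, y, cluster):
        u = yi - b0 - b1 * xi
        score[g][0] += u
        score[g][1] += xi * u
    G = len(score)
    # Assumed finite-sample correction with K = 2 regressors.
    c = G / (G - 1) * (n - 1) / (n - 2)
    # Slope element of the sandwich: c * sum_g (row_1 of (X'X)^{-1} . score_g)^2.
    v11 = c * sum((a01 * s0 + a11 * s1) ** 2 for s0, s1 in score.values())
    return b0, b1, math.sqrt(v11)

# Six observations in three clusters of two (illustrative numbers only).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.1, 1.9, 4.2, 5.8, 8.3, 9.7]
cluster = [0, 0, 1, 1, 2, 2]
b0, b1, se = ols_crve(x, y, cluster)
print("slope:", b1, "cluster-robust se:", se)
```

When every observation is its own cluster, this expression collapses to a heteroskedasticity-robust standard error with scaling \(N/(N-K)\), which gives a convenient sanity check.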

Resampling methods:

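Pairs cluster bootstrap: draw \(G\) whole clusters \( (y_g, X_g) \) with replacement;
Residual cluster bootstrap: keep regressors fixed and draw whole cluster residual vectors with replacement, which requires clusters of equal size;
Wild cluster bootstrap: keep regressors fixed and flip the sign of all residuals in a cluster together, with equal probability;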
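As an illustration of the last method, the sketch below combines the wild cluster bootstrap with the bootstrap-t procedure and imposes the null hypothesis when generating resamples. It covers only a simple regression on an intercept and one regressor; the function names, the CRVE scaling factor, the symmetric p-value, and the simulated data are assumptions made for the example.

```python
import math
import random
from collections import defaultdict

def ols_crve(x, y, cluster):
    """OLS of y on [1, x]; returns (b0, b1, cluster-robust se of b1)."""
    n = len(x)
    sx, sxx = sum(x), sum(xi * xi for xi in x)
    sy, sxy = sum(y), sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx
    a00, a01, a11 = sxx / det, -sx / det, n / det   # entries of (X'X)^{-1}
    b0 = a00 * sy + a01 * sxy
    b1 = a01 * sy + a11 * sxy
    score = defaultdict(lambda: [0.0, 0.0])          # X_g' u_g per cluster
    for xi, yi, g in zip(x, y, cluster):
        u = yi - b0 - b1 * xi
        score[g][0] += u
        score[g][1] += xi * u
    G = len(score)
    c = G / (G - 1) * (n - 1) / (n - 2)              # assumed correction
    v11 = c * sum((a01 * s0 + a11 * s1) ** 2 for s0, s1 in score.values())
    return b0, b1, math.sqrt(v11)

def wild_cluster_bootstrap_p(x, y, cluster, beta1_null, B=999, seed=0):
    """Symmetric bootstrap-t p-value for H0: beta1 = beta1_null."""
    rng = random.Random(seed)
    n = len(x)
    _, b1, se = ols_crve(x, y, cluster)
    w = (b1 - beta1_null) / se
    # Restricted OLS under H0: the slope is fixed, only the intercept is free.
    b0_r = sum(yi - beta1_null * xi for xi, yi in zip(x, y)) / n
    fit_r = [b0_r + beta1_null * xi for xi in x]
    u_r = [yi - fi for yi, fi in zip(y, fit_r)]
    groups = sorted(set(cluster))
    exceed = 0
    for _ in range(B):
        # One Rademacher weight per cluster, shared by all its observations.
        sign = {g: rng.choice((-1.0, 1.0)) for g in groups}
        y_star = [fi + sign[g] * ui for fi, ui, g in zip(fit_r, u_r, cluster)]
        _, b1_s, se_s = ols_crve(x, y_star, cluster)
        w_s = (b1_s - beta1_null) / se_s   # centred at the imposed null
        exceed += abs(w_s) >= abs(w)
    return exceed / B

# Simulated clustered sample (illustrative only): 20 clusters of 10,
# a common cluster effect in the error, true slope 2.
rng = random.Random(1)
cluster = [g for g in range(20) for _ in range(10)]
effect = [rng.gauss(0, 1) for _ in range(20)]
x = [rng.gauss(0, 1) for _ in cluster]
y = [1.0 + 2.0 * xi + effect[g] + rng.gauss(0, 1) for xi, g in zip(x, cluster)]
print("p-value for H0: beta1 = 2:", wild_cluster_bootstrap_p(x, y, cluster, 2.0))
```

Because the sign is drawn once per cluster, the within-cluster correlation pattern of the residuals is preserved in every resample. With \(G\) clusters there are only \(2^G\) distinct sign patterns, so the bootstrap distribution becomes coarse when \(G\) is very small.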