Microeconometrics {Cameron2005}
Online Resources on Cameron:2005
ECON 615 Final Cheatsheet
Model Specification
Parametric Methods
- Linear models (OLS, WLS, IV)
- Maximum likelihood (ML) and nonlinear least squares (NLS) estimation
- Generalized method of moments
Semiparametric Methods
- Least absolute deviation (LAD) estimator
- Maximum score (MS) estimator; (Manski, 1975)
- Smoothed maximum score estimator; (Horowitz, 1992)
- Censored LAD estimator; (Powell, 1981, 1983)
- Symmetrically censored least squares (SCLS) estimator; (Powell, 1986)
- Partially Linear Model
- Robinson Difference Estimator (Robinson, 1988) (9.7.3) (16.5)
- Single Index Models (9.7.4)
- Generalized Additive Models
Nonparametric Methods
- Kernel Density Estimation
- Conditional Density Estimation
- Nonparametric Regression
Models for Cross-Section Data
Discrete Outcome/Choice Models
Binary outcome models
- MLEs as latent variable models
- Linear probability model
- Logit (logistic regression) model
- Probit model
- Grouped and Aggregated Data: Minimum chi-square estimator
Multinomial outcome models
- Unordered outcomes
- conditional logit (CL), multinomial logit (MNL)
- Independence of Irrelevant Alternatives
- nested logit (NL)?, three-level nested logit
- multinomial probit (MNP)
- Ordered outcomes, and sequential decision
- ordered logit, ordered probit
- Specification test
- Likelihood ratio (LR)
- Hausman test
- Hausman-McFadden test
- Small-Hsiao pseudo-likelihood ratio test
Choice-based sampling: weighted MLE
Model selection
- AIC (Akaike information criterion)
- BIC (Bayesian information criterion)
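Under the usual definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where ln L is the maximized log-likelihood, k the number of estimated parameters, and n the sample size; the model with the smaller value is preferred. A minimal sketch (the function names `aic` and `bic` are illustrative, not taken from any library):

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * loglik

# Hypothetical fitted models: (maximized log-likelihood, number of parameters)
small_model = (-100.0, 3)
large_model = (-98.5, 6)
n_obs = 100

for name, (ll, k) in [("small", small_model), ("large", large_model)]:
    print(name, "AIC:", aic(ll, k), "BIC:", round(bic(ll, k, n_obs), 3))
```

BIC penalizes each extra parameter by ln(n) rather than 2, so it favors smaller models than AIC once n exceeds about 7.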
Sample Selection Models
Sample Selection Models
- Tobit model (Censored model)
- MLE
- Two-step estimator (Ahn and Powell, 1993)
- Bivariate Sample Selection Model (Type 2 Tobit Model)
- Heckman two-step estimator (Heckman, 1979)
- Roy models (Type 5 Tobit Model)
Simultaneous equations models
- simultaneous equations Tobit model
- simultaneous equations Probit model
- coherency condition
Specification analysis
- Heteroscedasticity, serial correlations, and nonnormality
- Nelson test, 1981
- Hausman test
Duration and Count Data Models
Duration Regression Models:
- Proportional Hazard
- Left Censoring
- Markov Chain Models
Count Data Models:
- Poisson and Negative Binomial Models
- Simulated Maximum Likelihood (SML)
Count data models suit outcomes that take non-negative integer values not much greater than zero.
Such models respect the lower bound and the discreteness of the variable without adding much complexity.
Under over-dispersion (variance-to-mean ratio greater than 1), the negative binomial and quasi-Poisson models are alternatives to the Poisson model, with variance increasing quadratically and linearly in the mean, respectively.
The binomial distribution is considered when the variable is also bounded above.
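A rough sketch of the over-dispersion check and of the two variance functions. It assumes the common NB2 parameterization Var(y) = μ + αμ² for the negative binomial and Var(y) = φμ for quasi-Poisson; the counts, `alpha`, and `phi` values are made up for illustration.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance-to-mean ratio; values above 1 suggest over-dispersion."""
    return variance(counts) / mean(counts)

def nb2_variance(mu, alpha):
    """Negative binomial (NB2) variance: quadratic in the mean."""
    return mu + alpha * mu ** 2

def quasi_poisson_variance(mu, phi):
    """Quasi-Poisson variance: linear in the mean."""
    return phi * mu

counts = [0, 0, 1, 1, 2, 3, 7, 10]  # hypothetical count sample
print("dispersion index:", dispersion_index(counts))
print("NB2 variance at mu=3, alpha=0.5:", nb2_variance(3.0, 0.5))
print("quasi-Poisson variance at mu=3, phi=2:", quasi_poisson_variance(3.0, 2.0))
```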
Models for Panel Data
Short Panels
Multilevel models (hierarchical (linear) models, nested models) are statistical models with coefficients organized at more than one level (group).
Depending on the model coefficients (effects), multilevel models can be classified into: random effects models (variance components models), fixed effects models, and mixed models.
Random effects are estimated with partial pooling, i.e. shrinkage (best linear unbiased prediction, BLUP).
Fixed effects are estimated using least squares or maximum likelihood.
If a statistical model contains both fixed effects and random effects, it is called a mixed model.
Fixed effects are constant across individuals, and random effects vary.
Fixed effects are treated as nuisance parameters, while estimation of the marginal effects is the sole interest.
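As a worked illustration of partial pooling, assume a random-intercept model \( y_{it} = μ + u_i + ε_{it} \) with variance components \( σ_u^2 \) and \( σ_ε^2 \) and \( T_i \) observations for individual \( i \) (notation introduced here, not taken from the source). The predicted random effect shrinks the individual's mean deviation toward zero:
\[ \hat{u}_i = \frac{σ_u^2}{σ_u^2 + σ_ε^2 / T_i} \, (\bar{y}_i - \hat{μ}) \]
The shrinkage factor approaches 1 (no pooling) as \( T_i \) grows or \( σ_u^2 \) dominates, and approaches 0 (complete pooling) when individual-level variation is small relative to noise.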
Static Panel Data Models
Simple Regression Models with Variable Intercepts:
(dep = time-inv + time-var + individual + error)
\[ y_{it} = z_i α + x_{it} β + u_i + ε_{it} \]
- Pooled Model: disregard time periods
- Pooled OLS estimator: OLS over panel
- Individual-specific Effects Model
- Random Effects (RE) Model (random intercept/components model; equicorrelated model): individual-specific effect uncorrelated with regressors, \( \text{corr}(X_{it}, u_i) = 0 \).
- Between Estimator: OLS over individual time-averages.
- Random Effects Estimator: Feasible GLS over panel; MLE.
- Fixed Effects (FE) Model: individual-specific effect correlates with regressors, \( \text{corr}(X_{it}, u_i) \ne 0 \).
- Within Estimator (Fixed Effects Estimator; Least-squares Dummy-variable (LSDV) Estimator; Covariance Estimator): OLS over panel after subtracting individual time-averages.
- First Differences (FD) Estimator: OLS over panel after first-differences in time.
- Specification Analysis
- Individual-specific effect
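The estimators listed above differ only in how the panel is transformed before OLS. A minimal sketch, assuming a hypothetical noise-free panel with one time-varying regressor, no time-invariant regressor, true β = 2, and an individual effect that is correlated with the regressor (the FE case); all data and helper names are illustrative:

```python
from statistics import mean

def ols_slope(x, y, intercept=True):
    """One-regressor OLS slope; centering the data is equivalent to fitting an intercept."""
    if intercept:
        mx, my = mean(x), mean(y)
        x = [a - mx for a in x]
        y = [b - my for b in y]
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def flatten(panel):
    return [v for row in panel for v in row]

def demean(panel):
    """Subtract each individual's time-average (within transformation)."""
    return [[v - mean(row) for v in row] for row in panel]

def first_diff(panel):
    """First-difference each individual's series in time."""
    return [[b - a for a, b in zip(row, row[1:])] for row in panel]

beta = 2.0
u = [0.0, 5.0, 10.0]                      # individual effects
x = [[0.0, 1.0, 2.0, 3.0],                # x_it rises with u_i, so corr(x, u) != 0
     [5.0, 7.0, 6.0, 8.0],
     [10.0, 12.0, 11.0, 14.0]]
y = [[beta * xit + ui for xit in xi] for xi, ui in zip(x, u)]

b_pooled = ols_slope(flatten(x), flatten(y))
b_between = ols_slope([mean(r) for r in x], [mean(r) for r in y])
b_within = ols_slope(flatten(demean(x)), flatten(demean(y)), intercept=False)
b_fd = ols_slope(flatten(first_diff(x)), flatten(first_diff(y)), intercept=False)

print("pooled:", b_pooled, "between:", b_between, "within:", b_within, "FD:", b_fd)
```

In this constructed example the within and first-differences transformations remove u_i and recover β, while pooled OLS and the between estimator absorb the correlation between x and u_i into the slope.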
Dynamic Panel Data Models
Dynamic Models with Variable Intercepts: (AR(1))
(dep = time-inv + time-var + lag_dep + individual + error)
\[ y_{it} = z_i α + x_{it} β + γ y_{i,t-1} + u_i + ε_{it} \]
- Random Effects Models
- General FGLS Estimator [1. OLS residuals for error covariance matrix estimation; 2. FGLS]
- ML Estimator
- GMM Estimator (IV estimator)
- Fixed Effects Model
- GMM Estimator (IV estimator)
- General FGLS
- Fixed Effect GLS Estimator (FEGLS)
- First-difference GLS (FDGLS)
- Transformed ML Estimator
GMM estimator.
y ~ covariates | gmm instruments | 'normal' instruments
By default, all the variables of the model which are not used as GMM instruments are used as normal instruments with the same lag structure as the one specified in the model.
Transformation: difference GMM; system GMM.
General FGLS is based on a two-step estimation process: first a model is estimated by OLS (pooling), fixed effects (within) or first differences (fd), then its residuals are used to estimate an error covariance matrix for use in a feasible-GLS analysis.
This framework allows the error covariance structure inside every group (individual time series) to be fully unrestricted and is therefore robust against any type of intragroup heteroskedasticity and serial correlation.
Conversely, this structure is assumed identical across groups and thus general FGLS estimation is inefficient under groupwise heteroskedasticity.
Efficiency requires N >> T.
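In symbols, a sketch of the two steps under the stated assumption of a common within-group error covariance (the notation \( \hat{e}_i \), \( \hat{Ω} \), \( X_i \), \( y_i \) is introduced here for illustration): let \( \hat{e}_i \) be the \( T \times 1 \) vector of first-step residuals for individual \( i \), and let \( X_i \), \( y_i \) be that individual's (possibly transformed) regressors and outcomes. Then
\[ \hat{Ω} = \frac{1}{N} \sum_{i=1}^{N} \hat{e}_i \hat{e}_i', \qquad \hat{β}_{FGLS} = \Big( \sum_{i=1}^{N} X_i' \hat{Ω}^{-1} X_i \Big)^{-1} \sum_{i=1}^{N} X_i' \hat{Ω}^{-1} y_i \]
Since \( \hat{Ω} \) has \( T(T+1)/2 \) distinct elements that are estimated from only \( N \) residual vectors, it is well estimated only when N is large relative to T.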
First differences with an IV estimator.
Pooled OLS is biased upward and is inconsistent.
GLS and ML estimators are also generally biased.
Within estimator is biased, because eliminating the individual effect causes a correlation between the transformed error term and the transformed lagged dependent variable.
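These biases can be illustrated by simulation. The sketch below assumes a pure AR(1) panel without covariates, a true γ of 0.5, standard normal effects and errors, and a stationary start; the first-difference IV step uses the level y_{i,t-2} as an instrument for Δy_{i,t-1} (an Anderson-Hsiao-type choice, which is an assumption here rather than something the notes specify).

```python
import random
from statistics import mean

random.seed(0)
GAMMA, N, T = 0.5, 5000, 5   # hypothetical true coefficient and panel dimensions

# Simulate y_it = GAMMA * y_i,t-1 + u_i + eps_it for t = 1..T (y_i0 is the initial value).
panel = []
for _ in range(N):
    u = random.gauss(0.0, 1.0)
    y = [u / (1 - GAMMA) + random.gauss(0.0, 1.0) / (1 - GAMMA ** 2) ** 0.5]
    for _t in range(T):
        y.append(GAMMA * y[-1] + u + random.gauss(0.0, 1.0))
    panel.append(y)

def slope(x, y):
    """No-intercept OLS slope of y on x."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def center(v):
    m = mean(v)
    return [a - m for a in v]

# Pooled OLS of y_it on y_i,t-1 (intercept handled by centering the stacked data).
lag_all = [y[t - 1] for y in panel for t in range(1, T + 1)]
cur_all = [y[t] for y in panel for t in range(1, T + 1)]
gamma_pooled = slope(center(lag_all), center(cur_all))

# Within estimator: demean the dependent variable and its lag per individual.
lag_w, cur_w = [], []
for y in panel:
    lag_w += center(y[:-1])
    cur_w += center(y[1:])
gamma_within = slope(lag_w, cur_w)

# First differences with IV: instrument the differenced lag with the level y_i,t-2.
num = den = 0.0
for y in panel:
    for t in range(2, T + 1):
        num += y[t - 2] * (y[t] - y[t - 1])
        den += y[t - 2] * (y[t - 1] - y[t - 2])
gamma_fd_iv = num / den

print("pooled:", gamma_pooled, "within:", gamma_within, "FD-IV:", gamma_fd_iv)
```

With T this short, the pooled estimate should land well above 0.5 and the within estimate well below it, while the first-difference IV estimate should be close to the true value up to sampling noise.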
Complication: Limited Dependent Variable
For FE models:
- Qualitative Choice Models (Discrete Data):
- Incidental Parameters Problem
- Conditional MLE
Sample Selection Models (Censored and Truncated Data):
- Trimmed LS estimator for FE model (Honoré, 1992)
- Bias-Adjusted Maximum Simulated Likelihood (MSL) Estimator [12.4]
- Auxiliary Models [12.6]
Complication: Cross-Sectionally Dependent Panel Data
- Spatial Approach,
- Factor Approach,
- Cross-sectional Mean Augmented Approach,
- Test of Cross-Sectional Independence
General FGLS is inefficient under cross-sectional correlation.
Panel Data Approach for Program Evaluation
- Selection on observables and unobservables;
- Propensity Score Matching Estimator (Rosenbaum and Rubin)
- Other estimators
- Differences-in-differences estimator;
- IV estimator for local average treatment effect (LATE); (under selection on unobservables)
- (Control function estimator)
- Regression discontinuity (RD) design;
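As a small illustration of the differences-in-differences estimator from the list above: in its simplest two-group, two-period form it is the change in the treated group's mean outcome minus the change in the control group's mean outcome. The sketch assumes parallel trends, and the function name and outcome values are hypothetical.

```python
from statistics import mean

def diff_in_diff(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """(Treated post - pre) minus (control post - pre), using group means."""
    treated_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(ctrl_post) - mean(ctrl_pre)
    return treated_change - control_change

# Hypothetical outcomes for two groups observed before and after a program.
treat_pre, treat_post = [10.0, 12.0], [15.0, 17.0]
ctrl_pre, ctrl_post = [8.0, 10.0], [9.0, 11.0]

effect = diff_in_diff(treat_pre, treat_post, ctrl_pre, ctrl_post)
print("DiD estimate:", effect)
```

Differencing over time removes time-invariant group differences, and differencing across groups removes the common time trend, which is why the estimator tolerates selection on time-invariant unobservables.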