Notes on @Cameron2005.
See also Microeconometrics and ECON 615 Final Cheatsheet.
Model Specification
Parametric Methods:
 Linear models (OLS, WLS, IV);
 Maximum likelihood (ML) and nonlinear least squares (NLS) estimation;
 Generalized method of moments (GMM);
Semiparametric Methods:
 Least absolute deviation (LAD) estimator;
 Maximum score (MS) estimator [@Manski1975];
 Smoothed maximum score estimator [@Horowitz1992];
 Censored LAD estimator [@Powell1981; @Powell1983];
 Symmetrically censored least squares (SCLS) estimator [@Powell1986];
 Partially Linear Model;
 Single Index Models (9.7.4);
 Generalized Additive Models (GAM);
Nonparametric Methods:
 Kernel Density Estimation;
 Conditional Density Estimation;
 Nonparametric Regression;
Models for Cross-Sectional Data
Discrete Outcome/Choice
Binary outcome models:
 MLEs as latent variable models;
 Linear probability model;
 Logit (logistic regression) model;
 Probit model;
 Grouped and Aggregated Data: Minimum chi-square estimator;
Multinomial outcome models:
 Unordered outcomes:
 conditional logit (CL), multinomial logit (MNL);
 Independence of Irrelevant Alternatives;
 nested logit (NL)?, three-level nested logit;
 multinomial probit (MNP);
 Ordered outcomes, and sequential decision:
 ordered logit, ordered probit;
 Specification test:
 Likelihood ratio (LR);
 Hausman test;
 Hausman-McFadden test;
 Small-Hsiao pseudo-likelihood ratio test;
Choice-based sampling: weighted MLE;
Model selection:
 Akaike information criterion (AIC);
 Bayesian information criterion (BIC);
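The two criteria above can be sketched as follows, assuming the standard definitions $\text{AIC} = 2k - 2\ln L$ and $\text{BIC} = k \ln n - 2\ln L$, with $k$ parameters, $n$ observations, and maximized log-likelihood $\ln L$. The function names and the two fitted models are hypothetical and serve only as an illustration.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L (smaller is better)."""
    return k * math.log(n) - 2 * loglik

# Hypothetical fits of a small and a large model on the same n = 400 observations.
n_obs = 400
small = {"loglik": -520.0, "k": 3}
large = {"loglik": -516.0, "k": 6}

aic_small, aic_large = aic(**small), aic(**large)
bic_small, bic_large = bic(n=n_obs, **small), bic(n=n_obs, **large)

print("AIC:", aic_small, aic_large)
print("BIC:", bic_small, bic_large)
```

In this made-up case AIC favors the larger model while BIC favors the smaller one, because the BIC penalty per parameter, $\ln n$, exceeds 2 once $n > e^2 \approx 7.4$.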
Sample Selection
Sample Selection Models:
 Tobit model (Censored model):
 MLE;
 Two-step estimator [Ahn and Powell, 1993];
 Bivariate Sample Selection Model (Type 2 Tobit Model):
 Heckman two-step estimator (Heckman, 1979);
 Roy models (Type 5 Tobit Model)
Simultaneous equations models:
 simultaneous equations Tobit model;
 simultaneous equations Probit model;
 coherency condition;
Specification analysis:
 Heteroscedasticity, serial correlations, and nonnormality;
 Nelson test, 1981;
 Hausman test;
Duration and Count
Duration Regression Models:
 Proportional Hazard;
 Left Censoring;
 Markov Chain Models;
Count Data Models:
 Poisson and Negative Binomial Models;
 Simulated Maximum Likelihood (SML);
Count data models are suitable for samples
taking nonnegative integer values not much greater than zero.
Such models are consistent with the bound and discreteness of the variable,
while not adding too much complexity.
In case of overdispersion (variance-to-mean ratio greater than one),
negative binomial and quasi-Poisson distributions
are considered as alternatives to Poisson distribution,
with variance increasing quadratically and linearly in expectation respectively.
Binomial distribution is considered when the variable is also bounded above.
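A minimal sketch of the overdispersion check and the variance functions mentioned above. It assumes the NB2 parameterization for the negative binomial, $\text{Var}(y) = μ + α μ^2$, and a constant dispersion factor $φ$ for the quasi-Poisson, $\text{Var}(y) = φ μ$. The function names and the sample are hypothetical.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance-to-mean ratio; values well above 1 suggest overdispersion."""
    return variance(counts) / mean(counts)

def var_poisson(mu):
    """Poisson: variance equals the mean."""
    return mu

def var_quasi_poisson(mu, phi):
    """Quasi-Poisson: variance is linear in the mean."""
    return phi * mu

def var_negbin2(mu, alpha):
    """Negative binomial (NB2, assumed parameterization): variance is quadratic in the mean."""
    return mu + alpha * mu ** 2

# Hypothetical count sample with many zeros and a few large values.
counts = [0, 0, 1, 0, 3, 7, 0, 2, 0, 9, 1, 0]
ratio = dispersion_index(counts)
print("variance-to-mean ratio:", round(ratio, 3))

for mu in (1.0, 2.0, 4.0):
    print(mu, var_poisson(mu), var_quasi_poisson(mu, 1.5), var_negbin2(mu, 0.5))
```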
Models for Panel Data
Short Panels.
Multilevel models (hierarchical linear models, nested models)
are statistical models with coefficients organized at more than one level (group).
Depending on the model coefficients (effects), multilevel models can be classified into:
random effects models (variance components models), fixed effects models, and mixed models.
Random effects are estimated with partial pooling, that is, shrinkage (best linear unbiased prediction);
Fixed effects are estimated using least squares or maximum likelihood.
If a statistical model contains both fixed effects and random effects, it is called a mixed model.
Fixed effects are constant across individuals, and random effects vary.
Fixed effects are treated as nuisance parameters,
while estimation of marginal effects is of sole interest.
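Partial pooling can be illustrated with a random-intercept model in which the variance components are treated as known. Under that assumption the predicted group mean is a weighted average of the group's own mean and the grand mean, with weight $w_j = τ^2 / (τ^2 + σ^2 / n_j)$. The sketch below is a simplification: it uses the pooled sample mean as the grand mean, and the function name, data, and variance values are hypothetical.

```python
from statistics import mean

def partial_pool(groups, sigma2, tau2):
    """Shrink each group mean toward the grand mean.

    sigma2: within-group (error) variance, assumed known.
    tau2:   between-group (random effect) variance, assumed known.
    """
    grand = mean(y for ys in groups.values() for y in ys)
    shrunk = {}
    for g, ys in groups.items():
        w = tau2 / (tau2 + sigma2 / len(ys))  # more data -> weight closer to 1
        shrunk[g] = w * mean(ys) + (1 - w) * grand
    return grand, shrunk

# Hypothetical data: group "c" has a single observation and is shrunk the most.
groups = {
    "a": [10.0, 12.0],
    "b": [20.0, 22.0, 21.0, 19.0, 23.0, 21.0],
    "c": [15.0],
}
grand, shrunk = partial_pool(groups, sigma2=4.0, tau2=9.0)
for g, ys in groups.items():
    print(g, "no pooling:", mean(ys), "partial pooling:", round(shrunk[g], 3))
```

Complete pooling would report the grand mean for every group, and no pooling would report each raw group mean; the shrunk values lie between the two.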
Static Panel
Simple Regression Models with Variable Intercepts:
(dependent = time-invariant + time-varying + individual effect + error)
$$y_{it} = z_i α + x_{it} β + u_i + ε_{it}$$
 Pooled Model: disregard time periods;
 Pooled OLS estimator: OLS over panel;
 Individual-specific Effects Model:
 Random Effects (RE) model (random intercept/components model; equicorrelated model):
individual-specific effect uncorrelated with regressors, $\text{corr}(x_{it}, u_i) = 0$;
 Between Estimator: OLS over individual time averages;
 Random Effects Estimator: Feasible GLS over panel; MLE;
 Fixed Effects (FE) Model: individual-specific effect correlates with regressors,
$\text{corr}(x_{it}, u_i) \ne 0$;
 Within Estimator (Fixed Effects Estimator, Least-squares Dummy-variable (LSDV) Estimator,
Covariance Estimator): OLS over panel after subtracting individual time averages;
 First Differences (FD) Estimator: OLS over panel after first-differencing in time;
 Specification Analysis:
 Individual-specific effect;
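The estimators listed above can be compared on a toy balanced panel. The sketch below assumes a single time-varying regressor, no time-invariant regressors $z_i$, no idiosyncratic noise, and an individual effect that is correlated with the regressor (the fixed effects case). All names and data are hypothetical.

```python
def slope_through_origin(x, y):
    """OLS slope without an intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def slope_with_intercept(x, y):
    """OLS slope with an intercept (demean, then regress through the origin)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return slope_through_origin([a - mx for a in x], [b - my for b in y])

# Toy panel: y_it = beta * x_it + u_i, with x_it = 0.5 * u_i + t^2.
beta = 2.0
panel = {}
for i in range(5):
    u_i = float(i)
    xs = [0.5 * u_i + t * t for t in range(3)]
    ys = [beta * x + u_i for x in xs]
    panel[i] = (xs, ys)

# Pooled OLS: stack all observations and ignore the panel structure.
b_pooled = slope_with_intercept(
    [x for xs, _ in panel.values() for x in xs],
    [y for _, ys in panel.values() for y in ys],
)

# Between estimator: OLS over individual time averages.
b_between = slope_with_intercept(
    [sum(xs) / len(xs) for xs, _ in panel.values()],
    [sum(ys) / len(ys) for _, ys in panel.values()],
)

# Within estimator: subtract individual time averages, then OLS.
wx, wy = [], []
for xs, ys in panel.values():
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    wx += [x - mx for x in xs]
    wy += [y - my for y in ys]
b_within = slope_through_origin(wx, wy)

# First-differences estimator: difference adjacent periods, then OLS.
dx, dy = [], []
for xs, ys in panel.values():
    dx += [xs[t] - xs[t - 1] for t in range(1, len(xs))]
    dy += [ys[t] - ys[t - 1] for t in range(1, len(ys))]
b_fd = slope_through_origin(dx, dy)

print("pooled:", b_pooled, "between:", b_between, "within:", b_within, "fd:", b_fd)
```

Because $u_i$ is correlated with $x_{it}$ in this construction, the pooled and between slopes overstate $β$, while the within and first-differences transformations remove $u_i$ and recover it.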
Dynamic Panel
Dynamic Models with Variable Intercepts: (AR(1))
(dependent = time-invariant + time-varying + lagged dependent + individual effect + error)
$$y_{it} = z_i α + x_{it} β + γ y_{i,t-1} + u_i + ε_{it}$$
 Random Effects Models:
 General FGLS Estimator (OLS residuals for error covariance matrix estimation, then FGLS);
 ML Estimator;
 GMM Estimator (IV estimator);
 Fixed Effects Model:
 GMM Estimator (IV estimator);
 General FGLS;
 Fixed Effect GLS Estimator (FEGLS);
 First-difference GLS (FDGLS);
 Transformed ML Estimator;
GMM estimator, specified with a three-part formula:
y ~ covariates | gmm instruments | "normal" instruments
By default, all the variables of the model which are not used as GMM instruments
are used as normal instruments with the same lag structure as the one specified in the model.
Transformation: difference GMM; system GMM.
General FGLS is based on a twostep estimation process:
first a model is estimated by OLS (pooling), fixed effects (within) or first differences (fd),
then its residuals are used to estimate an error covariance matrix for use in a feasible GLS analysis.
This framework allows
the error covariance structure inside every group (individual time series) to be fully unrestricted
and is therefore robust against any type of intragroup heteroskedasticity and serial correlation.
Conversely, this structure is assumed identical across groups
and thus general FGLS estimation is inefficient under groupwise heteroskedasticity.
Efficiency requires $N \gg T$.
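The two-step procedure described above can be sketched for the smallest possible case: $T = 2$ periods, one regressor, no intercept, and pooled OLS as the first step. The $2 \times 2$ within-group error covariance matrix is left unrestricted but is assumed identical across groups. The simulated data and all function names are hypothetical.

```python
import random

def simulate(n, beta, seed=0):
    """Panel with T = 2, serially correlated and heteroskedastic errors within groups."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        c = rng.gauss(0, 1)                               # shared shock -> serial correlation
        e = [c + rng.gauss(0, 0.5), c + rng.gauss(0, 2)]  # period-specific variances
        y = [beta * x[t] + e[t] for t in range(2)]
        panel.append((x, y))
    return panel

def pooled_ols(panel):
    """First step: OLS slope over the stacked panel (no intercept)."""
    sxy = sum(x[t] * y[t] for x, y in panel for t in range(2))
    sxx = sum(x[t] * x[t] for x, y in panel for t in range(2))
    return sxy / sxx

def general_fgls(panel):
    """Second step: GLS with the 2x2 covariance estimated from first-step residuals."""
    b0 = pooled_ols(panel)
    res = [[y[t] - b0 * x[t] for t in range(2)] for x, y in panel]
    n = len(panel)
    s11 = sum(r[0] * r[0] for r in res) / n
    s12 = sum(r[0] * r[1] for r in res) / n
    s22 = sum(r[1] * r[1] for r in res) / n
    det = s11 * s22 - s12 * s12
    w11, w12, w22 = s22 / det, -s12 / det, s11 / det      # inverse of the 2x2 matrix
    num = sum(x[0] * (w11 * y[0] + w12 * y[1]) + x[1] * (w12 * y[0] + w22 * y[1])
              for x, y in panel)
    den = sum(x[0] * (w11 * x[0] + w12 * x[1]) + x[1] * (w12 * x[0] + w22 * x[1])
              for x, y in panel)
    return num / den

panel = simulate(n=2000, beta=2.0)
b_ols = pooled_ols(panel)
b_fgls = general_fgls(panel)
print("pooled OLS:", round(b_ols, 3), "general FGLS:", round(b_fgls, 3))
```

With $N = 2000$ and $T = 2$ the covariance matrix has only three free elements, so it is estimated precisely; this is the sense in which the method relies on $N$ being large relative to $T$.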
First difference with IV estimator.
In the dynamic model, pooled OLS is biased upward and inconsistent for $γ$,
because the lagged dependent variable is correlated with the individual effect $u_i$.
GLS and ML estimators are also generally biased.
The within estimator is biased, because eliminating the individual effect causes a correlation
between the transformed error term and the transformed lagged dependent variable.
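The biases described above can be seen in a small simulation. The sketch assumes a pure AR(1) panel with no other regressors, $y_{it} = γ y_{i,t-1} + u_i + ε_{it}$, with standard normal $u_i$ and $ε_{it}$. For the first-difference IV estimator it uses $y_{i,t-2}$ as the instrument for $Δy_{i,t-1}$ (the Anderson-Hsiao choice, an assumption here rather than something these notes specify). All names and settings are hypothetical.

```python
import random

def simulate_ar1_panel(n, t_len, gamma, seed=0, burn_in=50):
    """Simulate n individual series of length t_len after a burn-in period."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        u = rng.gauss(0, 1)
        y = u / (1 - gamma)            # start at the individual's long-run mean
        series = []
        for s in range(burn_in + t_len):
            y = gamma * y + u + rng.gauss(0, 1)
            if s >= burn_in:
                series.append(y)
        panel.append(series)
    return panel

def pooled_ols(panel):
    """OLS of y_it on y_i,t-1 with an intercept, ignoring u_i."""
    xs = [s[t - 1] for s in panel for t in range(1, len(s))]
    ys = [s[t] for s in panel for t in range(1, len(s))]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

def within(panel):
    """OLS after subtracting individual time averages."""
    num = den = 0.0
    for s in panel:
        x, y = s[:-1], s[1:]
        mx, my = sum(x) / len(x), sum(y) / len(y)
        num += sum((a - mx) * (b - my) for a, b in zip(x, y))
        den += sum((a - mx) ** 2 for a in x)
    return num / den

def fd_iv(panel):
    """First-difference IV: instrument the lagged difference with y_i,t-2."""
    num = den = 0.0
    for s in panel:
        for t in range(2, len(s)):
            z = s[t - 2]
            num += z * (s[t] - s[t - 1])
            den += z * (s[t - 1] - s[t - 2])
    return num / den

gamma_true = 0.5
panel = simulate_ar1_panel(n=2000, t_len=6, gamma=gamma_true)
g_pooled, g_within, g_iv = pooled_ols(panel), within(panel), fd_iv(panel)
print("pooled OLS:", round(g_pooled, 3))
print("within:    ", round(g_within, 3))
print("FD-IV:     ", round(g_iv, 3))
```

With a short panel the pooled estimate lies above the true $γ$ and the within estimate below it, while the first-difference IV estimate is centered near the true value but is noisier.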
Complication: Limited Dependent Variable
For FE models:
 Qualitative Choice Models (Discrete Data):
 Incidental Parameters Problem;
 Conditional MLE;

Sample Selection Models (Censored and Truncated Data):
 Trimmed LS estimator for FE model; [@Honore1992]
 Bias-Adjusted Maximum Simulated Likelihood (MSL) Estimator [12.4];
 Auxiliary Models [12.6];
Complication: Cross-Sectionally Dependent Panel Data
 Spatial Approach;
 Factor Approach;
 Cross-sectional Mean Augmented Approach;
 Test of Cross-Sectional Independence;
General FGLS is inefficient under cross-sectional correlation.