Microeconometrics [@Cameron2005]
Online Resources on Cameron:2005
Research design is the overall strategy for the collection, measurement, and analysis of data in the social sciences.
Types of research designs: (Ordered by levels of evidence as in evidence-based practices/medicine.)
Data dimension:
Assumptions in statistical models for causal inference [@Holland1986]: the unit homogeneity assumption; the assumptions of temporal stability and causal transience.
Parametric Methods:
Semiparametric Methods:
Nonparametric Methods:
Binary outcome models:
Multinomial outcome models:
Choice-based sampling: weighted MLE;
Model selection:
Sample Selection Models:
Simultaneous equations models:
Specification analysis:
Duration Regression Models:
Count Data Models:
Count data models suit outcomes that take small non-negative integer values. Such models respect the lower bound and discreteness of the variable without adding much complexity. Under over-dispersion (variance-to-mean ratio greater than one), the negative binomial and quasi-Poisson models are considered as alternatives to the Poisson model, with variance increasing quadratically and linearly in the mean, respectively. The binomial distribution is considered when the variable is also bounded above.
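A minimal sketch of the over-dispersion check and the three variance functions mentioned above. The sample `counts`, the dispersion parameter `phi`, and the negative binomial parameter `alpha` (NB2 parameterization) are made up for illustration; they do not come from any dataset in these notes.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance-to-mean ratio; values well above 1 suggest over-dispersion."""
    return variance(counts) / mean(counts)


def poisson_variance(mu):
    """Poisson: the variance equals the mean."""
    return mu


def quasi_poisson_variance(mu, phi):
    """Quasi-Poisson: the variance is linear in the mean (phi > 1 means over-dispersion)."""
    return phi * mu


def negative_binomial_variance(mu, alpha):
    """Negative binomial (NB2): the variance is quadratic in the mean."""
    return mu + alpha * mu ** 2


# Hypothetical count sample with a few large values.
counts = [0, 1, 0, 3, 7, 0, 2, 12, 1, 0]
ratio = dispersion_ratio(counts)
print(f"variance-to-mean ratio: {ratio:.2f}")

# Compare how the implied variance grows with the mean under each model.
for mu in (1.0, 5.0, 10.0):
    print(mu,
          poisson_variance(mu),
          quasi_poisson_variance(mu, phi=3.0),
          negative_binomial_variance(mu, alpha=0.5))
```

A ratio far above one, as in this toy sample, would argue against a plain Poisson model; which alternative fits better depends on whether the variance grows roughly linearly or quadratically with the mean.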
Short Panels.
Multilevel models (hierarchical linear models, nested models) are statistical models with coefficients organized at more than one level (group). Depending on the model coefficients (effects), multilevel models can be classified into: random effects models (variance components models), fixed effects models, and mixed models. Random effects are estimated with partial pooling, i.e. shrinkage (best linear unbiased prediction); fixed effects are estimated using least squares or maximum likelihood. If a statistical model contains both fixed effects and random effects, it is called a mixed model.
Fixed effects are constant across individuals, and random effects vary. Fixed effects are treated as nuisance parameters, while estimation of the marginal effects is of sole interest.
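Partial pooling can be illustrated with a random-intercept model in which each group mean is shrunk toward the grand mean. The sketch below assumes the within-group variance `sigma2` and the between-group variance `tau2` are known, and it uses the simple pooled mean as the shrinkage target; in practice both variances are estimated and the target is precision-weighted. The data and function name are hypothetical.

```python
from statistics import mean


def partial_pooling(groups, sigma2, tau2):
    """Shrink each group mean toward the pooled mean.

    groups: dict mapping a group name to its list of observations.
    sigma2: within-group (error) variance, assumed known here.
    tau2:   between-group (random effect) variance, assumed known here.
    """
    grand_mean = mean(y for ys in groups.values() for y in ys)
    estimates = {}
    for name, ys in groups.items():
        # Weight on the group's own mean; it grows with the group size.
        weight = tau2 / (tau2 + sigma2 / len(ys))
        estimates[name] = weight * mean(ys) + (1 - weight) * grand_mean
    return estimates


# Hypothetical data: a small group and a large group.
groups = {
    "a": [4.0, 6.0],
    "b": [9.0, 10.0, 11.0, 10.0, 10.0, 10.0, 10.0, 10.0],
}
estimates = partial_pooling(groups, sigma2=4.0, tau2=2.0)
print(estimates)
```

The small group is pulled much further toward the pooled mean than the large one, which is the sense in which random effects are "partially pooled".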
Simple Regression Models with Variable Intercepts: (dep = time-inv + time-var + individual + error)
$$y_{it} = z_i α + x_{it} β + u_i + ε_{it}$$
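A small simulation sketch of why the within (fixed effects) transformation helps in this model. It assumes a single time-varying regressor that is correlated with $u_i$ and drops the time-invariant term $z_i α$, which the within transformation would remove anyway. The sample sizes, seed, and true $β = 2$ are arbitrary choices.

```python
import random


def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_unit(values, T):
    """Subtract each unit's time average; values are ordered by unit, T per unit."""
    out = []
    for start in range(0, len(values), T):
        block = values[start:start + T]
        m = sum(block) / T
        out.extend(v - m for v in block)
    return out


rng = random.Random(0)
N, T, beta = 500, 5, 2.0
x, y = [], []
for i in range(N):
    u = rng.gauss(0, 1)                  # individual effect u_i
    for t in range(T):
        x_it = u + rng.gauss(0, 1)       # regressor correlated with u_i
        x.append(x_it)
        y.append(beta * x_it + u + rng.gauss(0, 1))

pooled = slope(x, y)                                          # ignores u_i
within = slope(demean_by_unit(x, T), demean_by_unit(y, T))    # removes u_i
print(f"pooled OLS: {pooled:.3f}, within: {within:.3f}, true beta: {beta}")
```

Because $u_i$ is constant over time, subtracting unit means removes it, so the within slope should sit near the true value while the pooled slope absorbs the correlation between $x_{it}$ and $u_i$.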
Dynamic Models with Variable Intercepts: (AR(1)) (dep = time-inv + time-var + lag_dep + individual + error)
$$y_{it} = z_i α + x_{it} β + γ y_{i,t-1} + u_i + ε_{it}$$
GMM estimator.
`y ~ covariates | gmm instruments | "normal" instruments`

By default, all the variables of the model which are not used as GMM instruments are used as normal instruments, with the same lag structure as the one specified in the model.
Transformation: difference GMM; system GMM.
General FGLS is based on a two-step estimation process: first a model is estimated by OLS (pooling), fixed effects (within) or first differences (fd), then its residuals are used to estimate an error covariance matrix for use in a feasible-GLS analysis. This framework allows the error covariance structure inside every group (individual time series) to be fully unrestricted and is therefore robust against any type of intragroup heteroskedasticity and serial correlation. Conversely, this structure is assumed identical across groups and thus general FGLS estimation is inefficient under groupwise heteroskedasticity. Efficiency requires $N \gg T$.
First difference with IV estimator.
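In dynamic panel models, pooled OLS is inconsistent and its estimate of $γ$ is biased upward, because the lagged dependent variable is positively correlated with the individual effect $u_i$. GLS and ML estimators are also generally biased. The within estimator is biased in short panels, because eliminating the individual effect causes a correlation between the transformed error term and the transformed lagged dependent variable.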
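The direction of these biases can be checked in a simulation. The sketch below assumes a pure AR(1) panel with no other covariates, $γ = 0.5$, and unit-variance $u_i$ and $ε_{it}$. It compares pooled OLS, the within estimator, and a first-difference IV estimator that instruments $Δy_{i,t-1}$ with the level $y_{i,t-2}$ (the Anderson-Hsiao approach, named here as one common choice of instrument). All settings and function names are illustrative.

```python
import random


def simulate_panel(N, T, gamma, rng, burn_in=50):
    """Return N series of length T from y_it = gamma * y_{i,t-1} + u_i + e_it."""
    panel = []
    for _ in range(N):
        u = rng.gauss(0, 1)
        y = u / (1 - gamma)              # start at the unit's long-run mean
        series = []
        for t in range(burn_in + T):
            y = gamma * y + u + rng.gauss(0, 1)
            if t >= burn_in:             # keep only the last T periods
                series.append(y)
        panel.append(series)
    return panel


def pooled_ols(panel):
    """Regress y_it on y_{i,t-1} with a common intercept."""
    pairs = [(s[t - 1], s[t]) for s in panel for t in range(1, len(s))]
    mx = sum(a for a, _ in pairs) / len(pairs)
    my = sum(b for _, b in pairs) / len(pairs)
    return (sum((a - mx) * (b - my) for a, b in pairs)
            / sum((a - mx) ** 2 for a, _ in pairs))


def within(panel):
    """Same regression after subtracting unit-specific means."""
    num = den = 0.0
    for s in panel:
        lag, cur = s[:-1], s[1:]
        ml, mc = sum(lag) / len(lag), sum(cur) / len(cur)
        num += sum((a - ml) * (b - mc) for a, b in zip(lag, cur))
        den += sum((a - ml) ** 2 for a in lag)
    return num / den


def anderson_hsiao(panel):
    """IV on first differences, instrumenting dy_{i,t-1} with the level y_{i,t-2}."""
    num = den = 0.0
    for s in panel:
        for t in range(2, len(s)):
            z = s[t - 2]
            num += z * (s[t] - s[t - 1])
            den += z * (s[t - 1] - s[t - 2])
    return num / den


rng = random.Random(42)
gamma = 0.5
panel = simulate_panel(N=3000, T=6, gamma=gamma, rng=rng)
ols, fe, ah = pooled_ols(panel), within(panel), anderson_hsiao(panel)
print(f"true gamma: {gamma}, pooled OLS: {ols:.3f}, within: {fe:.3f}, FD-IV: {ah:.3f}")
```

With a short panel and many units, pooled OLS should land above the true $γ$, the within estimator below it, and the first-difference IV estimate close to it, though the IV estimate is noisier.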
For FE models:
Sample Selection Models (Censored and Truncated Data):
General FGLS is inefficient under cross-sectional correlation.
Notes of @Hsiang2016.
Climate $C_{iτ}$ and weather $c_{iτ}$ are vectors of $K$ parameters specifying respectively the probability and empirical distributions of atmosphere-ocean states (temperature, rainfall, humidity, etc.) at location $i$ during period $τ$. Climate may affect an outcome directly through weather or through individual decision (belief): $Y(C) = Y(c(C), b(C))$, with (marginal) direct effect $∂Y/∂c$ and belief effect $∂Y/∂b$. Adaptation refers to the belief effects and the interactions between belief and direct effects $∂^2Y/∂b∂c$. Average treatment effect for climate change under current climatic and non-climatic factors: $β = E[Y|C+ΔC, x] - E[Y|C, x]$.
Non-experimental research designs to estimate $β$:
Frequency-identification trade-off: Low observation frequency might capture belief effects, but the unit gets less comparable to itself between observations.
A partial test of marginal treatment comparability: If the estimates are stable across all temporal frequencies from unfiltered time-series to long differences to cross section, the marginal treatment comparability assumption is more plausible.
The marginal effects of climate and weather are identical if the agent adapts its beliefs/actions to the climate so as to consistently maximize the outcome, and the outcome is a differentiable function of beliefs/actions. The total effect of climate change is then the integral of the marginal effects of weather, which can be computed using time-series estimates.
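The argument can be written out with the chain rule. This is a sketch that treats $C$, $c$ and $b$ as scalars ($K = 1$) for readability; the notation $b^*(C)$ for the outcome-maximizing belief and $C'$ for the integration variable is introduced here and is not in the notes above. Differentiating $Y(C) = Y(c(C), b(C))$ gives

$$\frac{dY}{dC} = \frac{∂Y}{∂c} \frac{dc}{dC} + \frac{∂Y}{∂b} \frac{db}{dC}$$

If beliefs are chosen to maximize the outcome, the first-order condition $∂Y/∂b = 0$ holds at $b = b^*(C)$, so the belief term vanishes:

$$\frac{dY}{dC} = \frac{∂Y}{∂c} \frac{dc}{dC}$$

Integrating over the change in climate then gives the total effect in terms of marginal weather effects only:

$$Y(C + ΔC) - Y(C) = \int_{C}^{C + ΔC} \frac{∂Y}{∂c} \frac{dc}{dC'} \, dC'$$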
Climate should be parameterized into variables/measures that most strongly influence social or economic outcomes.
Important aspects in reduced-form econometric models of climate effects (dose-response function, regression function):

1. Nonlinear effects: nonlinear response functions at the resolution of weather data can be recovered despite aggregated outcome data.
2. Spatial and temporal displacement: distribution of the net effect in time (harvesting: advancing an expected event; delayed effect: effects dominate after the event) and in space (transmitted effect: effects spread across locations; remote effect: effects dominate at other locations).
3. Statistical uncertainty: estimates of standard errors may be biased, due to spatial and temporal autocorrelation in climate data.
4. Measurement of adaptation:
    - if the adaptive action is known and observed directly as an outcome, the climate effect on adaptation can be estimated;
    - for an outcome influenced by adaptation, the cross-sectional estimate captures all belief effects along with all direct effects, while time-series estimates stratified by proxies of adaptive actions can measure the overall net effect of all adaptive actions.
5. Meta-analysis (cross-study comparison and synthesis): results across studies can be combined to fit a response function applicable to all populations.
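The point about nonlinear effects can be illustrated with a toy simulation: if the daily response is a step function of temperature, regressing the annual outcome on the annual counts of days in each temperature bin recovers the daily effects, even though the outcome is only observed once per year. The two-bin setup, the 30-degree threshold, the effect sizes, and the noise levels below are all hypothetical.

```python
import random

rng = random.Random(1)
HOT_THRESHOLD = 30.0                   # hypothetical bin boundary
EFFECT_COOL, EFFECT_HOT = 1.0, -2.0    # hypothetical per-day effects on the outcome
DAYS = 365

n_cool, n_hot, outcome = [], [], []
for unit_year in range(400):
    mean_temp = rng.uniform(15, 30)    # climates differ across unit-years
    temps = [rng.gauss(mean_temp, 6) for _ in range(DAYS)]
    hot = sum(t > HOT_THRESHOLD for t in temps)
    cool = DAYS - hot
    # Only the annual total is observed, plus annual-level noise.
    y = EFFECT_COOL * cool + EFFECT_HOT * hot + rng.gauss(0, 5)
    n_cool.append(cool)
    n_hot.append(hot)
    outcome.append(y)

# OLS of the annual outcome on the two bin counts (no intercept, since the
# counts sum to 365): solve the 2x2 normal equations with Cramer's rule.
s_cc = sum(c * c for c in n_cool)
s_hh = sum(h * h for h in n_hot)
s_ch = sum(c * h for c, h in zip(n_cool, n_hot))
s_cy = sum(c * y for c, y in zip(n_cool, outcome))
s_hy = sum(h * y for h, y in zip(n_hot, outcome))
det = s_cc * s_hh - s_ch ** 2
b_cool = (s_cy * s_hh - s_ch * s_hy) / det
b_hot = (s_cc * s_hy - s_ch * s_cy) / det
print(f"estimated per-day effects: cool {b_cool:.3f}, hot {b_hot:.3f}")
```

The same idea extends to more bins or to other basis functions of daily weather, as long as the daily transformation is applied before aggregating to the frequency of the outcome data.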
Attribute historical impacts and project future impacts of climate change under different scenarios, typically using models of partial equilibrium responses. General equilibrium responses include factor reallocation and price change, but are rarely studied.
Climate can affect an outcome via many mechanisms/pathways, and a specific mechanism/pathway may be isolated and estimated in a structural model.
Note that adaptation costs are almost never measured.