Hypothesis testing is a statistical inference technique that uses a sample to give probabilistic conclusions on the underlying population.

In statistics, a hypothesis is a statement about population parameters. Typically we have two mutually exclusive hypotheses: one that is doubted, called the null hypothesis \( H_0 \); the other that is believed, called the alternative hypothesis \( H_1 \).

A hypothesis test is a rule that specifies, for every possible sample, whether to reject or (passively) accept the null hypothesis. Theoretically, a hypothesis test divides the sample space into a rejection region and an acceptance region. In practice, we define a test statistic as a real-valued function of the sample, together with a rejection interval for that statistic.

The p-value of a test statistic is the probability, computed under the null hypothesis, of observing a test statistic at least as extreme as the value actually observed. The significance level of a test is a conventional upper bound on the p-value: when the p-value falls below it, we reject the null and call the result statistically significant.
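As a minimal sketch of the p-value definition, assume the test statistic follows a standard normal distribution under the null hypothesis (an assumption made here for illustration, not something the text requires). The function name `two_sided_p_value`, the observed value, and the 0.05 significance level are all hypothetical choices.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) for Z ~ N(0, 1), i.e. 'at least as extreme' in both tails."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


alpha = 0.05        # conventional significance level (assumed)
z_observed = 2.1    # hypothetical observed test statistic
p = two_sided_p_value(z_observed)
reject = p < alpha  # reject the null when the p-value falls below alpha
print(f"p-value = {p:.4f}, reject H0: {reject}")
```

A one-sided test would use only one tail, so the factor of 2 would be dropped.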

A typical hypothesis testing procedure:

  1. State the null and alternative hypotheses: \( H_0: \theta \in \Theta_0; \quad H_1: \theta \in \Theta_0^C \).
  2. State the hypothesis test: test statistic \( W(\mathbf{X})\); rejection interval \(R\).
  3. Sample and compute the test statistic.
  4. Reject or passively accept the null hypothesis.
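
The four steps above can be sketched for one concrete case: a two-sided z-test of \( H_0: \mu = \mu_0 \) with known standard deviation. This is only an illustration under stated assumptions (normal data, known \( \sigma \)); the function name `z_test` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0, assuming a known population sigma."""
    n = len(sample)
    # Step 2: test statistic W(X) and rejection interval |W| > critical value.
    w = (fmean(sample) - mu0) / (sigma / sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 4: reject if W falls in the rejection interval.
    return w, critical, abs(w) > critical


# Step 3: a hypothetical sample.
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7, 5.4, 5.9]
w, critical, reject = z_test(sample, mu0=5.0, sigma=0.5)
print(f"W = {w:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```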

Background: Scientific Methodology

A working assumption is a statement used to construct scientific theories. A working assumption is neither logically true nor logically false; it should be falsifiable and frequently examined. You start from an assumption that you believe is at least partially true, and the results will confirm it, reject it, or suggest a modification to it.

In contrast, a core assumption is a proposition that is either logically true or accepted as a fundamental principle for pragmatic use. Core assumptions are thus rarely examined.

Test Statistics

t-statistic

The t-statistic under the null \( H_0: \beta_i = \beta^{*} \) is:

\[ t = \frac{\hat{\beta}_i - \beta^{*}}{\widehat{\text{s.e.}}(\hat{\beta}_i)} \]
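
A hedged sketch of this formula for the simplest case, a single-predictor least-squares regression with an intercept. The standard error used here, \( \sqrt{\text{RSS}/(n-2) / S_{xx}} \), is the usual one for that setting; the function name `slope_t_statistic` and the data are hypothetical.

```python
from math import sqrt
from statistics import fmean


def slope_t_statistic(x, y, beta_star=0.0):
    """t-statistic for H0: slope = beta_star in simple linear regression."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = sxy / sxx                    # least-squares slope estimate
    intercept = y_bar - beta_hat * x_bar
    rss = sum((yi - intercept - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    se = sqrt(rss / (n - 2) / sxx)          # estimated standard error of the slope
    return (beta_hat - beta_star) / se


x = [1, 2, 3, 4, 5]                # hypothetical predictor values
y = [2.1, 3.9, 6.2, 7.8, 10.1]     # hypothetical responses
print(slope_t_statistic(x, y))       # tests H0: slope = 0
print(slope_t_statistic(x, y, 2.0))  # tests H0: slope = 2
```

Under the usual normal-error assumptions this statistic is compared against a t distribution with \( n - 2 \) degrees of freedom.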

F-statistic

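The F-statistic for the overall significance of a regression with \( p \) predictors fitted to \( n \) observations, where MSS is the model sum of squares and RSS is the residual sum of squares: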

\[ F = \frac{\text{MSS} / p}{\text{RSS} / (n - p - 1)} \]
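
A small sketch of this ratio, assuming the fitted values come from a least-squares fit with an intercept, so that MSS can be computed as the squared deviations of the fitted values from the mean response. The function name `f_statistic` and the numbers are hypothetical.

```python
from statistics import fmean


def f_statistic(y, fitted, p):
    """F = (MSS / p) / (RSS / (n - p - 1)) from responses and fitted values."""
    n = len(y)
    y_bar = fmean(y)
    mss = sum((f - y_bar) ** 2 for f in fitted)           # model sum of squares
    rss = sum((yi - f) ** 2 for yi, f in zip(y, fitted))  # residual sum of squares
    return (mss / p) / (rss / (n - p - 1))


y = [2.1, 3.9, 6.2, 7.8, 10.1]          # hypothetical responses
fitted = [2.04, 4.03, 6.02, 8.01, 10.0]  # hypothetical fitted values, one predictor
F = f_statistic(y, fitted, p=1)
print(f"F = {F:.1f}")
```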

One-way ANOVA is an omnibus test: given several independent groups, it determines whether any of the group means differ significantly from one another, without saying which ones. If there are only two groups, the one-way ANOVA F-test satisfies \(F = t^2\) and is equivalent to the two-sample t-test with pooled variance.
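
The two-group identity \(F = t^2\) can be checked numerically. The sketch below computes the one-way ANOVA F-statistic and the pooled-variance two-sample t-statistic from their textbook definitions; the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import fmean


def one_way_anova_f(groups):
    """Between-group mean square divided by within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([v for g in groups for v in g])
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def pooled_t(a, b):
    """Two-sample t-statistic with pooled variance (equal-variance assumption)."""
    na, nb = len(a), len(b)
    ss_a = sum((v - fmean(a)) ** 2 for v in a)
    ss_b = sum((v - fmean(b)) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (fmean(a) - fmean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


a = [4.1, 5.0, 4.8, 5.3]        # hypothetical group 1
b = [5.9, 6.4, 6.1, 7.0, 6.6]   # hypothetical group 2
print(one_way_anova_f([a, b]), pooled_t(a, b) ** 2)  # the two values agree
```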

Because it is a single omnibus test, ANOVA avoids the inflated Type I error probability that arises from running multiple pairwise t-tests.
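
To illustrate the size of that inflation, the sketch below computes the probability of at least one false rejection across all pairwise t-tests, under the simplifying assumption that the tests are independent (pairwise comparisons on shared groups are not exactly independent, so treat this as a rough guide). The function name `familywise_error_rate` is hypothetical.

```python
from math import comb


def familywise_error_rate(k_groups: int, alpha: float = 0.05) -> float:
    """P(at least one Type I error) over all pairwise tests, assuming independence."""
    m = comb(k_groups, 2)          # number of pairwise comparisons
    return 1 - (1 - alpha) ** m


for k in (2, 3, 4, 5):
    print(k, comb(k, 2), round(familywise_error_rate(k), 4))
```

With four groups there are six pairwise tests, and the chance of at least one false rejection at the 0.05 level is already about 0.26 under this assumption.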

Three main assumptions of ANOVA:

  1. The residuals within each group are normally distributed (ANOVA is fairly robust to violations of normality).
  2. All groups have the same variance.
  3. Observations are independent.

ANOVA table

Non-parametric alternative: Kruskal-Wallis H-test.
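
A minimal sketch of the Kruskal-Wallis H statistic, which replaces the observations by their ranks in the pooled sample: \( H = \frac{12}{N(N+1)} \sum_i R_i^2 / n_i - 3(N+1) \), where \( R_i \) is the rank sum of group \( i \). This version assigns average ranks to ties but omits the tie correction factor, and it does not compute a p-value; the function name and the data are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (average ranks for ties, no tie correction)."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    # Map each distinct value to the average of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # average of ranks i+1 .. j
        i = j
    total = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical, fully separated groups
print(kruskal_wallis_h(groups))
```

For moderately large groups, H is usually compared against a chi-squared distribution with one fewer degree of freedom than the number of groups.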

To determine which specific groups differ from one another, you need to use a post hoc test:

  • Homoskedasticity satisfied: Tukey's honestly significant difference (HSD) post hoc test.
  • Homoskedasticity violated: Games-Howell post hoc test.

A factor is a categorical/discrete variable; a level is a possible value of a factor. A contrast is a linear combination of factor level means whose coefficients sum to zero. Two contrasts are orthogonal if their coefficient vectors are orthogonal, that is, if the products of corresponding coefficients sum to zero. A simple contrast is the difference between two factor level means.
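
These definitions translate directly into a few checks on coefficient vectors. The sketch below is illustrative only: the helper names, the tolerance, and the level means are hypothetical, and the dot-product test for orthogonality assumes equal group sizes.

```python
def is_contrast(coeffs, tol=1e-12):
    """A contrast's coefficients sum to zero."""
    return abs(sum(coeffs)) <= tol


def are_orthogonal(c1, c2, tol=1e-12):
    """Orthogonality as a zero dot product (equal group sizes assumed)."""
    return abs(sum(a * b for a, b in zip(c1, c2))) <= tol


def contrast_value(coeffs, level_means):
    """Linear combination of the factor level means."""
    return sum(c * m for c, m in zip(coeffs, level_means))


level_means = [10.0, 12.0, 17.0]   # hypothetical means of a three-level factor
simple = [1, -1, 0]                # simple contrast: level 1 minus level 2
pooled_vs_third = [0.5, 0.5, -1]   # average of levels 1 and 2 minus level 3

print(is_contrast(simple), is_contrast(pooled_vs_third))
print(are_orthogonal(simple, pooled_vs_third))
print(contrast_value(simple, level_means), contrast_value(pooled_vs_third, level_means))
```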

multiple comparison

Likelihood Ratio Test (LRT)

Def: likelihood ratio statistic

Def: likelihood ratio test

Def: Nuisance parameter

Uniformly Most Powerful (UMP) Test

Def: Type I Error, Type II Error

Def: power function

Def: size

Def: level

Def: unbiased

Def: uniformly most powerful (UMP) test

Thm: (Neyman-Pearson)

Def: monotone likelihood ratio (MLR)

Thm: (Karlin-Rubin)

Other Tests

Wald, LM, LR, J


🏷 Category=Statistics