Statistical classification, or discrimination, is the problem of assigning categories to observations. If observations are drawn from $K$ distinct populations with densities $f_i$, $i = 1, \dots, K$, and we would like a deterministic criterion that assigns each observation to its population of origin, the discriminant rule based on maximum likelihood is $I(x) = \arg\max_{i \in \{1, \dots, K\}} f_i(x)$. Discriminant analysis comprises methods that provide estimators $\hat I(x)$ of such discriminant rules.
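As a minimal sketch of the maximum likelihood rule, the snippet below evaluates each population density at $x$ and returns the index of the largest one. The three univariate Gaussian populations and the helper names `normal_pdf` and `ml_rule` are assumptions made for illustration; in practice the densities are unknown and must be estimated.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def ml_rule(x, densities):
    """Return the 1-based index i that maximizes f_i(x)."""
    values = [f(x) for f in densities]
    return 1 + max(range(len(values)), key=values.__getitem__)

# Hypothetical populations (assumed for illustration): N(0, 1), N(3, 1), N(6, 4).
densities = [
    lambda x: normal_pdf(x, 0.0, 1.0),
    lambda x: normal_pdf(x, 3.0, 1.0),
    lambda x: normal_pdf(x, 6.0, 2.0),
]

# Assign three observations to their most likely population of origin.
labels = [ml_rule(x, densities) for x in (-1.0, 2.0, 7.0)]
```

With these assumed populations, each observation is assigned to the population whose mean is nearest relative to its spread.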

Notations: $K$, number of classes; $n$, number of observations; $p$, number of features.


Linear discriminant analysis (LDA) assumes Gaussian populations with identical covariance matrices, which gives a linear discriminant rule [@Fisher1936]. Quadratic discriminant analysis (QDA) assumes Gaussian populations with class-specific covariance matrices, which gives a quadratic discriminant rule. LDA and QDA are suitable for data with small $n$ or well-separated classes, and both handle $K > 2$.
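To see where the names come from, here is a sketch of the derivation. It assumes population $i$ is Gaussian with mean $\mu_i$ and covariance $\Sigma_i$ (symbols introduced here for illustration). Taking the logarithm of the Gaussian density in the maximum likelihood rule and dropping the constant common to all classes gives the QDA rule, which is quadratic in $x$:

$$
I(x) = \arg\max_{i} \left\{ -\tfrac{1}{2} \log |\Sigma_i| - \tfrac{1}{2} (x - \mu_i)^\top \Sigma_i^{-1} (x - \mu_i) \right\}
$$

If all classes share one covariance matrix, $\Sigma_i = \Sigma$, the terms $\log |\Sigma|$ and $x^\top \Sigma^{-1} x$ no longer depend on $i$ and drop out, leaving the LDA rule, which is linear in $x$:

$$
I(x) = \arg\max_{i} \left\{ x^\top \Sigma^{-1} \mu_i - \tfrac{1}{2} \mu_i^\top \Sigma^{-1} \mu_i \right\}
$$

The estimators $\hat I(x)$ are typically obtained by plugging in sample means and sample covariance matrices for $\mu_i$ and $\Sigma_i$ (or the pooled $\Sigma$).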

Naive Bayes is suitable for data with large $p$.
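Naive Bayes treats the features as conditionally independent given the class, so each class density factorizes into $p$ univariate densities and no $p \times p$ covariance matrix has to be estimated. The sketch below is one possible Gaussian variant in plain Python; the function names, the variance floor, and the toy data are assumptions made for illustration, not a reference implementation.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the prior, per-feature means, and per-feature variances of each class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    n = len(X)
    model = {}
    for label, rows in groups.items():
        columns = list(zip(*rows))
        means = [sum(c) / len(c) for c in columns]
        # A small floor on the variance avoids division by zero (an assumption).
        variances = [
            max(sum((v - m) ** 2 for v in c) / len(c), 1e-9)
            for c, m in zip(columns, means)
        ]
        model[label] = (len(rows) / n, means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the largest log prior plus summed univariate log densities."""
    def log_score(params):
        prior, means, variances = params
        total = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            total += -0.5 * math.log(2.0 * math.pi * s2) - (v - m) ** 2 / (2.0 * s2)
        return total
    return max(model, key=lambda label: log_score(model[label]))

# Hypothetical toy data with p = 3 features and K = 2 classes.
X = [[0.0, 0.1, -0.2], [0.2, -0.1, 0.1], [-0.1, 0.0, 0.2],
     [2.0, 2.1, 1.9], [2.2, 1.8, 2.1], [1.9, 2.0, 2.0]]
y = ["a", "a", "a", "b", "b", "b"]

model = fit_gaussian_nb(X, y)
pred_a = predict_gaussian_nb(model, [0.1, 0.0, 0.0])
pred_b = predict_gaussian_nb(model, [2.0, 2.0, 2.0])
```

The number of estimated parameters per class grows linearly in $p$ here, which is one reason the method stays usable when $p$ is large.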

(Kernel) support vector machine (SVM): SVM is computationally efficient with nonlinear kernels and suitable for data with well-separated classes, but in its basic form it is limited to $K = 2$.

Figure: Typical ML classification pipeline

🏷 Category=Computation Category=Machine Learning