Circular statistics concerns statistical data analysis on the unit circle.
The unit circle $\mathbb{S}^1$ is isometric to the submanifold in the plane defined by $\{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$, where each point on the circle is identified with a pair of coordinates, $p \leftrightarrow (x, y)$. A point on the unit circle is often represented as an angle relative to a reference direction, such as the x axis, via counterclockwise rotation. This angle $\theta$ is uniquely defined in $[0, 2\pi)$, while $\theta + 2 b \pi$ represents the same point for any integer $b$. Alternatively, the unit circle can be identified with the complex numbers of unit modulus, $\{z \in \mathbb{C} : |z| = 1\}$, ignoring the geometry. In this case, each point is related to its angle representations via $z = e^{i \theta}$. Thus the angle $\theta$ can also be interpreted as a phase or complex argument. In summary, every point $p \in \mathbb{S}^1$ on the unit circle can be represented uniquely as Cartesian coordinates $(\cos\theta, \sin\theta)$ or a complex number $e^{i \theta}$, or nonuniquely as a phase angle $\theta$.
By embedding the unit circle in the plane, a circular random variable becomes a random vector of unit length. We call the sum of multiple vectors in a vector space their resultant vector: $\mathbf{r} = \sum_{i=1}^n \mathbf{v}_i$. Thus, the resultant vector of a random sample $(\mathbf{x}_i)_{i=1}^n$ of an embedded circular random variable is $\mathbf{r} = \sum_{i=1}^n \mathbf{x}_i$, with coordinates $(C, S) := (\sum_{i=1}^n \cos\theta_i, \sum_{i=1}^n \sin\theta_i)$. The resultant length $R = |\mathbf{r}|$ of the sample can be computed as $R = \sqrt{C^2 + S^2}$.
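The resultant of a sample of angles can be computed directly from the definitions of $C$, $S$, and $R$. The following is a minimal sketch using only the standard library; the function name `resultant` is a hypothetical choice for illustration, and angles are assumed to be in radians.

```python
import math

def resultant(angles):
    """Return (C, S, R) for a sample of angles given in radians."""
    C = sum(math.cos(t) for t in angles)  # x coordinate of the resultant vector
    S = sum(math.sin(t) for t in angles)  # y coordinate of the resultant vector
    R = math.hypot(C, S)                  # resultant length sqrt(C^2 + S^2)
    return C, S, R

# Two orthogonal unit vectors: the resultant has length sqrt(2).
C, S, R = resultant([0.0, math.pi / 2])
print(C, S, R)
```

Two antipodal directions, such as $0$ and $\pi$, cancel up to floating-point rounding, so their resultant length is numerically zero.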
The geodesic distance $d_g(p, q)$ between two points on the unit circle is the length of the shortest arc connecting them. Representing the points as directions or angles, it is the smallest amount of rotation required to align them: $d_g(\theta, \phi) = \min\{(\theta - \phi) \bmod 2 \pi, (\phi - \theta) \bmod 2 \pi\}$. If the angles are in $[0, 2\pi)$, then with $d = |\theta - \phi|$ the geodesic distance can be computed as $d_g = \min(d, 2 \pi - d)$ or, equivalently, $d_g = \pi - |\pi - d|$.
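As a sketch of the geodesic distance, the code below reduces the angle difference modulo $2\pi$ first, so the inputs do not need to lie in a particular interval. The function name `geodesic_distance` is illustrative and not taken from any library.

```python
import math

TWO_PI = 2.0 * math.pi

def geodesic_distance(theta, phi):
    """Length of the shortest arc between two angles given in radians."""
    d = (theta - phi) % TWO_PI    # Python's % with a positive divisor is nonnegative
    return min(d, TWO_PI - d)     # take the shorter of the two arcs

# Directions just either side of the reference direction are close on the circle,
# even though the angle values differ by almost 2*pi.
print(geodesic_distance(0.1, TWO_PI - 0.1))
```

The same value can be obtained as `math.pi - abs(math.pi - d)`, which avoids the call to `min`.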
The extrinsic center of mass $\bar{\mathbf{x}}$, or mean resultant vector $\bar{\mathbf{r}}$, is the resultant vector divided by the sample size: $\bar{\mathbf{x}} = \mathbf{r} / n$. Its length is the mean resultant length $\bar{R} = R / n$, and its direction is the mean direction $\bar{\theta} = \arg\bar{\mathbf{r}}$ (defined if $\bar{R} \ne 0$). The mean resultant vector has coordinates $(\bar{C}, \bar{S}) := n^{-1} (C, S)$, where the sample cosine moment is $\bar{C} = n^{-1} \sum_{i=1}^n \cos\theta_i$ and the sample sine moment is $\bar{S} = n^{-1} \sum_{i=1}^n \sin\theta_i$.
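A minimal sketch of the mean resultant length and mean direction follows. The function name `mean_resultant` and the tolerance `tol`, used to decide when $\bar{R}$ counts as zero in floating point, are assumptions made for illustration.

```python
import math

def mean_resultant(angles, tol=1e-12):
    """Return (R_bar, theta_bar); theta_bar is None when R_bar is numerically zero."""
    n = len(angles)
    C_bar = sum(math.cos(t) for t in angles) / n   # sample cosine moment
    S_bar = sum(math.sin(t) for t in angles) / n   # sample sine moment
    R_bar = math.hypot(C_bar, S_bar)               # mean resultant length
    if R_bar < tol:                                # mean direction is undefined
        return R_bar, None
    theta_bar = math.atan2(S_bar, C_bar) % (2.0 * math.pi)
    return R_bar, theta_bar

# 350 and 10 degrees straddle the reference direction.
sample = [math.radians(350.0), math.radians(10.0)]
print(mean_resultant(sample))
```

For this sample the arithmetic mean of the angle values is 180 degrees, which points away from both observations, whereas the mean direction coincides with the reference direction up to rounding.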
The mean direction $\bar{\theta}$ is a measure of location. For Euclidean data, centering a sample eliminates the resultant: $\sum_{i=1}^n (x_i - \bar{x}) = 0$. For circular data, centering a sample rotates the resultant to the x axis: $\left(\sum_{i=1}^n \cos(\theta_i - \bar{\theta}), \sum_{i=1}^n \sin(\theta_i - \bar{\theta})\right) = (R, 0)$.
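The centering identity for circular data can be checked numerically. In this sketch the helper name `centered_resultant` and the sample values are arbitrary choices for illustration.

```python
import math

def centered_resultant(angles):
    """Return ((C_c, S_c), R): the resultant of the centered sample and the
    resultant length of the original sample."""
    C = sum(math.cos(t) for t in angles)
    S = sum(math.sin(t) for t in angles)
    theta_bar = math.atan2(S, C)                        # mean direction
    C_c = sum(math.cos(t - theta_bar) for t in angles)  # centered cosine sum
    S_c = sum(math.sin(t - theta_bar) for t in angles)  # centered sine sum
    return (C_c, S_c), math.hypot(C, S)

(C_c, S_c), R = centered_resultant([0.2, 1.0, 1.7, 5.9])
print(C_c, S_c, R)  # C_c agrees with R and S_c vanishes, up to rounding
```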
The mean resultant length $\bar{R}$ is a measure of concentration. Since the extrinsic center of mass lies in the closed unit disk, we have $\bar{R} \in [0, 1]$. It is close to one if and only if the sample is tightly clustered; it is close to zero if the sample is widely dispersed or antipodally symmetric. The sample circular variance $V$ is a measure of dispersion, defined as one minus the mean resultant length: $V = 1 - \bar{R}$; clearly it is also in $[0, 1]$. For Euclidean data, the sample variance is the minimum over $a$ of the sample mean of the squared distance $(x_i - a)^2$, attained at $a = \bar{x}$. For circular data, the sample circular variance is the minimum over $\alpha$ of the sample mean of the dissimilarity measure $1 - \cos(\theta_i - \alpha)$, attained at $\alpha = \bar{\theta}$. The sample circular standard deviation is $v = \sqrt{-2 \log(\bar{R})}$, which is in $[0, \infty]$.
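The sketch below computes the circular variance and circular standard deviation, and evaluates the mean dissimilarity $n^{-1} \sum_i (1 - \cos(\theta_i - \alpha))$ so that the minimization property can be checked numerically. All function names are hypothetical. Clamping $\bar{R}$ to at most one is an added guard against floating-point rounding, not part of the definition.

```python
import math

def mean_resultant(angles):
    """Return (R_bar, theta_bar) for a sample of angles in radians."""
    n = len(angles)
    C_bar = sum(math.cos(t) for t in angles) / n
    S_bar = sum(math.sin(t) for t in angles) / n
    return math.hypot(C_bar, S_bar), math.atan2(S_bar, C_bar)

def circular_variance(angles):
    """V = 1 - R_bar."""
    return 1.0 - mean_resultant(angles)[0]

def circular_std(angles):
    """v = sqrt(-2 log R_bar); infinite when R_bar is exactly zero."""
    R_bar = min(mean_resultant(angles)[0], 1.0)  # guard: rounding can exceed 1
    if R_bar == 0.0:
        return math.inf
    return math.sqrt(-2.0 * math.log(R_bar))

def mean_dissimilarity(angles, alpha):
    """Sample mean of 1 - cos(theta_i - alpha)."""
    return sum(1.0 - math.cos(t - alpha) for t in angles) / len(angles)

sample = [0.1, 0.4, 0.9, 6.0]
R_bar, theta_bar = mean_resultant(sample)
print(circular_variance(sample), mean_dissimilarity(sample, theta_bar))
print(circular_std(sample))
```

The two values on the first printed line agree up to rounding, because the mean dissimilarity equals $1 - \bar{R} \cos(\bar{\theta} - \alpha)$.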
Median direction $\tilde{\theta}$ of a sample of circular data is an intrinsic measure of location, defined as any direction for which (1) the same number of points lie in the left and right semi-circles from the point and (2) more points are closer to the point than to its opposite point. Equivalently, it is any direction that minimizes the mean geodesic distance to the sample: $\tilde{\theta} \in \arg\min_\alpha d_0(\alpha)$, where $d_0(\alpha) = n^{-1} \sum_{i=1}^n d_g(\theta_i, \alpha)$. The minimum value $d_0(\tilde{\theta})$ is called the circular mean deviation, which is the sample mean geodesic distance to any median direction. Note that the median direction is different from the Riemannian centers of mass (i.e., Fréchet mean and Karcher mean) because the latter use the squared geodesic distance.
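A brute-force sketch of the minimization characterization follows. It evaluates $d_0$ only at the observations, under the assumption (not stated in the text) that this suffices: $d_0$ is continuous and piecewise linear in $\alpha$, and its slope increases only at observations, so the minimum is attained at one of them. The function names are hypothetical, and when the minimizer is not unique only one of the median directions is returned.

```python
import math

TWO_PI = 2.0 * math.pi

def geodesic_distance(theta, phi):
    """Length of the shortest arc between two angles."""
    d = (theta - phi) % TWO_PI
    return min(d, TWO_PI - d)

def mean_geodesic_distance(angles, alpha):
    """d_0(alpha): sample mean geodesic distance to the direction alpha."""
    return sum(geodesic_distance(t, alpha) for t in angles) / len(angles)

def median_direction(angles):
    """Return (a median direction in [0, 2*pi), circular mean deviation)."""
    best = min(angles, key=lambda a: mean_geodesic_distance(angles, a))
    return best % TWO_PI, mean_geodesic_distance(angles, best)

sample = [0.1, 0.2, 0.3, 6.2, 3.0]
print(median_direction(sample))
```

The search costs $O(n^2)$ distance evaluations, which is acceptable for small samples; a sorted sweep would be preferable for large ones.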
Circular mean difference $\bar{D}_0$ of a sample of circular data is the average geodesic distance between all pairs of points: $\bar{D}_0 = n^{-2} \sum_{i,j=1}^n d_g(\theta_i, \theta_j)$. Circular range $w$ is the length of the shortest arc containing all observations. Let $(\theta_{(1)}, \ldots, \theta_{(n)})$ be the linear order statistics of the sample, with all angles represented in $[0, 2\pi)$. The arc lengths between adjacent observations can be computed as $T_i = \theta_{(i+1)} - \theta_{(i)}$ for $i = 1, \ldots, n-1$, and $T_n = 2 \pi + \theta_{(1)} - \theta_{(n)}$. The circular range can be computed as $w = 2 \pi - \max(T_i)_{i=1}^n$.
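Both quantities can be sketched in a few lines. The function names are illustrative; the angles are reduced to $[0, 2\pi)$ before sorting so that the linear order statistics are meaningful.

```python
import math

TWO_PI = 2.0 * math.pi

def geodesic_distance(theta, phi):
    """Length of the shortest arc between two angles."""
    d = (theta - phi) % TWO_PI
    return min(d, TWO_PI - d)

def circular_mean_difference(angles):
    """Average geodesic distance over all n^2 ordered pairs (including i = j)."""
    n = len(angles)
    return sum(geodesic_distance(a, b) for a in angles for b in angles) / n ** 2

def circular_range(angles):
    """Length of the shortest arc containing all observations."""
    a = sorted(t % TWO_PI for t in angles)           # linear order statistics
    gaps = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    gaps.append(TWO_PI + a[0] - a[-1])               # gap that wraps past 2*pi
    return TWO_PI - max(gaps)                        # complement of the largest gap

sample = [6.2, 0.1, 0.3]
print(circular_mean_difference(sample), circular_range(sample))
```

For this sample the largest gap runs from 0.3 to 6.2, so the shortest covering arc is the one from 6.2 through the reference direction to 0.3.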
First trigonometric moment $m'_1$ about the zero direction of a sample of circular data is the complex number combining the sample cosine and sine moments: $m'_1 := \bar{C} + i \bar{S}$. It is the complex number representation of the mean resultant vector $\bar{\mathbf{r}}$, and we have $m'_1 = \bar{R} e^{i \bar{\theta}}$. First central trigonometric moment $m_1$ of a sample of circular data is the first trigonometric moment of the sample centered at its mean direction: $m_1(\boldsymbol{\theta}) = m'_1(\boldsymbol{\theta} - \bar{\theta})$. It equals the mean resultant length: $m_1 = \bar{R}$.
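Using the complex representation, the first trigonometric moment is the sample mean of $e^{i \theta_j}$. The sketch below relies on Python's standard `cmath` module; the function name `first_moment` is a hypothetical choice.

```python
import cmath

def first_moment(angles):
    """First trigonometric moment about the zero direction: C_bar + i * S_bar."""
    return sum(cmath.exp(1j * t) for t in angles) / len(angles)

sample = [0.2, 1.0, 1.7, 5.9]
m1_prime = first_moment(sample)
R_bar = abs(m1_prime)               # mean resultant length
theta_bar = cmath.phase(m1_prime)   # mean direction, in (-pi, pi]

# First central moment: recompute after centering at the mean direction.
m1 = first_moment([t - theta_bar for t in sample])
print(m1_prime, R_bar, theta_bar)
print(m1)  # real part agrees with R_bar; imaginary part vanishes up to rounding
```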
In general, for any positive integer $p$, define the following quantities: the $p$-th sample cosine moment $\bar{C}_p = n^{-1} \sum_{i=1}^n \cos(p \theta_i)$, the $p$-th sample sine moment $\bar{S}_p = n^{-1} \sum_{i=1}^n \sin(p \theta_i)$, the $p$-th mean resultant length $\bar{R}_p := \bar{R}(p \boldsymbol{\theta})$, and the $p$-th mean direction $\bar{\theta}_p := \bar{\theta}(p \boldsymbol{\theta})$. They are well defined because the point represented by $p \theta$ does not depend on the choice of angle representation when $p$ is an integer: $p (\theta + 2 \pi) = p \theta + 2 p \pi$, which represents the same point as $p \theta$. The $p$-th trigonometric moment $m'_p$ of a sample of circular data is the complex number combining the $p$-th sample cosine and sine moments: $m'_p := \bar{C}_p + i \bar{S}_p$. It is the complex number representation of the mean resultant vector of the $p$-fold angles: $m'_p = \bar{R}_p e^{i \bar{\theta}_p}$. The $p$-th central trigonometric moment $m_p$ of a sample of circular data is the $p$-th trigonometric moment of the sample centered at its mean direction: $m_p(\boldsymbol{\theta}) = m'_p(\boldsymbol{\theta} - \bar{\theta})$. Trigonometric moments can be used to define measures of dispersion, skewness, and kurtosis. Sample circular dispersion is defined as $\hat{\delta} = (1 - \bar{R}_2) / (2 \bar{R}^2)$.
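The higher-order moments and the sample circular dispersion can be sketched as follows. The function names are hypothetical, and the dispersion is computed exactly as in the formula above, from $\bar{R}_2 = |m'_2|$ and $\bar{R} = |m'_1|$.

```python
import cmath

def trig_moment(angles, p=1):
    """p-th trigonometric moment about the zero direction: C_bar_p + i * S_bar_p."""
    return sum(cmath.exp(1j * p * t) for t in angles) / len(angles)

def central_trig_moment(angles, p=1):
    """p-th trigonometric moment of the sample centered at its mean direction."""
    theta_bar = cmath.phase(trig_moment(angles, 1))
    return trig_moment([t - theta_bar for t in angles], p)

def circular_dispersion(angles):
    """Sample circular dispersion (1 - R_bar_2) / (2 * R_bar^2)."""
    R_bar = abs(trig_moment(angles, 1))
    R_bar_2 = abs(trig_moment(angles, 2))
    return (1.0 - R_bar_2) / (2.0 * R_bar ** 2)

sample = [0.1, 0.4, 0.9, 6.0]
print(trig_moment(sample, 2), central_trig_moment(sample, 2))
print(circular_dispersion(sample))
```

Centering multiplies $m'_p$ by $e^{-i p \bar{\theta}}$, so the central and uncentered moments of the same order have the same modulus $\bar{R}_p$ and differ only in phase.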
Fisher-Lee circular correlation coefficient $\rho_T$. It is proportional to $R^2(\Theta - \Phi) - R^2(\Theta + \Phi)$, where $R(\cdot)$ denotes the resultant length of the sample of angle differences or sums.
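The proportionality can be checked numerically. The text does not give a formula for $\rho_T$, so the sketch below assumes the pairwise sine-product form commonly attributed to Fisher and Lee, $\sum_{i<j} \sin(\theta_i - \theta_j) \sin(\phi_i - \phi_j)$ divided by $\sqrt{\sum_{i<j} \sin^2(\theta_i - \theta_j) \sum_{i<j} \sin^2(\phi_i - \phi_j)}$. That form, the function names, and the sample values are assumptions for illustration. Under this assumption, expanding the sines shows that the numerator equals one quarter of $R^2(\Theta - \Phi) - R^2(\Theta + \Phi)$.

```python
import math

def resultant_length_sq(angles):
    """Squared resultant length C^2 + S^2 of a sample of angles."""
    C = sum(math.cos(a) for a in angles)
    S = sum(math.sin(a) for a in angles)
    return C * C + S * S

def pairwise_sine_sum(thetas, phis):
    """Sum over pairs i < j of sin(theta_i - theta_j) * sin(phi_i - phi_j)."""
    n = len(thetas)
    return sum(
        math.sin(thetas[i] - thetas[j]) * math.sin(phis[i] - phis[j])
        for i in range(n)
        for j in range(i + 1, n)
    )

def fisher_lee_rho(thetas, phis):
    """Assumed pairwise form of the coefficient (see the paragraph above)."""
    num = pairwise_sine_sum(thetas, phis)
    den = math.sqrt(pairwise_sine_sum(thetas, thetas) * pairwise_sine_sum(phis, phis))
    return num / den

thetas = [0.1, 0.9, 2.0, 4.5]
phis = [0.3, 1.4, 2.2, 5.0]
diff = [t - p for t, p in zip(thetas, phis)]
summ = [t + p for t, p in zip(thetas, phis)]
print(resultant_length_sq(diff) - resultant_length_sq(summ))
print(4.0 * pairwise_sine_sum(thetas, phis))  # agrees with the line above
print(fisher_lee_rho(thetas, phis))
```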
Fisher-Lee circular rank correlation coefficient $\hat{\Pi}_n$.
Mardia circular rank correlation coefficient.
Jammalamadaka-Sarma circular rank correlation coefficient.