Time Series Models

A time series is a sequence of measurements of a variable, approximately equally spaced in time. The term is typically used for economic and financial data.

Frequency of time series: yearly, quarterly (public company reports, national product), monthly (government statistics); weekly, daily (gas prices, stock indices); hourly (weather, tide height), per minute, per second, or higher frequencies (equity trading).

Characteristics of time series: trend (drift in the moving average); seasonality (detectable by spectral density estimation); autocorrelation (dependence on previous states, e.g. a random walk); stationarity (time-invariant distribution).

Uncertainty in time-series models is often presented as a 95% (or 80%) prediction interval band (not a confidence interval).

A hierarchical time series is a set of time series that form a hierarchy of partitions. A grouped time series is a hierarchical time series with independent partitions.

Stationarity

Backshift operator (lag): $B x_t = x_{t-1}$.

Serial correlation: the auto-correlation function (ACF) and the partial auto-correlation function (PACF) give the autocorrelation and partial autocorrelation coefficients at various time lags.
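As a minimal sketch of how autocorrelation coefficients at various lags can be computed, the following Python function uses the usual sample estimator (lagged cross-products of deviations from the mean, divided by the lag-0 sum of squares). The function name `acf` and the example series are illustrative assumptions, not part of any library.

```python
def acf(x, max_lag):
    """Sample autocorrelation coefficients r_0 .. r_max_lag of a sequence."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squared deviations
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

# An alternating series is strongly negatively correlated at lag 1
# and positively correlated at lag 2.
series = [1.0, -1.0] * 10
r = acf(series, 2)  # approximately [1.0, -0.95, 0.9]
```

The coefficients shrink slightly with the lag because fewer cross-products are available while the denominator stays fixed.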

Decomposition Models

Additive and multiplicative decomposition; the multiplicative form becomes additive under a log transformation, since $\log X = \log T + \log C + \log S + \log I$:

$$\begin{aligned} X &= T + C + S + I \\ X &= T \times C \times S \times I \end{aligned}$$

Components of time-series:

- Trend (T): non-periodic low-frequency evolution;
- Seasonal (S): periodic fluctuations, observable frequencies depend on time-series frequency;
    1. "Cycle": multi-year periods attributed to the "business cycle", often estimated together with trend as the "trend-cycle" component;
    2. Annual pattern: solar season (temperature, precipitation, length of day);
    3. Weekly pattern: day of week;
    4. Daily pattern: solar time, time zone (including daylight saving time);
- Irregular (I): residual after removing model components;
- Calendar effects: any effect related to the calendar and its changes;
    - Fixed holidays (national days, Thanksgiving, Christmas to New Year);
    - Moving holiday effects are caused by holidays whose dates vary from year to year, such as U.S. Monday holidays and holidays using non-Gregorian calendars (Easter, Rosh Hashanah, Eid al-Fitr, Chinese New Year, Diwali);
    - For monthly time series, trading day (TD) effects are caused by changes in the number of business days in each month;
- Outliers: abrupt, atypical movements in the time series, caused by extreme events:
    - severe weather (hurricane, storm in summer/fall; blizzard in winter);
    - social unrest (strike).
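The trading day effect listed above can be made concrete by counting weekdays per month. The following Python sketch uses only the standard library `calendar` module; the helper names `weekday_counts` and `business_days` are illustrative, and public holidays are ignored by assumption.

```python
import calendar

def weekday_counts(year, month):
    """Count how often each weekday (Mon=0 .. Sun=6) occurs in a month."""
    counts = [0] * 7
    _, n_days = calendar.monthrange(year, month)
    for day in range(1, n_days + 1):
        counts[calendar.weekday(year, month, day)] += 1
    return counts

def business_days(year, month):
    """Monday-to-Friday days in a month (holidays ignored, an assumption)."""
    return sum(weekday_counts(year, month)[:5])

# Two adjacent months can differ by several business days:
feb = business_days(2021, 2)  # 20
mar = business_days(2021, 3)  # 23
```

Counts like these could serve as regressors when estimating trading day effects in a monthly series.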

Causes of seasonal fluctuations in economic time series include calendar (business day, holiday), timing (school vacation, payday, tax/accounting period), weather, and social expectation of seasonal patterns [@Granger1979]. Different time series can have different causes of seasonality, and thus should be modeled differently.

Seasonal adjustment is the removal of seasonal components from a time series. This is done to reveal the low-frequency components (commonly known as the "trend-cycle") of economic time series, which are usually both statistically and economically important. It also removes one possible source of spurious relationships among multiple time series.

Classical decomposition (stats::decompose()) uses a moving average filter for the trend component and assumes the seasonal component is periodic; its trend estimate is not available at the end points, and it is not robust to outliers.

Moving average smoothing of order $2k+1$: $$\hat{T}_{t} = \frac{1}{2k+1} \sum_{j=-k}^k y_{t+j}$$ m-MA means a moving average of order $m$, and $n \times m$-MA means m-MA followed by n-MA. More sophisticated weights (kernel()) can be used for smoother results.
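The moving average definitions above can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation used by stats::decompose(): the helper names `moving_average` and `centered_ma` are hypothetical, and positions where the window does not fit are marked with `None`.

```python
def moving_average(y, m):
    """m-MA: average of m consecutive values; output is shorter by m - 1."""
    return [sum(y[i:i + m]) / m for i in range(len(y) - m + 1)]

def centered_ma(y, m):
    """Centered trend estimate aligned with y; None where no estimate exists.

    Odd m: plain m-MA. Even m: 2 x m-MA (an m-MA followed by a 2-MA),
    which re-centers the even-order average on an observation.
    """
    if m % 2 == 1:
        core = moving_average(y, m)
    else:
        core = moving_average(moving_average(y, m), 2)
    pad = m // 2
    return [None] * pad + core + [None] * pad

y = [1, 2, 3, 4, 5, 6, 7, 8]
trend3 = centered_ma(y, 3)  # [None, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, None]
trend4 = centered_ma(y, 4)  # [None, None, 3.0, 4.0, 5.0, 6.0, None, None]
```

The `None` entries at both ends show why the trend estimate is missing at the end points of the series.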

X-12-ARIMA and TRAMO-SEATS

X-13ARIMA-SEATS is a seasonal adjustment software combining X-12-ARIMA (developed by the United States Census Bureau) and TRAMO-SEATS (developed by the Bank of Spain). It is currently used by the US Census Bureau and by many other government agencies around the world.

X-12-ARIMA decomposition (seasonal::seas()) for quarterly and monthly time series refines the classical decomposition by iteration; its estimate is available at the end points and is relatively robust to outliers, and the seasonal component can vary slowly over time [@Ladiray2001]. SEATS decomposition likewise applies to quarterly and monthly time series [@Dagum2016].

The X-11 method decomposes a time-series into trend-cycle, seasonal, and irregular components by iteratively applying linear filters (moving averages).

TRAMO stands for Time series Regression with ARIMA noise, Missing values and Outliers; SEATS stands for Signal Extraction in ARIMA Time Series.

To protect seasonal effect estimates from distortion by outliers, generic outlier regressors can be used to estimate and temporarily remove the outliers.

Local Regression

Seasonal and trend decomposition using LOESS (STL), implemented in stats::stl(), can handle any type of seasonality, allows the seasonal component to vary over time, lets the user control the smoothness of the trend-cycle component, and can be made robust to outliers [@Cleveland1990].

Multiple seasonality time-series

Dynamic harmonic regression handles multiple seasonal periods by using Fourier terms for each period as regressors, with ARMA errors.
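As a minimal sketch of the regressors a harmonic regression with several seasonal periods might use, the following Python code builds sine and cosine pairs for each period. The function name `fourier_terms`, the number of harmonics `K`, and the hourly example with periods 24 and 168 are illustrative assumptions; fitting the regression and the error model is not shown.

```python
import math

def fourier_terms(n, period, K):
    """Rows of [sin(2*pi*k*t/period), cos(2*pi*k*t/period)] for k = 1..K."""
    rows = []
    for t in range(1, n + 1):
        row = []
        for k in range(1, K + 1):
            angle = 2 * math.pi * k * t / period
            row += [math.sin(angle), math.cos(angle)]
        rows.append(row)
    return rows

# Two weeks of hourly data with daily (24) and weekly (168) seasonality:
daily = fourier_terms(336, 24, 2)    # 4 columns
weekly = fourier_terms(336, 168, 3)  # 6 columns
X = [d + w for d, w in zip(daily, weekly)]  # 336 rows, 10 columns
```

A larger `K` lets a seasonal pattern take a more complex shape, at the cost of more regression coefficients.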

TBATS models (Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonal components) [@DeLivera2011].

Exponential Smoothing

Exponential smoothing methods forecast with weighted averages of past observations, where the weights decay exponentially with observation age; see forecast::ets() [@Hyndman2008].
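The simplest member of this family, simple exponential smoothing, can be sketched in a few lines of Python. The function name `simple_exp_smoothing`, the fixed smoothing parameter `alpha`, and initializing the level with the first observation are assumptions of this sketch; it is not how forecast::ets() is implemented.

```python
def simple_exp_smoothing(y, alpha):
    """Return the smoothed level after each observation.

    The last level serves as the one-step-ahead forecast. Unrolling the
    recursion gives weights alpha * (1 - alpha)**j on the observation
    j steps back, i.e. weights that decay exponentially with age.
    """
    level = y[0]  # initial level = first observation (an assumption)
    levels = [level]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
        levels.append(level)
    return levels

levels = simple_exp_smoothing([10, 20, 30], alpha=0.5)  # [10, 15.0, 22.5]
```

With `alpha` close to 1 the forecast tracks the latest observation; with `alpha` close to 0 it changes slowly.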

Autoregressive and Moving Average Models

Auto-regressive model $\text{AR}(p)$ of order $p$ is a regression model of a single time-series against its previous $p$ values: $$Y_t = \sum_{i = 1}^p \phi_i Y_{t-i} + c + e_t$$ Vector auto-regressive models (VAR) are auto-regressive models of multiple time-series, implemented in R package vars.
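To illustrate the AR recursion, the following Python sketch simulates an $\text{AR}(p)$ series from given coefficients. The function name `simulate_ar`, the Gaussian errors, and the zero start-up values are assumptions of this example.

```python
import random

def simulate_ar(phi, c, n, sigma=1.0, seed=0):
    """Simulate Y_t = c + sum_i phi[i-1] * Y_{t-i} + e_t, e_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    p = len(phi)
    y = [0.0] * p  # zero start-up values (an assumption)
    for _ in range(n):
        past = y[-1:-p - 1:-1]  # Y_{t-1}, ..., Y_{t-p}
        e = rng.gauss(0.0, sigma)
        y.append(c + sum(f * v for f, v in zip(phi, past)) + e)
    return y[p:]  # drop the start-up values

# Without noise, a stationary AR(1) converges towards c / (1 - phi) = 2:
path = simulate_ar([0.5], c=1.0, n=3, sigma=0.0)  # [1.0, 1.5, 1.75]
```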

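Auto-regressive conditional heteroskedasticity (ARCH) models, and their generalization (generalized ARCH, GARCH), describe an error variance that changes over time as a function of past errors (and, for GARCH, past variances).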

Moving average model $\text{MA}(q)$ of order $q$ is a regression-like model on past forecast errors: $$Y_t = c + e_t + \sum_{i = 1}^q \theta_i e_{t-i}$$

Auto-regressive integrated moving average model $\text{ARIMA}(p, d, q)$ is a combined auto-regressive and moving average model of differenced time series $Y'_t = (1-B)^d Y_t$: $$Y'_t = \sum_{i = 1}^p \phi_i Y'_{t-i} + c + e_t + \sum_{i = 1}^q \theta_i e_{t-i}$$
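The differencing step $Y'_t = (1-B)^d Y_t$ amounts to taking first differences $d$ times. A minimal Python sketch follows; the function name `difference` is illustrative.

```python
def difference(y, d=1):
    """Apply (1 - B)^d: take first differences of the series d times."""
    for _ in range(d):
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return y

squares = [0, 1, 4, 9, 16, 25]
d1 = difference(squares, 1)  # [1, 3, 5, 7, 9]
d2 = difference(squares, 2)  # [2, 2, 2, 2]
```

Each round of differencing shortens the series by one observation; here two rounds reduce a quadratic trend to a constant.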

ARIMA models (the Box-Jenkins approach) can be fitted with stats::arima(), or with automatic order selection using forecast::auto.arima() [@Hyndman2008].

Seasonal ARIMA model $\text{ARIMA}(p, d, q)(P, D, Q)_m$ is an ARIMA model with additional seasonal terms, where $m$ is the number of observations in a year.

Regression+ARIMA models use linear regression to estimate moving holiday, trading day and outlier effects, and then use a seasonal ARIMA model to estimate trend, cycle and seasonal components from the regression residuals.

Empirical mode decomposition

Empirical mode decomposition (EMD) is an iterative procedure to decompose a time series into a finite number of decreasingly oscillatory "intrinsic mode functions" (IMFs) and a monotonic trend:

1. identify all the local extrema in the data;
2. connect all the local maxima with a cubic spline as the upper envelope;
3. repeat the procedure for the local minima to produce the lower envelope;
4. subtract the mean of the upper and lower envelopes from the data;
5. repeat this process (a sifting step) until the output converges (several stoppage criteria exist);
6. the converged output is the first IMF; subtract it from the data and repeat the whole procedure until the residual is monotonic.

EMD overview by Max Lambert

Hilbert–Huang transform (HHT) applies Hilbert spectral analysis to IMFs to obtain instantaneous frequency data. Unlike the Fourier and wavelet transforms which use generic bases, HHT uses empirical modes and is applicable to nonstationary and nonlinear time series.

Neural Network

A generic Bayesian neural network (BNN) model with long short-term memory (LSTM) encoder-decoder layers, trained on large numbers of time series, provides better forecast accuracy at extreme events, as deployed at Uber [@Laptev2017; @Zhu2017].