The original probability space is not directly amenable to analysis. If we map the sample space to a mathematical structure suited for classical deterministic analysis, an analytic approach to probability is naturally established. This is the motivation for random variables.
When extending a deterministic variable to a stochastic one, the first-order description of the added uncertainty is the variance, not the expectation: the expectation merely carries over the deterministic value, while the variance is the lowest-order quantity that captures randomness.
A random variable is a measurable function from a probability space to a measurable space, most commonly one built on the real line.
Symbolically, a random variable is a \( (\Sigma, \Sigma_F) \)-measurable function \( X: \Omega \to F \), where \( (\Omega, \Sigma, P) \) is a probability space and \( (F, \Sigma_F) \) is a measurable space.
In the most common case, a random variable is a real-valued function, i.e. \( X: (\Omega, \Sigma) \to (\mathbb{R}, \mathcal{B}) \). The Lebesgue measure space \( (\mathbb{R}, \mathcal{L}, \mathbf{m}) \) might also be used.
The inverse images of all measurable sets in the range form a sigma-algebra on the sample space, called the sigma-algebra generated by the random variable, denoted \( \Sigma_X \).
Note:
A measurable function is a function such that every measurable set in the range has a measurable preimage in the domain.
Symbolically, a function \( f: X \to Y \) is \( (\Sigma_X, \Sigma_Y) \)-measurable if \( \forall E \in \Sigma_Y, f^{-1}(E) \in \Sigma_X \), where \( \Sigma_X, \Sigma_Y \) are sigma-algebras on the sets \( X \) and \( Y \).
As a measurable function on the probability space \( (\Omega, \Sigma, P) \), a random variable \( \mathbf{X} \) induces a probability measure \( \mu \) on \( (\mathbb{R}, \mathcal{B}) \), called the distribution of \( \mathbf{X} \). Symbolically, \( \mu(B) = P(X^{-1}(B)), \forall B \in \mathcal{B} \).
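To make the pushforward concrete, here is a minimal Python sketch; the two-coin sample space and the head-count variable are illustrative choices, not from the text. It builds a finite probability space, defines \( X \) as a plain function on \( \Omega \), and computes \( \mu(B) = P(X^{-1}(B)) \) by pulling sets back through \( X \).

```python
from fractions import Fraction

# A finite probability space (Omega, 2^Omega, P): two fair coin tosses.
Omega = ["HH", "HT", "TH", "TT"]
P = {w: Fraction(1, 4) for w in Omega}      # uniform probability measure

def X(w):
    """A random variable: the number of heads."""
    return w.count("H")

def mu(B):
    """Distribution of X: the probability of the preimage X^{-1}(B)."""
    preimage = [w for w in Omega if X(w) in B]
    return sum(P[w] for w in preimage)

print(mu({0}), mu({1}), mu({2}))            # 1/4 1/2 1/4
print(mu({0, 1, 2}))                        # 1 (mu is a probability measure)
```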
The Lebesgue decomposition of a distribution is [refer to Radon–Nikodym Theorem]: \[ \mu(A) = \int_{A} f(x) \,\mathrm{d}\lambda + \mu^s (A) \] where \( \lambda \) is Lebesgue measure, \( f \) is a density, and \( \mu^s \perp \lambda \) is the singular part.
When \( \mu^s = 0 \), the distribution \( \mu \) is said to be absolutely continuous (with respect to Lebesgue measure).
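A small numerical illustration of the decomposition, assuming a mixture distribution chosen for this purpose (half an atom at 0, half a uniform density on \( (0,1] \)): the atom is the singular part \( \mu^s \), and no density could reproduce it.

```python
import numpy as np

# Mixture: mass 1/2 on the atom {0} (singular part, here purely atomic)
# plus mass 1/2 spread with density f(x) = 1/2 on (0, 1] (a.c. part).
rng = np.random.default_rng(0)
n = 100_000
atom = rng.random(n) < 0.5                  # choose the singular component
samples = np.where(atom, 0.0, rng.random(n))

# mu({0}) > 0 exposes the singular part; a density assigns 0 to any point.
print("mu({0}) ~", np.mean(samples == 0.0))                            # ~0.5
# On (0,1] the measure has density 1/2: mu((a,b]) ~ (b-a)/2.
print("mu((0.2,0.6]) ~", np.mean((samples > 0.2) & (samples <= 0.6)))  # ~0.2
```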
The distribution of a random variable can be conveniently characterized by the cumulative distribution function \( F_{\mathbf{X}}(x) = \mu((-\infty, x]) \).
The cumulative distribution function always exists, and it determines the distribution of the random variable uniquely when the range sigma-algebra is the Borel sigma-algebra.
The probability density function is the derivative of the cumulative distribution function, when the derivative exists: \( f_{\mathbf{X}}(x) = \frac{\mathrm{d}}{\mathrm{d}x} F_{\mathbf{X}}(x) \)
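A quick numeric check of this relationship, using the standard normal from scipy.stats as an assumed example: differentiating the CDF by finite differences recovers the PDF up to discretization error.

```python
import numpy as np
from scipy import stats

# PDF as the derivative of the CDF, checked for a standard normal.
x = np.linspace(-3, 3, 601)
F = stats.norm.cdf(x)                       # cumulative distribution function
f_numeric = np.gradient(F, x)               # finite-difference dF/dx
f_exact = stats.norm.pdf(x)                 # closed-form density

print(np.max(np.abs(f_numeric - f_exact)))  # small discretization error
```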
The expectation of a random variable is its Lebesgue integral with respect to the probability measure on the sample space. Symbolically, \[ \mathbb{E}X = \int_{\Omega} X \,\mathrm{d}P \]
We choose the Lebesgue integral for two reasons: the sample space \( \Omega \) is an abstract set on which the Riemann integral is not defined, and the change of variables theorem shifts the integration from the probability space to the induced measure space on the real line:
\[ \int_{\Omega} X \,\mathrm{d}P = \int_{\mathbb{R}} x \,\mathrm{d}\mu \]
The right-hand side can in turn be evaluated as a Riemann–Stieltjes integral \( \int_{\mathbb{R}} x \,\mathrm{d}F_{\mathbf{X}}(x) \).
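A sketch of the two sides of this identity, with an exponential variable as the assumed example: the \( \Omega \)-side integral is approximated by a Monte Carlo average (the law of large numbers), the \( \mathbb{R} \)-side by quadrature against the density.

```python
import numpy as np
from scipy import integrate, stats

rng = np.random.default_rng(0)
samples = stats.expon.rvs(size=200_000, random_state=rng)

lhs = samples.mean()                        # ~ int_Omega X dP (sample average)
rhs, _ = integrate.quad(lambda x: x * stats.expon.pdf(x), 0, np.inf)

print(lhs, rhs)                             # both ~ 1 for the exponential(1)
```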
Definition: The characteristic function of a random variable \( X \) with distribution \( \mu \) is
\[ \hat{\mu}(t) = \mathbb{E}[e^{itX}] = \int_{\mathbb{R}} e^{itx} \mathrm{d} \mu \]
When the distribution has a density, the characteristic function can be thought of as the Fourier transform of the PDF (up to the sign convention of the transform).
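As a sanity check of this viewpoint, the sketch below computes \( \hat{\mu}(t) = \int e^{itx} f(x) \,\mathrm{d}x \) by quadrature for the standard normal (an assumed example) and compares it with the known closed form \( e^{-t^2/2} \).

```python
import numpy as np
from scipy import integrate, stats

def char_fn(t):
    """Characteristic function of N(0,1) via quadrature against the PDF."""
    re, _ = integrate.quad(lambda x: np.cos(t * x) * stats.norm.pdf(x),
                           -np.inf, np.inf)
    im, _ = integrate.quad(lambda x: np.sin(t * x) * stats.norm.pdf(x),
                           -np.inf, np.inf)
    return re + 1j * im

for t in (0.0, 0.5, 1.0, 2.0):
    print(t, char_fn(t), np.exp(-t**2 / 2))   # numerically identical
```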
Definition: The characteristic function of a random vector \( \mathbf{X} \) is
\[ \Phi_{\mathbf{X}}(\mathbf{w}) = \mathbb{E}[e^{i \mathbf{w}^T \mathbf{X}}] = \mathcal{F} f_{\mathbf{X}}(\mathbf{x}) \]
Properties of the characteristic function:
- It always exists, since \( |e^{itX}| = 1 \) is bounded and measurable.
- \( \hat{\mu}(0) = 1 \) and \( |\hat{\mu}(t)| \le 1 \) for all \( t \).
- \( \hat{\mu}(-t) = \overline{\hat{\mu}(t)} \).
- It is uniformly continuous on \( \mathbb{R} \).
- It uniquely determines the distribution (Lévy's uniqueness theorem).
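These properties can be spot-checked numerically; the sketch below uses an empirical characteristic function of a lognormal sample (an arbitrary assumed test distribution).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.lognormal(size=100_000)             # an arbitrary skewed sample

def phi(t):
    """Empirical characteristic function: sample mean of e^{itX}."""
    return np.mean(np.exp(1j * t * X))

print(phi(0.0))                             # exactly 1
print(all(abs(phi(t)) <= 1 + 1e-12
          for t in np.linspace(-5, 5, 21))) # |phi(t)| <= 1: True
print(np.allclose(phi(-1.3), np.conj(phi(1.3))))  # conjugate symmetry: True
```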
Table: Standard Form of Dominant Moments
| Name | Definition | Interpretation | Dimension | Range* |
|---|---|---|---|---|
| mean | first raw moment | central tendency | same as the variable | \( (-\infty, \infty) \) |
| standard deviation | square root of the second central moment | variation | same as the variable | \( [0, \infty) \) |
| skewness | normalized third central moment | lopsidedness | dimensionless | \( (-\infty, \infty) \) |
| excess kurtosis | normalized fourth central moment minus 3, its value for the normal distribution | (for symmetric distributions) probability concentration in the center and tails relative to the standard deviation | dimensionless | \( [-2, \infty) \) |

*If it exists.
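The table's quantities map directly onto standard library calls; below is a sketch using scipy.stats on an exponential sample (an assumed example; `stats.kurtosis` uses the Fisher convention, i.e. excess kurtosis).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(size=200_000)           # exponential(1) test sample

print("mean              ", np.mean(x))         # ~1
print("standard deviation", np.std(x))          # ~1
print("skewness          ", stats.skew(x))      # ~2 (dimensionless)
print("excess kurtosis   ", stats.kurtosis(x))  # ~6 (normal -> 0)
```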