Manjeet Dahiya

Expectation and Various Summaries of a Distribution

The distribution of a random variable is quite detailed: it contains all the probabilistic information about the random variable. Analogous to summaries of sample data sets, such as the mean and variance, we can define summaries of probability distributions. This post presents various summaries or properties of probability distributions.

Expectation

Expectation of a random variable $X$, also called the mean or the expected value, is defined as:

Discrete: $ E(X) = \sum \limits_{x \in \text{domain}(X)} x f(x) $

Continuous: $ E(X) = \int_{-\infty}^{\infty} x f(x) \, dx $

Where $f(x)$ is the pmf or pdf of the random variable $X$.

Note that expectation is a property of the distribution. Two different random variables with the same distribution will have the same expectation.
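As a concrete illustration, here is a minimal Python sketch that computes the expectation of a discrete random variable directly from its pmf. The fair-die pmf and the helper function `expectation` are assumed examples for illustration, not part of the definitions above.

```python
# A minimal sketch: expectation of a discrete random variable, E(X) = sum_x x * f(x).
def expectation(pmf):
    """E(X) for a discrete pmf given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Assumed example: a fair six-sided die.
die = {x: 1/6 for x in range(1, 7)}
print(expectation(die))  # 3.5
```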

Theorem

Expectation is linear: for constants $a$ and $b$, $E(aX + b) = aE(X) + b$, and for any two random variables, $E(X + Y) = E(X) + E(Y)$, even when $X$ and $Y$ are dependent.

Variance

Variance of a random variable describes the spread of the distribution, that is, how spread out the distribution is about its mean. It is defined as:

$ Var(X) = E[(X - \mu)^2] $

Where $\mu$ is $E(X)$.

Variance is usually denoted by $\sigma^2$, that is, $\sigma^2 = Var(X)$.

Standard deviation ($\sigma$) is defined as the positive square root of the variance. Since it has the same units as the random variable, it can be compared directly with the values of the random variable.
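Continuing the same sketch (with the same assumed fair-die pmf and a hypothetical helper `variance`), the variance can be computed directly from the definition $E[(X - \mu)^2]$:

```python
# A minimal sketch: variance as E[(X - mu)^2] for a discrete pmf {value: probability}.
def variance(pmf):
    mu = sum(x * p for x, p in pmf.items())                 # mu = E(X)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())   # E[(X - mu)^2]

# Assumed example: a fair six-sided die.
die = {x: 1/6 for x in range(1, 7)}
print(variance(die))          # 2.916... (= 35/12)
print(variance(die) ** 0.5)   # standard deviation sigma, ~1.708
```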

Theorem

$ Var(X) = E(X^2) - [E(X)]^2 $

This form is often more convenient for computing variance. It also follows that $Var(aX + b) = a^2 Var(X)$ for constants $a$ and $b$.

Covariance

When dealing with a joint distribution of two random variables, covariance describes how much these random variables are linearly dependent on each other.

Given two random variables $X$ and $Y$, the covariance of $X$ and $Y$ is defined as:

\[Cov(X, Y) = E[(X -\mu_x)(Y -\mu_y)]\]

Where $\mu_x = E(X)$ and $\mu_y = E(Y)$.

Covariance is the expectation of the product of the deviations of the two random variables from their respective means. If both random variables tend to be simultaneously greater or simultaneously smaller than their respective means, the covariance is positive. If one tends to be above its mean while the other is below, the covariance is negative.

In other words, covariance measures the extent to which the two random variables simultaneously take bigger or smaller values.
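As a rough illustration, the following Python sketch computes covariance from a joint pmf; the joint distribution and the helper `covariance` are assumed toy examples:

```python
# A minimal sketch: covariance from a joint pmf given as {(x, y): probability}.
def covariance(joint_pmf):
    mu_x = sum(x * p for (x, y), p in joint_pmf.items())   # E(X)
    mu_y = sum(y * p for (x, y), p in joint_pmf.items())   # E(Y)
    return sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint_pmf.items())

# Assumed toy joint distribution where X and Y tend to move together.
joint = {(0, 0): 0.4, (1, 1): 0.4, (0, 1): 0.1, (1, 0): 0.1}
print(covariance(joint))  # 0.15 > 0: X and Y are usually on the same side of their means
```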

Theorem

$ Cov(X, Y) = E(XY) - E(X)E(Y) $

In particular, if $X$ and $Y$ are independent, then $E(XY) = E(X)E(Y)$ and hence $Cov(X, Y) = 0$. The converse is not true, as the example in the next section shows.

Correlation

Covariance between two random variables is affected by the scale of the random variables. As a result, it does not allow comparison of the degree of dependence between two different pairs of random variables. Correlation is a measure that solves this issue. It is defined as:

\[\rho (X, Y)= \frac{Cov(X, Y)}{\sigma_x \sigma_y}\]

Theorem

$ -1 \le \rho(X, Y) \le 1 $

Correlation is unitless and is not affected by the scale of the random variables, which makes it comparable across different pairs of random variables.

Example of dependent variables with zero correlation: let $X$ take the values $-1$, $0$, $1$ with equal probabilities. $X$ and $X^2$ are clearly dependent, yet $Cov(X, X^2) = 0$. This example shows that correlation/covariance does not capture nonlinear dependence.
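A small Python check of this example; the joint pmf of $(X, X^2)$ below is constructed by hand for illustration:

```python
# A minimal sketch verifying the example above: X takes -1, 0, 1 with equal
# probability, and Y = X^2. Build the joint pmf of (X, Y) and check Cov(X, Y) = 0.
joint = {(-1, 1): 1/3, (0, 0): 1/3, (1, 1): 1/3}

mu_x = sum(x * p for (x, y), p in joint.items())   # E(X)   = 0
mu_y = sum(y * p for (x, y), p in joint.items())   # E(X^2) = 2/3
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
print(cov)  # 0.0: X and X^2 are dependent, yet uncorrelated
```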

Covariance matrix

In the case of a random vector $\textbf{X}$ of dimension $d$, in other words a collection of $d$ random variables, we define the covariance matrix as:

\[CovMat(\textbf{X}) = \begin{bmatrix} \sigma_{11} & \dots & \sigma_{1d} \\ \vdots & \ddots & \vdots \\ \sigma_{d1} & \dots & \sigma_{dd} \end{bmatrix}\]

Where $\sigma_{ij} = Cov(X_i, X_j)$.

Another way to state it: $CovMat(\textbf{X}) = E[(\textbf{X} - \mu)(\textbf{X} - \mu)^T]$, where $\mu = E(\textbf{X})$ is the mean vector.
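As a practical illustration, the following sketch estimates a covariance matrix from samples using NumPy's `np.cov`; the two-dimensional synthetic data is an assumed example:

```python
import numpy as np

# A minimal sketch: estimate the covariance matrix of a 2-dimensional random
# vector from samples (assumed synthetic data; np.cov treats rows as variables).
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = 2.0 * x + rng.normal(size=10_000)   # Y depends linearly on X
samples = np.vstack([x, y])             # shape (d, n) = (2, 10000)

cov_mat = np.cov(samples)               # entry (i, j) is an estimate of Cov(X_i, X_j)
print(cov_mat)                          # roughly [[1, 2], [2, 5]]
```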

Properties

The covariance matrix is symmetric, since $Cov(X_i, X_j) = Cov(X_j, X_i)$, and positive semi-definite. Its diagonal entries are the variances of the individual components, that is, $\sigma_{ii} = Var(X_i)$.

© 2018-19 Manjeet Dahiya