The **Multivariate Normal (MVN)** is a Joint distribution that generalizes the Normal distribution to several Random variables at once. It lets us conclude that any linear combination of those Random variables is itself Normal, provided they are jointly MVN.

## Random vector

A *random vector* is just an ordered list of random variables. The order matters: for example, $(X,Y)$ is different from $(Y,X)$. A vector $X=(X_{1},X_{2},…,X_{k})$ is called a $k$-dimensional random vector.

## Definition

A random vector $X=(X_{1},X_{2},…,X_{k})$ has the Multivariate Normal joint distribution if every linear combination of the $X_{j}$'s has a Normal distribution.

In other words, $X$ is MVN if, for every choice of constants $t_{1},t_{2},…,t_{k}$, the sum $t_{1}X_{1}+⋯+t_{k}X_{k}$ has a Normal distribution.

We also allow $t_{1}X_{1}+t_{2}X_{2}+⋯+t_{k}X_{k}$ to be a constant, which we treat as a degenerate Normal with variance 0 (for example, when every $t_{j}=0$). However, the linear combination must equal that constant with probability 1; otherwise it is not Normal, and $X$ is not MVN.
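As a rough numerical illustration of the definition, the sketch below draws samples from a Bivariate Normal and checks that one particular linear combination behaves like a Normal. The mean vector, covariance matrix, and constants `t` are hypothetical values chosen only for this example, and the check relies on the standard facts that the combination has mean $t_{1}E(X_{1})+t_{2}E(X_{2})$ and variance $t^{\top}\Sigma t$. A simulation like this is only a sanity check, not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical Bivariate Normal parameters (illustration only).
mean = np.array([0.0, 3.0])
cov = np.array([[1.0, 0.8],
                [0.8, 2.0]])
xy = rng.multivariate_normal(mean, cov, size=200_000)

# An arbitrary choice of constants t_1, t_2.
t = np.array([2.0, -1.5])
combo = xy @ t  # t_1 * X_1 + t_2 * X_2 for every sample

# If combo is Normal, its mean is t . mean and its variance is t^T cov t.
mu = t @ mean
sigma = np.sqrt(t @ cov @ t)
z = (combo - mu) / sigma  # standardized linear combination

# For a standard Normal these fractions are about 0.6827 and 0.9545.
within_1sd = np.mean(np.abs(z) < 1)
within_2sd = np.mean(np.abs(z) < 2)
print(within_1sd, within_2sd)
```

Changing `t` to any other constants should give fractions close to the same two values, which is what the definition predicts.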

When the vector is 2-dimensional, the MVN distribution is also called the *Bivariate Normal*.

### Parameters

The parameters of a MVN distribution are:

- The means of $X_{1},X_{2},…,X_{k}$
- The Variances of $X_{1},X_{2},…,X_{k}$
- The covariance or Correlation between *each* pair of random variables among $X_{1},X_{2},…,X_{k}$

The variance and covariance parameters are often listed in a $k×k$ matrix called the *covariance matrix* or *variance-covariance matrix*, whose rows and columns are indexed by the r.v.s $X_{1},X_{2},…,X_{k}$: the entry in row $i$ and column $j$ is $Cov(X_{i},X_{j})$.

- Since $Cov(X_{j},X_{j})=Var(X_{j})$, the diagonal entries are the variances.
- Since $Cov(X_{i},X_{j})=Cov(X_{j},X_{i})$, the matrix is symmetric: the bottom-left triangle mirrors the upper-right triangle.
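The structure of the covariance matrix can be sketched in NumPy. The 3-dimensional mean vector and covariance matrix below are made-up values for illustration; the code reads the variances off the diagonal, checks symmetry, converts covariances to correlations, and compares the parameters with estimates from simulated samples.

```python
import numpy as np

# Hypothetical parameters for a 3-dimensional MVN (illustration only).
mean = np.array([1.0, -2.0, 0.5])
cov = np.array([
    [4.0, 1.2, 0.0],
    [1.2, 1.0, -0.3],
    [0.0, -0.3, 2.0],
])

# The diagonal holds the variances; the matrix equals its transpose.
variances = np.diag(cov)
is_symmetric = np.allclose(cov, cov.T)

# Correlation between X_i and X_j: Cov(X_i, X_j) / (SD(X_i) * SD(X_j)).
sd = np.sqrt(variances)
corr = cov / np.outer(sd, sd)

# Draw samples and estimate the parameters back from the data.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mean, cov, size=200_000)
sample_mean = samples.mean(axis=0)
sample_cov = np.cov(samples, rowvar=False)  # columns are the variables

print(variances, is_symmetric)
print(np.round(sample_cov, 2))
```

With this many samples, `sample_mean` and `sample_cov` should land close to `mean` and `cov`, though never exactly on them.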

## Properties

If $(X_{1},X_{2},…,X_{k})$ is Multivariate Normal, then each individual r.v. $X_{j}$ is also Normal. We can show this by writing $X_{1}$ as a linear combination of all the random variables, with coefficient 1 on $X_{1}$ and 0 on the others:

$X_{1}=1⋅X_{1}+0⋅X_{2}+0⋅X_{3}+⋯+0⋅X_{k}$, and therefore by the definition of MVN, $X_{1}$ is Normal. The same argument works for any $X_{j}$.

However, *the converse is not true*: knowing that $X_{1}$, $X_{2}$, etc. are Normal does not imply that $(X_{1},X_{2},…)$ is Multivariate Normal.
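One standard counterexample can be simulated directly. Take $X$ standard Normal and let $Y=SX$, where $S$ is a random sign ($\pm 1$ with probability 1/2 each) independent of $X$. By symmetry $Y$ is also standard Normal, yet $X+Y$ equals 0 about half the time, which no Normal distribution (not even a constant one) can do. The variable names below are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

x = rng.standard_normal(n)
s = rng.choice([-1.0, 1.0], size=n)  # random sign, independent of x
y = s * x  # marginally N(0, 1) by the symmetry of the Normal

# The linear combination X + Y is 0 whenever s == -1, and 2x otherwise.
total = x + y
frac_zero = np.mean(total == 0.0)

print(y.mean(), y.var())  # close to 0 and 1, as for a standard Normal
print(frac_zero)          # close to 0.5, so X + Y cannot be Normal
```

Since one linear combination fails to be Normal, $(X,Y)$ is not Bivariate Normal even though both marginals are Normal.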

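If $X_{1},X_{2},…,X_{k}$ are Independent Normal random variables, then $(X_{1},X_{2},…,X_{k})$ is Multivariate Normal. This follows from the fact that a sum of independent Normals is Normal: each term $t_{j}X_{j}$ is Normal (or the constant 0), and the terms are independent, so every linear combination $t_{1}X_{1}+⋯+t_{k}X_{k}$ is Normal.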

Suppose that $(X_{1},X_{2},…)$ is MVN. We can show that a set of variables $Y_{1},Y_{2},…$ is also MVN if, for every choice of constants $t_{1},t_{2},…$, we can rewrite $t_{1}Y_{1}+t_{2}Y_{2}+⋯$ as a linear combination of $X_{1},X_{2},…$.

- If every linear combination of $Y_{1},Y_{2},…$ can be expressed equivalently as a linear combination of $X_{1},X_{2},…$, then it is Normal, because every linear combination of the $X_{j}$'s is Normal.
- Every linear combination of the $Y_{j}$'s being Normal is exactly the definition of MVN, so $(Y_{1},Y_{2},…)$ must also be MVN.
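As a small worked example of this argument (the choice of $Y_{1}$ and $Y_{2}$ is just an illustration), suppose $(X_{1},X_{2})$ is MVN and define $Y_{1}=X_{1}+X_{2}$ and $Y_{2}=X_{1}-X_{2}$. For any constants $t_{1},t_{2}$:

$$
t_{1}Y_{1}+t_{2}Y_{2}=t_{1}(X_{1}+X_{2})+t_{2}(X_{1}-X_{2})=(t_{1}+t_{2})X_{1}+(t_{1}-t_{2})X_{2}
$$

The right-hand side is a linear combination of $X_{1}$ and $X_{2}$, so it is Normal, and hence $(Y_{1},Y_{2})$ is MVN. The same regrouping works whenever each $Y_{i}$ is itself a linear combination of the $X_{j}$'s.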

If $(X,Y)$ is Bivariate Normal and $Cov(X,Y)=0$, then $X$ and $Y$ are independent. Remember that in the general case, zero covariance does not imply independence. Within the Bivariate Normal, however, it does.
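The contrast between the two cases can be checked by simulation. The sketch below compares an estimate of $P(X>1,Y>1)$ with the product $P(X>1)P(Y>1)$, which must agree if $X$ and $Y$ are independent. Case 1 is a Bivariate Normal with zero covariance. Case 2 uses the pair $(X,SX)$ with $S$ an independent random sign, which has Normal marginals and zero covariance but is not Bivariate Normal. The helper `joint_vs_product` and the threshold of 1 are hypothetical choices for this illustration, and agreement at one threshold is only suggestive of independence, not a proof.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400_000

def joint_vs_product(x, y, c=1.0):
    """Estimate (P(X > c, Y > c), P(X > c) * P(Y > c)) from samples."""
    joint = np.mean((x > c) & (y > c))
    product = np.mean(x > c) * np.mean(y > c)
    return joint, product

# Case 1: Bivariate Normal with zero covariance.
bvn = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], size=n)
bvn_joint, bvn_product = joint_vs_product(bvn[:, 0], bvn[:, 1])

# Case 2: Normal marginals and zero covariance, but NOT Bivariate Normal.
x = rng.standard_normal(n)
s = rng.choice([-1.0, 1.0], size=n)  # random sign, independent of x
y = s * x
dep_cov = np.cov(x, y)[0, 1]
dep_joint, dep_product = joint_vs_product(x, y)

print(bvn_joint, bvn_product)  # nearly equal: consistent with independence
print(dep_cov)                 # near 0
print(dep_joint, dep_product)  # clearly different: X and Y are dependent
```

In Case 2, knowing $X$ pins down $|Y|$, so the joint probability is far from the product even though the covariance is essentially zero.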