Multivariate Normal Random Variables

Multivariate Normal Random Variables are extraordinarily convenient.

The probability density of a multivariate normal random variable is proportional to the following:

$f(X) \propto e^{-\frac{1}{2}\left(X - \mu\right)^T \Sigma^{-1} \left(X - \mu\right)}$

where $X$ is an n-dimensional state vector, $\mu$ is the mean of the distribution, and $\Sigma$ is an n by n covariance matrix. Pictorially, a 2-D density might be represented like this:

The contour lines show probability levels dropping off around the mean (blue x). This distribution is entirely defined by two quantities: $\mu$, which gives the center of the distribution, and $\Sigma$, which effectively gives the shape of the ellipses. That is, rather than carry around the functional form above, we can simply write $X \sim N(\mu, \Sigma)$ and forget the rest.
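As a quick numeric illustration of that functional form (all numbers below are made up for the example, not taken from anywhere in particular), the unnormalized density can be evaluated directly with NumPy:

```python
import numpy as np

# Made-up 2-D example: evaluate the unnormalized density
#   exp(-1/2 (X - mu)^T Sigma^{-1} (X - mu))
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

def density_unnormalized(x):
    d = x - mu
    return float(np.exp(-0.5 * d @ Sigma_inv @ d))

print(density_unnormalized(mu))                    # 1.0 at the mean
print(density_unnormalized(np.array([3.0, 4.0])))  # smaller away from it
```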

Multivariate normals are convenient for three reasons:

1. They are easy to represent – we only need a mean and covariance matrix
2. Linear transformations of normals are again normals
3. All operations are represented with linear algebra

First off, multivariate normals are simple to represent. This ends up being a big deal for functions on very high dimensional spaces. Imagine writing down a general function on 1000 variables.

Second, linear functions of normals are again normals. This is huge. For example this means that we could project the image above to one of the coordinate axes (or any axis) and get out our old friend the bell curve. As we work on our random variables the three conveniences remain true.

Third, the computation to perform these linear transformations of random variables is done solely through linear algebra on the mean and covariance matrices. Fortunately, linear algebra is something about which we know quite a bit.

So, as long as we’re willing to say that our variables are normally distributed (which is often not far from the truth) we can efficiently represent and compute on huge spaces of interconnected variables.
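The second convenience, closure under linear maps, is easy to check numerically: if $X \sim N(\mu, \Sigma)$ then $HX \sim N(H\mu, H\Sigma H^T)$. A small NumPy sketch with invented numbers (a hypothetical $\mu$, $\Sigma$, and $H$, not from the text) compares the exact transformed mean and covariance against a Monte Carlo estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up numbers for illustration
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 1.5]])
H = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 0.0]])      # a 2x3 linear operator

# If X ~ N(mu, Sigma), then H X ~ N(H mu, H Sigma H^T)
mu_y = H @ mu
Sigma_y = H @ Sigma @ H.T

# Monte Carlo sanity check of the closure under linear maps
samples = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = samples @ H.T
print(mu_y, Y.mean(axis=0))          # the two means should be close
print(Sigma_y)
print(np.cov(Y.T))                   # and so should the covariances
```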

Multivariate Normals (MVNs) have been a goal of mine for some time while working on this project. They’re where this project starts to intersect with my actual work. I do lots of manipulations on MVNs and would like to stop dealing with all the matrix algebra.

In order to build them correctly it was clear I would need a relatively powerful symbolic matrix expression system. I’ve been working on something over at this branch.

Now we can represent symbolic matrices and, using them, multivariate normal random variables:

# Let's make a Multivariate Normal Random Variable
>>> mu = MatrixSymbol('mu', n, 1) # n by 1 mean vector
>>> Sigma = MatrixSymbol('Sigma', n, n) # n by n covariance matrix
>>> X = Normal(mu, Sigma, 'X') # a multivariate normal random variable

# Density is represented just by the mean and covariance
>>> Density(X)
(μ, Σ)

>>> H = MatrixSymbol('H', k, n) # A linear operator
>>> Density(H*X) # What is the density of X after being transformed by H?
(H⋅μ, H⋅Σ⋅H')

# Let's make some measurement noise
>>> zerok = ZeroMatrix(k, 1) # mean zero
>>> R = MatrixSymbol('R', k, k) # symbolic covariance matrix
>>> noise = Normal(zerok, R, 'eta')

# Density after noise added in?
>>> Density(H*X + noise) # This is a Block matrix
⎛[H  I]⋅⎡μ⎤, [H  I]⋅⎡Σ  0⎤⋅⎡H'⎤⎞
⎜       ⎢ ⎥        ⎢    ⎥ ⎢  ⎥⎟
⎝       ⎣0⎦        ⎣0  R⎦ ⎣I ⎦⎠

# When we collapse the above expression it looks much nicer
>>> block_collapse(Density(H*X + noise))
(H⋅μ, R + H⋅Σ⋅H')
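That collapse is just block matrix algebra: $[H\;\; I] \, \mathrm{blkdiag}(\Sigma, R) \, [H\;\; I]^T = H \Sigma H^T + R$. A quick numeric check of the identity in plain NumPy, with arbitrary made-up sizes and random SPD covariances:

```python
import numpy as np

# Numeric check of the block-collapse identity:
#   [H  I] * blkdiag(Sigma, R) * [H  I]^T  ==  H Sigma H^T + R
n, k = 3, 2
rng = np.random.default_rng(1)
A = rng.standard_normal((n, n)); Sigma = A @ A.T   # random SPD covariance
B = rng.standard_normal((k, k)); R = B @ B.T
H = rng.standard_normal((k, n))

M = np.hstack([H, np.eye(k)])                      # the [H  I] block row
blk = np.block([[Sigma, np.zeros((n, k))],
                [np.zeros((k, n)), R]])
print(np.allclose(M @ blk @ M.T, H @ Sigma @ H.T + R))  # True
```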

# Now let's imagine that we observe some value of HX+noise,
# what does that tell us about X? How does our prior distribution change?
>>> data = MatrixSymbol('data', k, 1)
>>> Density(X, Eq(H*X + noise, data))  # Density of X given HX+noise == data
# I'm switching to the latex printer for this


$\left[\begin{smallmatrix}\mathbb{I} & \mathbf{0}\end{smallmatrix}\right] \left(\left[\begin{smallmatrix}\Sigma & \mathbf{0}\\\mathbf{0} & R\end{smallmatrix}\right] \left[\begin{smallmatrix}H^T\\\mathbb{I}\end{smallmatrix}\right] \left(\left[\begin{smallmatrix}H & \mathbb{I}\end{smallmatrix}\right] \left[\begin{smallmatrix}\Sigma & \mathbf{0}\\\mathbf{0} & R\end{smallmatrix}\right] \left[\begin{smallmatrix}H^T\\\mathbb{I}\end{smallmatrix}\right]\right)^{-1} \left( data - \left[\begin{smallmatrix}H & \mathbb{I}\end{smallmatrix}\right] \left[\begin{smallmatrix}\mu\\\mathbf{0}\end{smallmatrix}\right]\right) + \left[\begin{smallmatrix}\mu\\\mathbf{0}\end{smallmatrix}\right]\right)$

$\left[\begin{smallmatrix}\mathbb{I} & \mathbf{0}\end{smallmatrix}\right] \left(\mathbb{I} - \left[\begin{smallmatrix}\Sigma & \mathbf{0}\\\mathbf{0} & R\end{smallmatrix}\right] \left[\begin{smallmatrix}H^T\\\mathbb{I}\end{smallmatrix}\right] \left(\left[\begin{smallmatrix}H & \mathbb{I}\end{smallmatrix}\right] \left[\begin{smallmatrix}\Sigma & \mathbf{0}\\\mathbf{0} & R\end{smallmatrix}\right] \left[\begin{smallmatrix}H^T\\\mathbb{I}\end{smallmatrix}\right]\right)^{-1} \left[\begin{smallmatrix}H & \mathbb{I}\end{smallmatrix}\right]\right) \left[\begin{smallmatrix}\Sigma & \mathbf{0}\\\mathbf{0} & R\end{smallmatrix}\right] \left[\begin{smallmatrix}\mathbb{I}\\\mathbf{0}\end{smallmatrix}\right]$

# Again, this block matrix expression can be collapsed to the following
>>> block_collapse(Density(X, Eq(H*X+noise, data) ))
μ + Σ⋅H'⋅(R + H⋅Σ⋅H')^-1⋅(-H⋅μ + data) ,
(I + -Σ⋅H'⋅(R + H⋅Σ⋅H')^-1⋅H)⋅Σ


$\mu + \Sigma H^T \left(R + H \Sigma H^T\right)^{-1} \left(data - H \mu\right)$

$\left(\mathbb{I} - \Sigma H^T \left(R + H \Sigma H^T\right)^{-1} H\right) \Sigma$
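These collapsed expressions are the standard Gaussian conditioning (Kalman update) formulas, and they are easy to sanity-check numerically. A plain-NumPy sketch, with all numbers invented for the example:

```python
import numpy as np

# Sketch of the collapsed posterior formulas:
#   mean_post = mu + Sigma H^T (R + H Sigma H^T)^{-1} (data - H mu)
#   cov_post  = (I - Sigma H^T (R + H Sigma H^T)^{-1} H) Sigma
def condition(mu, Sigma, H, R, data):
    S = R + H @ Sigma @ H.T             # innovation covariance
    K = Sigma @ H.T @ np.linalg.inv(S)  # Kalman gain
    mean_post = mu + K @ (data - H @ mu)
    cov_post = (np.eye(len(mu)) - K @ H) @ Sigma
    return mean_post, cov_post

mu = np.array([0.0, 0.0])
Sigma = np.eye(2)
H = np.array([[1.0, 0.0]])   # observe the first coordinate only
R = np.array([[1e-9]])       # near-noiseless measurement
data = np.array([3.0])
m, C = condition(mu, Sigma, H, R, data)
# With a near-exact observation, the first coordinate moves to the
# data and its posterior variance collapses toward zero, while the
# unobserved coordinate is untouched.
print(m, C)
```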

This is the multivariate case of my previous post on data assimilation. Effectively all I've done here is bake in the logic behind the Kalman filter and expose it through my statistics operators (Density, Given, etc.) so that it becomes more approachable.

Some disclaimers.
1) This is all untested. Please let me know if something is wrong. Already I see an error with the latex printing.
2) For organizational reasons it seems unlikely that Matrix Expressions will make it into SymPy in their current form. As a result this code probably won’t make it into SymPy any time soon.

My active branch is over here:
https://github.com/mrocklin/sympy/tree/mvn_rv/

with the multivariate normal code here:
https://github.com/mrocklin/sympy/tree/mvn_rv/sympy/statistics/mvnrv.py

The matrices live here:
https://github.com/mrocklin/sympy/tree/matrix_expr/sympy/matrices

PhD student studying Computational Mathematics at the University of Chicago.

6 Responses to Multivariate Normal Random Variables

1. mrocklin says:

I should also note that this is as far out in terms of functionality as I ever intended to go. There’s still a lot of work to do but I don’t intend to implement anything new, just to fill in the missing pieces of what already exists.

Expect boring blog posts from here on out :)

2. goebbe says:

Wow, this all looks great. Keep up the good work.
If this cannot be integrated properly into Sympy, I would welcome if it could at least be distributed via Sympy. It would be a pity if the matrix expression would just disappear from the surface in a 3rd party package. But there seems to be quite some movement/discussion around matrices in Sympy. Perhaps this will lead to a solution with similar properties/possibilities in the long run?

• mrocklin says:

Thanks for the interest goebbe!

I’m sure that matrix expressions will make their way into SymPy at some point, it’s just not clear how they’ll be organized to fit into the greater codebase. There is, as you say, a lot of interest behind symbolic linear algebra so I’m confident that this topic won’t go away. It’s worth being patient here to ensure that the solution is a good one.

“Perhaps this will lead to a solution with similar properties/possibilities in the long run?”
I hope so! This is why experimental code is good. We can see the possibilities and problems before committing to a big change.
