# Eigen-vesting III. Random Matrix Filtering in Finance

At a talk, Alan Edelman mentioned that some students who attended his class on random matrix theory (notes) at MIT dropped out and started a hedge fund. They’re apparently doing pretty well today. In this post, we’re going to introduce random matrix theory and some of its most important applications to finance.

One area of random matrix theory is understanding the distribution of the eigenvalues in a (large) random matrix. Under certain assumptions, eigenvalues found in a predicted theoretical range are thought of as due to random interactions in the data. Because of this, we can discard eigenvalues found in such a predicted range to (try to) “filter” randomness from our data, leaving us with only actual trends. It’s easy to see how this could be a big deal.

The random matrix we are going to talk about is called a Wishart matrix. It is a square matrix of the form $W = XX^*$, where $X$ is an $N\times T$ matrix whose entries are i.i.d. standard Normal random variables, i.e. $x_{i,j}\sim\mathcal{N}(0,1)$. Note that for a real-valued matrix, $X^* = X^t$, the transpose of $X$. A more general notion allows all entries to have a fixed variance $\sigma^2$, but without loss of generality, we’ll take it to equal 1.
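As a quick concrete example, here is a minimal numpy sketch of building such a matrix (the sizes $N$ and $T$ are arbitrary illustration choices):

```python
import numpy as np

# Build a Wishart matrix W = X X^t from an N x T matrix X of i.i.d.
# standard Normal entries (N and T are arbitrary illustration sizes).
rng = np.random.default_rng(0)
N, T = 50, 200
X = rng.standard_normal((N, T))
W = X @ X.T  # N x N, symmetric and positive semi-definite
```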

In the case that $T\geq N$ and $N$ is large, the eigenvalues of such a matrix $W$ (as defined above) will approximately follow the Marchenko-Pastur distribution. A description and precise theorem may be found here.

# An Important Example: Correlation Matrices

A correlation matrix is an example of a Wishart matrix when the data matrix $X$, containing $T$ data points for each of $N$ variables, has i.i.d. Normal entries. The next code will create a correlation matrix from such a matrix and display its eigenvalues with the theoretical distribution superimposed.
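A sketch of that experiment in numpy (the sizes, seed, and the `marchenko_pastur_pdf` helper are my own choices; the plotting call is omitted so the snippet stays self-contained):

```python
import numpy as np

# The empirical correlation matrix of N i.i.d. standard Normal series of
# length T has eigenvalues that approximately follow the Marchenko-Pastur
# law with Q = T / N and sigma^2 = 1.
rng = np.random.default_rng(1)
N, T = 100, 500
X = rng.standard_normal((N, T))
C = np.corrcoef(X)               # N x N; numpy treats rows as variables
eigvals = np.linalg.eigvalsh(C)

Q = T / N
lam_min = (1 - np.sqrt(1 / Q)) ** 2  # theoretical lower edge
lam_max = (1 + np.sqrt(1 / Q)) ** 2  # theoretical upper edge

def marchenko_pastur_pdf(lam, Q, sigma2=1.0):
    """Marchenko-Pastur density, supported on [lam_min, lam_max]."""
    a = sigma2 * (1 - np.sqrt(1 / Q)) ** 2
    b = sigma2 * (1 + np.sqrt(1 / Q)) ** 2
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    pdf = np.zeros_like(lam)
    inside = (lam > a) & (lam < b)
    pdf[inside] = Q / (2 * np.pi * sigma2 * lam[inside]) * np.sqrt(
        (b - lam[inside]) * (lam[inside] - a))
    return pdf

# Nearly all sample eigenvalues should land inside the predicted band.
inside_frac = np.mean((eigvals > lam_min) & (eigvals < lam_max))
```

To reproduce the plot, one would histogram `eigvals` (e.g. with matplotlib) and overlay `marchenko_pastur_pdf` on the same axes.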

# Correlation Filtering

Although it’s not always true, it is common to model the log of the daily returns of a stock price as a normal distribution. This approximation holds well enough that much of the modern theory of finance was based around it. (Even today, the basic concepts still hold, with the Normal being replaced with a more skewed distribution.) Therefore, the correlation matrix of the log returns partially falls under the example above.

The idea behind correlation filtering is this:

1. Calculate the eigenvalues of the correlation matrix.
2. Set every eigenvalue that falls within the theoretical range given by the Marchenko-Pastur distribution to 0, then reconstruct the matrix.

Any eigenvalue found in that range is considered information due to randomness, and so it is thrown out. Let’s go through an example. Because of the requirement that $N$ and $T$ be large, this time we are going to use a pre-downloaded dataset of stock prices.
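The two steps can be sketched as a single function; the name `filter_correlation`, the default $\sigma^2 = 1$, and the reset of the diagonal back to 1 (a convention the post returns to later) are my choices, not a standard API:

```python
import numpy as np

# Sketch of the two-step recipe: zero every eigenvalue inside the
# Marchenko-Pastur band [lam_min, lam_max], rebuild the matrix from the
# surviving eigenpairs, and restore the unit diagonal.
def filter_correlation(C, Q, sigma2=1.0):
    lam_min = sigma2 * (1 - np.sqrt(1 / Q)) ** 2
    lam_max = sigma2 * (1 + np.sqrt(1 / Q)) ** 2
    eigvals, eigvecs = np.linalg.eigh(C)
    # Step 2: eigenvalues inside the theoretical range count as noise.
    eigvals[(eigvals >= lam_min) & (eigvals <= lam_max)] = 0.0
    C_filtered = eigvecs @ np.diag(eigvals) @ eigvecs.T
    np.fill_diagonal(C_filtered, 1.0)  # keep the correlation interpretation
    return C_filtered
```

On purely random data every eigenvalue falls in the band, so the filtered matrix collapses to (nearly) the identity.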

## The Data

First, we pull data from Yahoo for a large universe of stocks. The list of tickers was taken from the list posted here and cleaned into the file used below. Note that by default, pandas places the data in a $T \times N$ format, so we will use the built-in functions to do things like calculate the correlation/covariance matrix.

## Log Transformation

The cleaning steps are largely the same as in the first post on Eigen-vesting, but this time we have to convert to the log returns. This extra step will be added without fanfare. The explanation for the steps can be found in the previous post.

The log transform is $r_{i,t} \mapsto \log(r_{i,t} + 1)$.

Conveniently, because the Taylor expansion of $\log(1+x)$ around $x=0$ is $$\sum_{n=1}^\infty (-1)^{n-1}\frac{x^n}{n},$$ small returns are left nearly unchanged: $\log(1+x)\approx x$. Note: we only need to do this for the in-sample data points.
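One convenient way to apply the transform, assuming the simple returns are already in a numpy array (the sample values below are made up for illustration):

```python
import numpy as np

# Convert simple returns r to log returns log(1 + r); np.log1p is
# numerically accurate for the small daily returns typical here.
simple_returns = np.array([0.010, -0.005, 0.002])
log_returns = np.log1p(simple_returns)

# For small r, log(1 + r) ~ r, so the values barely change.
```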

## The Eigenvalues of the Correlation Matrix

Finally! We’ve cleaned the data. That’s really the hardest part… now, the theory breaks down somewhat when $T$ is no longer greater than $N$, but we’ll manage. Pandas, unlike numpy, computes correlations over the columns rather than the rows, so we’re going to just use numpy’s corrcoef().

We’re going to show a cute computation that gives more evidence that the standardized log returns approximately satisfy the assumptions of a Wishart matrix. As an experiment, we’re going to take the matrix we just made and shuffle each row (i.e. the time series for each return). If each row were a Normal(0,1) series, the entries should now look i.i.d., because any relationship between rows (in time) will have been shuffled away. Take a look at the distribution:
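Here is a sketch of that shuffle experiment on synthetic data (the real returns matrix isn't reproduced here): the rows share a common "market" factor, so the raw spectrum has an eigenvalue far above the Marchenko-Pastur edge, while shuffling each row destroys the time alignment and the spectrum collapses back inside the band.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 60, 300
factor = rng.standard_normal(T)                      # common factor in time
returns = 0.5 * factor + rng.standard_normal((N, T))
returns = (returns - returns.mean(axis=1, keepdims=True)) \
    / returns.std(axis=1, keepdims=True)             # standardize each row

shuffled = returns.copy()
for row in shuffled:
    rng.shuffle(row)             # independent in-place permutation per row

eigvals_raw = np.linalg.eigvalsh(np.corrcoef(returns))
eigvals_shuffled = np.linalg.eigvalsh(np.corrcoef(shuffled))
lam_max = (1 + np.sqrt(N / T)) ** 2                  # MP upper edge, Q = T/N
```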

Wow! So it looks like all of our assumptions approximately hold when there’s no time structure, so any deviation implies non-random structure in the data. Time to filter the matrix… this part in many ways amounts to voodoo. As advised by Merrill Lynch, the theoretical distribution may not quite fit your data because of sampling error, so you can “tune” the $Q$ and $\sigma$ parameters (see the function definition above) to get a better fit. The fit looks good enough to me without tinkering for our exercise.

```
[ 2.07255233  2.23927193  4.80235965]
```


Only 36 of the 69 eigenvalues remain! You’ll also notice that one eigenvalue is larger than the rest by a decent margin. This is called the market eigenvalue, and perhaps we’ll talk more about it in the future. Now, let’s “filter” the matrix and put it back together.

One thing I find curious is that the literature advises setting the filtered correlation matrix’s diagonal entries to 1 to preserve the typical interpretation of a correlation matrix (for example, [here](http://polymer.bu.edu/hes/articles/pgrags02.pdf)). I attempted other methods, like a similarity transform to make them all 1 or directly scaling the eigenvalues, and neither seemed to work very well.

## Comparison for Constructing the Minimum Variance Portfolio

Here we will do a quick in/out-of-sample comparison between the minimum variance portfolios of the filtered and standard covariance matrices. Let $D = \operatorname{diag}(\Sigma)$, that is, the matrix with $D_{i,i} = \Sigma_{i,i}$ and $D_{i,j} = 0$ for $i\neq j$. We will make use of the fact that the correlation matrix $C$ and the covariance matrix $\Sigma$ are related by the formula $$\Sigma = D^{1/2} C D^{1/2}.$$

By replacing $C$ with the filtered correlation matrix $\hat C$, then we have the filtered covariance matrix $\hat \Sigma$.
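The comparison can be sketched as follows; the closed-form minimum-variance solution $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^t \Sigma^{-1}\mathbf{1})$ and both helper names are my own choices, since the post's actual solver isn't shown:

```python
import numpy as np

def min_variance_weights(Sigma):
    """Closed-form minimum-variance weights w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    ones = np.ones(Sigma.shape[0])
    w = np.linalg.solve(Sigma, ones)  # Sigma^{-1} 1
    return w / w.sum()                # normalize so weights sum to 1

def covariance_from_correlation(C, variances):
    """Rebuild Sigma = D^{1/2} C D^{1/2} from a (filtered) correlation matrix."""
    d = np.sqrt(variances)            # D^{1/2} as a vector
    return C * np.outer(d, d)         # elementwise D^{1/2} C D^{1/2}
```

Passing the filtered correlation matrix to `covariance_from_correlation` with the original variances yields $\hat\Sigma$, and its minimum-variance weights can then be compared against those from the raw covariance.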

```
             Investment Weight
BANSWRAS.NS           0.014983
ETEC.OB               0.005088
ALRN                  0.000680
CE2.DE                0.012353
SUL.AX                0.017722
```

Now, let’s plot their returns over time. Since both portfolios contain short sales, we’re going to remove the short positions and redistribute their weight.
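One simple way to do that removal, assuming the weights live in a numpy array (the helper name is mine):

```python
import numpy as np

def long_only(weights):
    """Drop short (negative) positions and redistribute their weight
    proportionally among the remaining long positions."""
    w = np.clip(weights, 0.0, None)  # zero out shorts
    return w / w.sum()               # renormalize so the weights sum to 1
```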

Of course, the covariance matrix is time-dependent regardless of the filtering, and the filtering changes over time. As there is no free lunch, it’s possible for the unfiltered portfolio to beat the filtered portfolio on occasion, but if it does, it’s probably due to random chance.

Written on March 30, 2016