Matrix Pseudo-Inverse Using Iterative Gradient

I ran across an obscure but interesting research paper titled “An iterative method to compute Moore-Penrose inverse based on gradient maximal convergence rate” by Xingping Sheng and Tao Wang. It was published in 2013.

Whoa. Let me try to explain.

Many machine learning algorithms use a matrix and require you to compute the inverse of that matrix. Unfortunately, in many situations the matrix you have doesn't have an inverse. For example, the matrix

  3   2
  6   4

does not have an inverse (the second row is just twice the first row, so the determinant is zero). In the 1950s, mathematicians came up with the idea of the pseudo-inverse of a matrix for use when you can't compute a true inverse. There are several kinds of pseudo-inverses. The most common is called the Moore-Penrose inverse, which I'll just call the pseudo-inverse.
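As a quick sketch (my own check, not code from the post), NumPy will refuse to compute a true inverse of that matrix, but the pseudo-inverse always exists:

```python
import numpy as np

# the example matrix: the second row is 2x the first, so it is singular
A = np.array([[3.0, 2.0],
              [6.0, 4.0]])

try:
    np.linalg.inv(A)  # a true inverse does not exist
except np.linalg.LinAlgError as e:
    print("no inverse:", e)

print(np.linalg.pinv(A))  # but the pseudo-inverse exists
```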

Computing a pseudo-inverse is extremely difficult. There are dozens of algorithms to do so. (The fact that there are dozens of ways to compute the pseudo-inverse of a matrix tells you the problem must be very difficult; otherwise there'd just be one "good" algorithm.)

The most common technique used to compute a pseudo-inverse is based on singular value decomposition (SVD). Computing an SVD is crazy difficult too, so there are many algorithms to do that. SVD techniques are usually efficient, but they're all very ugly to me.
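To make the SVD route concrete, here's roughly the standard recipe that library functions like pinv() follow (my sketch, not the blog's code): decompose the matrix, invert only the nonzero singular values, and reassemble.

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [6.0, 4.0]])

# SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A)

# invert only the significant singular values; ones at (numerical) zero stay zero
tol = max(A.shape) * np.finfo(float).eps * s.max()
s_inv = np.array([1.0 / x if x > tol else 0.0 for x in s])

# pseudo-inverse: A+ = V @ diag(s+) @ U^T
A_pinv = Vt.T @ np.diag(s_inv) @ U.T
print(A_pinv)
```

For the singular example matrix, one of the two singular values is zero, which is exactly why the tolerance test matters.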

A different family of techniques for computing a pseudo-inverse is iterative techniques: basically, looping so that each iteration gets you closer and closer to the correct pseudo-inverse. There are dozens of iterative techniques, too. They're less efficient than SVD but somewhat easier to implement.

The 2013 research paper I found presented a new iterative technique that I thought was very elegant. The idea is to use the calculus gradient of a clever error function and loop, getting closer and closer to the true pseudo-inverse. The reason the idea appealed to me is that it's very clean and elegant, and closely resembles iterative techniques for training a neural network — techniques I am very familiar with.
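I won't reproduce the paper's exact update rule here (its contribution is choosing a convergence-rate-optimal step each iteration), but the flavor of a gradient-based iteration can be sketched with a simplified stand-in: plain fixed-step gradient descent on the error function f(X) = 0.5 * ||AX - I||^2. This is my simplification, not the paper's algorithm.

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [6.0, 4.0]])

# gradient of f(X) = 0.5 * ||A @ X - I||_F^2 with respect to X is A.T @ (A @ X - I)
# starting at X = 0 keeps every iterate in the row space of A, so the loop
# converges to the minimum-norm solution, which is the pseudo-inverse
lr = 1.0 / np.trace(A.T @ A)  # safe fixed step: 1 / (sum of squared singular values)
X = np.zeros((2, 2))
for _ in range(500):
    X = X + lr * A.T @ (np.eye(2) - A @ X)

print(X)  # converges to the pseudo-inverse of A
```

The step size must satisfy lr < 2 / (largest squared singular value) for the loop to converge; using 1 over the trace of A^T A is a cheap way to guarantee that without computing an SVD.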


The key formula from the research paper. If you don’t read research papers like this on a regular basis, the equations look like gibberish. But the equations are actually quite clean and simple.


To summarize: There are many ways to compute the pseudo-inverse of a matrix. The most common approach is to use singular value decomposition; there are many ways to do SVD, and all of them are efficient but extremely difficult to implement. The second most common approach is to use an iterative technique; there are many of these too, but the research paper's idea is new and elegant.


Pseudo-inverse using the built-in pinv() function in the NumPy library

I coded up a demo. First I set up a simple test case using the non-invertible matrix above, and I got the pseudo-inverse using the NumPy library's pinv() function. Then I coded up the research paper algorithm using raw C# (no external libraries), and voila! I got the same result.
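The blog shows a screenshot of the NumPy half of the demo; a minimal version of that check looks like this (my reconstruction, with a Penrose-condition sanity check added):

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [6.0, 4.0]])

Apinv = np.linalg.pinv(A)
print(Apinv)
# approximately [[0.0462, 0.0923],
#                [0.0308, 0.0615]]   (i.e., [[3, 6], [2, 4]] / 65)

# sanity check: a pseudo-inverse satisfies the Penrose condition A @ A+ @ A = A
print(np.allclose(A @ Apinv @ A, A))  # True
```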


Implementation of the research paper iterative technique, using C#, which gives the same result.

Well, that was interesting, fun, and useful. If I ever have to compute the pseudo-inverse of a matrix using C# or some other language that doesn’t have a built-in library to do so, I can pull up my demo code.


This entry was posted in Machine Learning.