Computing the Inverse of a Square Symmetric Positive Definite Matrix with Cholesky Decomposition using NumPy

In machine learning, there are two common scenarios where you want to compute the inverse of a square symmetric positive definite matrix. (Note: a positive definite matrix is, by the usual definition, always square and symmetric, so “square symmetric positive definite” is a bit redundant terminology.)

The first scenario is computing the pseudo-inverse using the normal equations: pinv(X) = inv(Xt * X) * Xt, where X is a design matrix of predictors (a design matrix has a leading column of 1.0 values added to account for a bias term) and Xt is the transpose of X. If you add a small conditioning constant (like 1.0e-8) to the diagonal of (Xt * X), the result is square symmetric positive definite. This scenario arises when computing the weights and bias of a linear model, typically in linear regression or quadratic regression.
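
As a concrete example, here is a minimal sketch of the normal-equations pseudo-inverse with a conditioning constant. The function name pinv_normal_eq() is mine, not from any library:

```python
import numpy as np

def pinv_normal_eq(X, cond=1.0e-8):
  # pseudo-inverse via normal equations:
  # pinv(X) = inv(Xt * X + cond * I) * Xt
  XtX = X.T @ X + cond * np.eye(X.shape[1])
  return np.linalg.inv(XtX) @ X.T

X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0]])   # leading 1s column for the bias term
y = np.array([5.0, 7.0, 11.0])  # exactly y = 1 + 2*x
w = pinv_normal_eq(X) @ y    # recovers approx [1.0, 2.0]
```

Because (Xt * X + cond * I) is symmetric positive definite, the inv() call here could itself be replaced by the Cholesky-based inverse described below.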

The second scenario is computing the inverse of a kernel matrix. A kernel matrix K is a square matrix where the entries at [i,j] and [j,i] are a kernel function (usually the radial basis function, RBF) applied to training data items [i] and [j]. If you add a small conditioning constant (like 1.0e-8) to the diagonal of K, the result is square symmetric positive definite. This scenario arises when finding the model weights for kernel ridge regression or the closely related Gaussian process regression.
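
A minimal sketch of building a conditioned RBF kernel matrix (the function name and the gamma parameter value are mine, for illustration):

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0, cond=1.0e-8):
  # K[i,j] = exp(-gamma * ||x_i - x_j||^2)
  n = len(X)
  K = np.zeros((n, n))
  for i in range(n):
    for j in range(n):
      d2 = np.sum((X[i] - X[j])**2)
      K[i,j] = np.exp(-gamma * d2)
  for i in range(n):
    K[i,i] += cond  # conditioning makes K positive definite
  return K

X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]])
K = rbf_kernel_matrix(X)  # 3x3, symmetric, positive definite
```

The conditioned K is exactly the kind of matrix the Cholesky inverse below applies to.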

If you have a square symmetric positive definite matrix, you can compute its inverse using Cholesky decomposition. If matrix A is square symmetric positive definite, the Cholesky decomposition gives you a square matrix L, which is lower triangular. If you compute L * Lt, where Lt is the transpose of L, you get A.
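
For example, using a tiny hand-picked symmetric positive definite matrix:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])   # symmetric positive definite
L = np.linalg.cholesky(A)    # lower triangular factor
# L @ L.T reconstructs A exactly (up to floating point)
```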

If you use Cholesky decomposition and get L and Lt, then the inverse of the source matrix A is inv(Lt) * inv(L) because:

if A = L * Lt then

inv(A) = inv(L * Lt)
       = inv(Lt) * inv(L)

The inverse of Lt is the inverse of an upper triangular matrix, which has a special, simple algorithm. Likewise, the inverse of L is the inverse of a lower triangular matrix, which has a special algorithm.
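
If you don't want to implement the triangular inverse yourself, SciPy's solve_triangular() function does the same substitution. A sketch, inverting a small lower triangular matrix by solving L * X = I:

```python
import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[2.0, 0.0],
              [1.0, 3.0]])   # lower triangular
# forward substitution against the identity gives inv(L)
Linv = solve_triangular(L, np.eye(2), lower=True)
```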

I put together a demo. The key function is:

def my_cholesky_inv(A):
  # inv(A) = inv(Lt) * inv(L)
  L = np.linalg.cholesky(A)
  Lt = np.transpose(L)
  Linv = my_inv_lower_tri(L)
  Ltinv = my_inv_upper_tri(Lt)
  result = np.matmul(Ltinv, Linv)
  return result

To validate my function, I ran the source matrix through the general purpose np.linalg.inv() function and also through the SciPy cho_factor() and cho_solve() functions. The demo output:

Begin inverse using Cholesky decomp

Source matrix A =
[[0.1400 0.3100 0.1700 0.2800]
 [0.3100 0.7700 0.5200 0.6600]
 [0.1700 0.5200 0.8600 0.6800]
 [0.2800 0.6600 0.6800 0.7600]]

Inverse of A using my_cholesky_inv():
[[ 790124.6258 -197532.2265  197529.0176 -296293.5019]
 [-197532.2265   49388.9715  -49380.9388   74067.4559]
 [ 197529.0176  -49380.9388   49386.1383  -74077.9017]
 [-296293.5019   74067.4559  -74077.9017  111120.4232]]

Inverse of A using np.linalg.inv():
[[ 790124.6258 -197532.2265  197529.0176 -296293.5019]
 [-197532.2265   49388.9715  -49380.9388   74067.4559]
 [ 197529.0176  -49380.9388   49386.1383  -74077.9017]
 [-296293.5019   74067.4559  -74077.9017  111120.4232]]

Inverse of A using sp.linalg.cho_factor():
[[ 790124.6258 -197532.2265  197529.0176 -296293.5019]
 [-197532.2265   49388.9715  -49380.9388   74067.4559]
 [ 197529.0176  -49380.9388   49386.1383  -74077.9017]
 [-296293.5019   74067.4559  -74077.9017  111120.4232]]

When I examined the results with precision=8 (not shown), I was surprised at the relatively poor precision of the np.linalg.inv() function on the square symmetric positive definite demo matrix: the results agreed only to about 6 decimals. The moral is that if you have a square symmetric positive definite matrix, you should compute its inverse using Cholesky decomposition instead of a general purpose matrix inverse function.
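
One simple way to quantify the precision of a computed inverse (a helper I use here for illustration, not part of the demo) is to measure how far A @ Ainv deviates from the identity:

```python
import numpy as np

def inv_error(A, Ainv):
  # max absolute deviation of A @ Ainv from the identity matrix
  n = len(A)
  return np.max(np.abs(A @ Ainv - np.eye(n)))
```

Comparing inv_error(A, my_cholesky_inv(A)) against inv_error(A, np.linalg.inv(A)) makes the precision difference concrete.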



One of the nice characteristics of the NumPy and SciPy libraries is that they have a consistent interface that allows various functions to be combined and chained together.

When my son was a young boy, he loved to play with his Thomas the Tank Engine set. A lot of adult men take those sets quite seriously. A fellow from the Pacific Northwest, Tom Stephenson, created a Web site (wtrak.org) where he describes a set of track modules that can be connected.


Demo program:

# cholesky_inverse_demo.py

# inverse of a (square symmetric) positive definite matrix
# using Cholesky decomposition

import numpy as np

def my_cholesky_inv(A):
  # inv(A) = inv(Lt) * inv(L)
  L = np.linalg.cholesky(A)
  Lt = np.transpose(L)
  Linv = my_inv_lower_tri(L)
  Ltinv = my_inv_upper_tri(Lt)
  result = np.matmul(Ltinv, Linv)
  return result
  
def my_inv_lower_tri(A):
  # inverse of a lower triangular matrix
  # solves A * X = I by forward substitution
  n = len(A)
  result = np.eye(n)  # identity matrix
  for k in range(n):      # row
    for j in range(n):    # column
      for i in range(k):
        result[k,j] -= result[i,j] * A[k,i]
      result[k,j] /= A[k,k]
  return result

def my_inv_upper_tri(A):
  # inverse of an upper triangular matrix
  # solves X * A = I, one column at a time
  n = len(A)
  result = np.eye(n)  # identity matrix
  for k in range(n):      # column
    for j in range(n):    # row
      for i in range(k):
        result[j,k] -= result[j,i] * A[i,k]
      result[j,k] /= A[k,k]
  return result

print("\nBegin inverse using Cholesky decomp ")
np.set_printoptions(precision=4, suppress=True,
  floatmode='fixed')

# make a square symmetric positive definite matrix
B = np.array([
  [0.1, 0.2, 0.3],
  [0.4, 0.6, 0.5],
  [0.9, 0.1, 0.2],
  [0.6, 0.2, 0.6]])

A = np.matmul(B, np.transpose(B))  # A is sq symm

for i in range(len(A)):  # make it PD
  A[i,i] += 1.0e-6
print("\nSource matrix A = ")
print(A)

Ainv = my_cholesky_inv(A)
print("\nInverse of A using my_cholesky_inv(): ")
print(Ainv)

Ainv = np.linalg.inv(A)
print("\nInverse of A using np.linalg.inv(): ")
print(Ainv)

# scipy has helper functions
from scipy.linalg import cho_solve, cho_factor
cfac = cho_factor(A, lower=True)  # (factor, lower) tuple
I = np.eye(len(A))
Ainv = cho_solve(cfac, I)
print("\nInverse of A using sp.linalg.cho_factor(): ")
print(Ainv)
This entry was posted in Machine Learning.