Matrix Inverse Using Cholesky Decomposition With C#

Bottom line: I dusted off my old C# implementation of a function to compute the inverse of a matrix using Cholesky decomposition and made a few tweaks. I was happy with the result.

The Wikipedia article on matrix inverse lists 14 different algorithms, and each algorithm has multiple variations, and each variation has dozens of implementation alternatives.

Seven examples include 1.) LUP decomposition algorithm (Crout version and others), 2.) QR decomposition algorithm (Householder version and others), 3.) SVD decomposition algorithm (Jacobi version and others), 4.) Cholesky decomposition (Banachiewicz version and others), 5.) Cayley-Hamilton algorithm (Faddeev–LeVerrier version and others), 6.) Newton Iteration, 7.) Gauss-Jordan elimination. Whew! The fact that there are so many different techniques is a direct indication of how tricky it is to compute a matrix inverse.

Computing an inverse using Cholesky decomposition is a special case in the sense that it only works for matices that are square, symmetric, positive-definite (PD). Explaining PD matrices is complicated, but in machine learning they come from two common scenarios.

First, if you multiply a matrix times its transpose, and then add a small constant to the diagonal elements, you get a PD matrix. This scenario occurs when training a linear model (linear regression, quadratic regression).

Second, if you construct a kernel matrix (another complicated topic), and add a small constant to the diagonal elements, you get a PD matrix. This scenario occurs when training a kernel ridge regression model or a Gaussian process regression model.

The output of my demo is:

Begin matrix inverse using Cholesky decomposition

Generating matrix A:
   4.0   7.0   1.0
   6.0   0.0   3.0
   8.0   1.0   9.0
   2.0   5.0  -6.0

Conditioned matrix At * A (is positive-definite):
   120.0001    46.0000    82.0000
    46.0000    75.0001   -14.0000
    82.0000   -14.0000   127.0001

Inverse of AtA:
     0.0387    -0.0290    -0.0282
    -0.0290     0.0354     0.0226
    -0.0282     0.0226     0.0286

The check matrix (AtA * inv) should be I:
     1.0000    -0.0000     0.0000
     0.0000     1.0000     0.0000
     0.0000     0.0000     1.0000

End demo

I started with a 4×3 matrix A that is not PD, so Cholesky inverse doesn’t apply. I multiplied A by its transpose and added a small constant of 0.0001 to the diagonal elements to get a PD matrix AtA.

The demo computes the inverse of AtA using Cholesky decomposition (Banachiewicz version). Then I verified the result by multiplying the inverse and the source to get the identity matrix.

Good fun on a Friday evening.

In the early days of computer science, matrix inverse would be computed into the source matrix to save memory space. This destroyed the source matrix. By the 1990s, memory wasn’t so much of an issue, and algorithms could just use a copy of the source matrix.

When I was growing up in the 1950s, I never knew what my dad would bring home after his shift at the hospital (he was a doctor). One of our childhood treasures was a hectograph set to make copies — this was long before Xerox machines existed.

The system had a tray filled with a gel. You would use special pencils that were made of dye (blue, red, green, purple) to write or draw onto a piece of paper. Then you press the design down onto the gel. The dye would transfer onto the surface of the gel. Then you place a blank piece of paper onto the gel, press down, and the design would be printed onto the paper.

From a source impression, you could usually get about 12 copies before the dye would be worn off. The blue color worked the best, but red, green and purple added a nice touch of color.

My brother Roger, sister Beth, and I would make a one-page newsletter about the neighborhood (we lived at 1525 Camino Loma street in Fullerton, California) and then we’d go around and sell them to the neighbors for five cents a copy. Good fun.

Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols.

using System;
using System.IO;
using System.Collections.Generic;

namespace MatrixInverseCholesky
{
  internal class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin matrix inverse" +
        " using Cholesky decmposition ");

      double[][] A = new double[4][];
      A[0] = new double[] { 4.0, 7.0, 1.0 };
      A[1] = new double[] { 6.0, 0.0, 3.0 };
      A[2] = new double[] { 8.0, 1.0, 9.0 };
      A[3] = new double[] { 2.0, 5.0, -6.0 };

      Console.WriteLine("\nGenerating matrix A: ");
      MatShow(A, 1, 6);

      double[][] AtA = MatProd(MatTranspose(A), A);
      for (int i = 0; i "lt" AtA.Length; ++i)
        AtA[i][i] += 1.0e-4;
      Console.WriteLine("\nConditioned matrix At * A" +
        " (is positive-definite): ");
      MatShow(AtA, 4, 11);

      double[][] AtAinv = MatInvCholesky(AtA);
      Console.WriteLine("\nInverse of AtA: ");
      MatShow(AtAinv, 4, 11);

      double[][] C = MatProd(AtAinv, AtA);
      Console.WriteLine("\nThe check matrix " +
        "(AtA * inv) should be I: ");
      MatShow(C, 4, 11);

      Console.WriteLine("\nEnd demo ");
      Console.ReadLine();

    } // Main()

    // ------------------------------------------------------

    static double[][] MatInvCholesky(double[][] A)
    {
      // A must be square, symmetric, positive definite
      // 1. decompose to L
      int m = A.Length; int n = A[0].Length;  // m == n
      double[][] L = new double[n][]; 
      for (int i = 0; i "lt" n; ++i)
        L[i] = new double[n];

      for (int i = 0; i "lt" n; ++i)
      {
        for (int j = 0; j "lte" i; ++j)
        {
          double sum = 0.0;
          for (int k = 0; k "lt" j; ++k)
            sum += L[i][k] * L[j][k];
          if (i == j)
          {
            double tmp = A[i][i] - sum;
            if (tmp "lt" 0.0)
              throw new
                Exception("decomp Cholesky fatal");
            L[i][j] = Math.Sqrt(tmp);
          }
          else
          {
            if (L[j][j] == 0.0)
              throw new
                Exception("decomp Cholesky fatal ");
            L[i][j] = (A[i][j] - sum) / L[j][j];
          }
        } // j
      } // i

      // 2. compute inverse from L
      double[][] result = new double[n][];  // Identity
      for (int i = 0; i "lt" n; ++i)
        result[i] = new double[n];
      for (int i = 0; i "lt" n; ++i)
        result[i][i] = 1.0;

      for (int k = 0; k "lt" n; ++k)
      {
        for (int j = 0; j "lt" n; j++)
        {
          for (int i = 0; i "lt" k; i++)
          {
            result[k][j] -= result[i][j] * L[k][i];
          }
          result[k][j] /= L[k][k];
        }
      }

      for (int k = n - 1; k "gte" 0; --k)
      {
        for (int j = 0; j "lt" n; j++)
        {
          for (int i = k + 1; i "lt" n; i++)
          {
            result[k][j] -= result[i][j] * L[i][k];
          }
          result[k][j] /= L[k][k];
        }
      }
      return result;
    } // MatInvCholesky()

    // ------------------------------------------------------

    // helpers for Main to set up problem, show result

    // ------------------------------------------------------

    static double[][] MatMake(int nRows, int nCols)
    {
      double[][] result = new double[nRows][];
      for (int i = 0; i "lt" nRows; ++i)
        result[i] = new double[nCols];
      return result;
    }

    // ------------------------------------------------------

    static double[][] MatTranspose(double[][] m)
    {
      int nr = m.Length;
      int nc = m[0].Length;
      double[][] result = MatMake(nc, nr);  // note
      for (int i = 0; i "lt" nr; ++i)
        for (int j = 0; j "lt" nc; ++j)
          result[j][i] = m[i][j];
      return result;
    }

    // ------------------------------------------------------

    static double[][] MatProd(double[][] A,
        double[][] B)
    {
      int aRows = A.Length;
      int aCols = A[0].Length;
      int bRows = B.Length;
      int bCols = B[0].Length;
      if (aCols != bRows)
        throw new Exception("Non-conformable matrices");

      double[][] result = MatMake(aRows, bCols);
      for (int i = 0; i "lt" aRows; ++i) // each row of A
        for (int j = 0; j "lt" bCols; ++j) // each col of B
          for (int k = 0; k "lt" aCols; ++k)
            result[i][j] += A[i][k] * B[k][j];

      return result;
    }

    // ------------------------------------------------------

    static void MatShow(double[][] M, int dec, int wid)
    {
      for (int i = 0; i "lt" M.Length; ++i)
      {
        for (int j = 0; j "lt" M[0].Length; ++j)
        {
          double v = M[i][j];
          Console.Write(v.ToString("F" + dec).
            PadLeft(wid));
        }
        Console.WriteLine("");
      }
    }

    // ------------------------------------------------------

    // function to load from file -- not used this demo

    static double[][] MatLoad(string fn, int[] usecols,
      char sep, string comment)
    {
      List"lt"double[]"gt" result = 
        new List"lt"double[]"gt"();
      string line = "";
      FileStream ifs = new FileStream(fn, FileMode.Open);
      StreamReader sr = new StreamReader(ifs);
      while ((line = sr.ReadLine()) != null)
      {
        if (line.StartsWith(comment) == true)
          continue;
        string[] tokens = line.Split(sep);
        List"lt"double"gt" lst = new List"lt"double"gt"();
        for (int j = 0; j "lt" usecols.Length; ++j)
          lst.Add(double.Parse(tokens[usecols[j]]));
        double[] row = lst.ToArray();
        result.Add(row);
      }
      sr.Close(); ifs.Close();
      return result.ToArray();
    }

    // ------------------------------------------------------

  } // Program

} // ns