Matrix Pseudo-Inverse Using Transpose and Cholesky Inverse with C#

In ordinary arithmetic, the inverse of 5.0 is 0.20 because 5.0 * 0.20 = 1.0. In matrix algebra, if you have a matrix A, it has an inverse Ainv if A * Ainv = I, where * is matrix multiplication and I is the identity matrix (1.0s on the diagonal, 0.0s off the diagonal).

A matrix inverse exists only if the source matrix A is square (same number of rows and columns). But in machine learning, there are scenarios where you want to compute the inverse of a non-square matrix. The most common technique is called the Moore-Penrose pseudo-inverse.

The standard way to compute the pseudo-inverse uses singular value decomposition (SVD), an extremely complicated algorithm. I decided to explore an alternative technique to compute the pseudo-inverse which uses matrix inverse using Cholesky decomposition — still extremely complicated but significantly easier than SVD. The technique is usually called pseudo-inverse via normal equations.

If the source matrix is A, the pseudo-inverse can be computed as inv(At * A) * At where At s the transpose of A, inv() is standard matrix inverse, and * is matrix multiplication. This technique works because At * A is a square matrix and so standard matrix inverse can be applied.

The inv() part can be any one of the over a dozen matrix inverse algorithms, each of which has many variations. But At * A has a special form called symmetric positive-definite (if a small constant is added to the diagonal elements of At * A). For such a matrix, an algorithm called Cholesky decomposition can be used to compute the inverse. There are different ways to compute Cholesky decomposition; I prefer the Banachiewicz algorithm.

To recap, one way to compute the pseudo-inverse of a matrix is called the Moore-Penrose pseudo-inverse, which is usually computed using the blisteringly complex SVD algorithm. Another was to compute a matrix pseudo-inverse of a matrix A is called pseudo-inverse via normal equations, and it can be computed as inv(At * A) * At where the inv() can be Cholesky inverse.

The disadvantage of of the simpler normal equations approach is that At * A might be a huge matrix, in which case inv(At * A) might be too difficult.

I put together a demo using the C# language. The output of my demo is:

Begin pseudo-inverse using Cholesky

Source matrix:
  1.0  4.0  2.0
  6.0  0.0  3.0
  7.0  2.0  1.0
  5.0  9.0  8.0

Moore-Penrose pseudo-inverse:
   0.0047   0.0370   0.1331  -0.0317
   0.1306  -0.1946   0.1310   0.0239
  -0.1158   0.2113  -0.2393   0.1046

I verified the result using the Python language NumPy library linalg.pinv() function.

The pseudo-inverse function code is deceptively simple looking because there are lots of helper functions, and the MatInverseCholesky() funtion is especially complicated.

static double[][] MatPseudoInvCholesky(double[][] A)
{
  // inv(At * A) * At
  double[][] At = MatTranspose(A);
  double[][] AtA = MatProduct(At, A);  // is nxn
  for (int i = 0; i "lt" AtA.Length; ++i) // less-than
    AtA[i][i] += 1.0e-8; // condition the diagonal
  double[][] AtAinv = MatInverseCholesky(AtA);
  double[][] result = MatProduct(AtAinv, At);
  return result;

  // six nested helper functions go here
}

Good fun.



I wrote this blog post in December 2025, and then stashed it away until posting it. December 2025 was the 100th anniversary of one of the most famous chess tournaments in history — Moscow 1925.

The tournament had 21 players including many of the most famous of all time who essentially created modern chess: current world champion JR Capablanca, former champion Emanuel Lasker, US champion Frank J Marshall, Richard Reti, Ernst Gruenfeld, Rudolf Spielmann, Akiba Rubinstein, Friedrich Samish. The only top player missing was future world champion Alexander Alekhine, who fled the Soviet Union in 1921 and was a persona non grata there.

The tournament was won by Soviet master Efim Bogoljubov who was at the peak of his powers. Bogoljubov played two matches for the World Chess Championship, losing to Alexander Alekhine in 1929 and 1934.

A lot of the men are holding canes — in the 1920s, canes were both a fashion statement and an effective means of self-defense.


Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols (my blog editor often chokes on symbols).

using System;
using System.IO;
using System.Collections.Generic;

// Moore-Penrose pseudo-inverse using inv(At * A) * At
// where inv() is mattrix inverse using Cholesky decomp

namespace MatrixPseudoInverseClosedCholesky
{
  internal class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin pseudo-inverse using " +
        "Cholesky ");

      double[][] A = new double[4][];
      A[0] = new double[] { 1.0, 4.0, 2.0 };
      A[1] = new double[] { 6.0, 0.0, 3.0 };
      A[2] = new double[] { 7.0, 2.0, 1.0 };
      A[3] = new double[] { 5.0, 9.0, 8.0 };

      Console.WriteLine("\nSource matrix: ");
      MatShow(A, 1, 5);

      double[][] Apinv = MatPseudoInvCholesky(A);
      Console.WriteLine("\nMoore-Penrose pseudo-inverse: ");
      MatShow(Apinv, 4, 9);

      Console.WriteLine("\nEnd demo ");
      Console.ReadLine();
    } // Main()

    static void MatShow(double[][] m, int dec, int wid)
    {
      for (int i = 0; i "lt" m.Length; ++i)
      {
        for (int j = 0; j "lt" m[0].Length; ++j)
        {
          double v = m[i][j];
          Console.Write(v.ToString("F" + dec).PadLeft(wid));
        }
        Console.WriteLine("");
      }
    }

    static double[][] MatPseudoInvCholesky(double[][] A)
    {
      // inv(At * A) * At
      double[][] At = MatTranspose(A);
      double[][] AtA = MatProduct(At, A);  // is nxn
      for (int i = 0; i "lt" AtA.Length; ++i)
        AtA[i][i] += 1.0e-8; // condition the diagonal
      double[][] AtAinv = MatInverseCholesky(AtA);
      double[][] result = MatProduct(AtAinv, At);
      return result;

      // nested helpers: MatMake(), MatTranspose(),
      // MatProduct(), MatDecompCholesky(),
      // MatInverseCholesky(), MatIdentity()

      static double[][] MatMake(int nr, int nc)
      {
        double[][] result = new double[nr][];
        for (int i = 0; i "lt" nr; ++i)
          result[i] = new double[nc];
        return result;
      }

      static double[][] MatTranspose(double[][] M)
      {
        int nr = M.Length; int nc = M[0].Length;
        double[][] result = MatMake(nc, nr);  // reversed
        for (int i = 0; i "lt" nr; ++i)
          for (int j = 0; j "lt" nc; ++j)
            result[j][i] = M[i][j];
        return result;
      }

      static double[][] MatProduct(double[][] A,
        double[][] B)
      {
        int aRows = A.Length;
        int aCols = A[0].Length;
        int bRows = B.Length;
        int bCols = B[0].Length;
        if (aCols != bRows)
          throw new Exception("Non-conformable matrices");

        double[][] result = MatMake(aRows, bCols);

        for (int i = 0; i "lt" aRows; ++i) // each row of A
          for (int j = 0; j "lt" bCols; ++j) // each col of B
            for (int k = 0; k "lt" aCols; ++k)
              result[i][j] += A[i][k] * B[k][j];

        return result;
      }

      static double[][] MatDecompCholesky(double[][] M)
      {
        // Cholesky decomposition (Banachiewicz algorithm)
        // M is nxn square, symmetric, positive definite
        int n = M.Length;
        double[][] result = MatMake(n, n);  // all 0.0
        for (int i = 0; i "lt" n; ++i)
        {
          for (int j = 0; j "lte" i; ++j)
          {
            double sum = 0.0;
            for (int k = 0; k "lt" j; ++k)
              sum += result[i][k] * result[j][k];
            if (i == j)
            {
              double tmp = M[i][i] - sum;
              if (tmp "lt" 0.0)
                throw new
                  Exception("MatDecompCholesky fatal");
              result[i][j] = Math.Sqrt(tmp);
            }
            else
            {
              if (result[j][j] == 0.0)
                throw new
                  Exception("MatDecompCholesky fatal ");
              result[i][j] =
                (1.0 / result[j][j] * (M[i][j] - sum));
            }
          } // j
        } // i
        return result;
      } // MatDecompCholesky

      static double[][] MatInverseCholesky(double[][] M)
      {
        double[][] L = MatDecompCholesky(M);  // lower tri
        int n = L.Length;
        double[][] result = MatIdentity(n);

        for (int k = 0; k "lt" n; ++k)
        {
          for (int j = 0; j "lt" n; j++)
          {
            for (int i = 0; i "lt" k; i++)
            {
              result[k][j] -= result[i][j] * L[k][i];
            }
            result[k][j] /= L[k][k];
          }
        }

        for (int k = n - 1; k "gte" 0; --k)
        {
          for (int j = 0; j "lt" n; j++)
          {
            for (int i = k + 1; i "lt" n; i++)
            {
              result[k][j] -= result[i][j] * L[i][k];
            }
            result[k][j] /= L[k][k];
          }
        }
        return result;
      }

      static double[][] MatIdentity(int n)
      {
        double[][] result = MatMake(n, n);
        for (int i = 0; i "lt" n; ++i)
          result[i][i] = 1.0;
        return result;
      }

    } // MatPseudoInvCholesky

  } // class Program

} // ns
This entry was posted in Machine Learning. Bookmark the permalink.

Leave a Reply