“Linear Regression with Pseudo-Inverse Training Using JavaScript” in Visual Studio Magazine

I wrote an article titled “Linear Regression with Pseudo-Inverse Training Using JavaScript” in the February 2026 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2026/02/02/linear-regression-with-pseudo-inverse-training-using-javascript.aspx.

The goal of a machine learning regression problem is to predict a single numeric value. For example, you might want to predict the bank account balance of an employee based on annual income, age, and years of work experience. There are roughly a dozen main regression techniques, such as nearest neighbors regression, kernel ridge regression, neural network regression, and gradient boosting regression. Linear regression is the most fundamental technique.

The form of a linear regression prediction equation is y’ = (w0 * x0) + (w1 * x1) + . . . + (wn * xn) + b, where y’ is the predicted value, the xi are predictor values, the wi are constants called model weights, and b is a constant called the bias. For example, y’ = predicted balance = (-0.54 * income) + (-0.38 * age) + (0.17 * experience) + 0.62. Training the model is the process of finding values of the weights and the bias so that predicted y values are close to the known correct target y values in a set of training data.
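
Expressed as code, a prediction is just a weighted sum plus the bias. Here is a minimal Python sketch using the example weight values above (the input x values are made up for illustration):

# hypothetical trained model from the example above
weights = [-0.54, -0.38, 0.17]   # income, age, experience
bias = 0.62

x = [0.28, 0.36, 0.50]           # made-up normalized predictor values
y_pred = sum(w * xi for (w, xi) in zip(weights, x)) + bias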

There are many different techniques to train a linear regression model. Three of the most common are 1.) stochastic gradient descent (SGD), 2.) relaxed Moore-Penrose pseudo-inverse, and 3.) closed-form pseudo-inverse via normal equations. My article explains how to implement the relaxed MP pseudo-inverse training technique.

The output of the article demo program is:

Linear regression with pseudo-inverse (QR-Householder)
 training using JavaScript.

Loading synthetic train (200) and test (40) data from file

First three train X:
 -0.1660   0.4406  -0.9998  -0.3953  -0.7065
  0.0776  -0.1616   0.3704  -0.5911   0.7562
 -0.9452   0.3409  -0.1654   0.1174  -0.7192

First three train y:
   0.4840
   0.1568
   0.8054

Creating and training model
Done

Model weights/coefficients:
 -0.2656   0.0333  -0.0454   0.0358  -0.1146
Model bias/intercept: 0.3619

Evaluating model

Train acc (within 0.10) = 0.4600
Test acc (within 0.10) = 0.6500

Train MSE = 0.0026
Test MSE = 0.0020

Predicting for x =
  -0.1660    0.4406   -0.9998   -0.3953   -0.7065
Predicted y = 0.5329

End demo

In theory, the weight values of a linear regression model can be solved for directly using the equation w = inv(X) * y, where w is a vector of the weights and the bias that you want to find, X is a “design matrix,” which is a matrix of the training predictor data with a leading column of 1.0 values prepended, inv() is the matrix inverse operation, * means matrix-to-vector multiplication, and y is a vector of the target y values. However, this equation won’t work because the matrix inverse applies only to square matrices (ones with the same number of rows and columns), and a design matrix is almost never square in machine learning scenarios.

One work-around is to use what’s called the relaxed Moore-Penrose pseudo-inverse. The math equation is w = pinv(X) * y. This technique works because the pseudo-inverse applies to a matrix of any shape. The technique is called relaxed because in a machine learning scenario, the pseudo-inverse implementation doesn’t have to fulfill all the mathematical requirements of a true MP pseudo-inverse.
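
To make the idea concrete, here is a minimal NumPy sketch of the technique (the tiny dataset and variable names here are made up for illustration, not from the article):

import numpy as np

# tiny made-up training data: 4 items, 2 predictors
X = np.array([[0.5, 1.2],
              [1.1, 0.3],
              [0.2, 0.8],
              [0.9, 0.6]])
y = np.array([0.7, 0.4, 0.6, 0.5])

# design matrix: prepend a leading column of 1.0 values
D = np.column_stack([np.ones(len(X)), X])

# w = pinv(D) * y -- the first element is the bias, the rest are weights
w = np.linalg.pinv(D) @ y
bias, weights = w[0], w[1:]

y_pred = X @ weights + bias  # predictions for the training items

The article demo implements the pseudo-inverse from scratch (via QR-Householder decomposition, per the demo output) rather than calling a library function.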

Implementing linear regression with relaxed MP pseudo-inverse training from scratch requires a bit of effort, but compared to using a library function, such as one from the Python scikit-learn library, it allows you to easily integrate a prediction model with other Web-oriented systems. A from-scratch implementation also allows you to easily modify the system. For example, you can add error checking that is relevant to your particular problem scenario, or you can add diagnostic console.log() statements to monitor training.



Linear regression is a classical machine learning technique. People in tubes is a classical science fiction movie technique.

Left: In “Lifeforce” (1985), a space crew discovers a huge derelict spacecraft that contains bodies in angular tubes. The crew brings the bodies back to Earth. Bad idea. Space vampires. Pretty good movie. I give it my personal B grade.

Center: In “Osiris” (2025), a team of U.S. Special Forces members is mysteriously abducted. They awaken in some pods on an alien spaceship. The aliens are bad. I give this movie a C+ grade.

Right: In “The Creation of the Humanoids” (1962), in the 23rd century, after a nuclear war, society has become dependent on robots. There are extremist human groups and extremist robot groups. The movie is slow and chatty. Some days I like the movie (grade B) and some days not (grade C-).


Posted in JavaScript, Machine Learning

NFL 2025 Season Week 24 – Zoltar Predictions Recap

Zoltar is my NFL football prediction computer program. It uses a neural network and a type of reinforcement learning. The NFL 2025 season ended on Sunday, February 8, 2026 when the Seattle Seahawks beat the New England Patriots by a score of 29-13 in Super Bowl LX (60).

Zoltar had a good, but not great, season. For the 272 NFL games played during the regular season (i.e., not including the 12 post-season games and the Super Bowl), Zoltar predictions were 46-30 (~60% accuracy) against the Las Vegas point spread. Including all 13 post-season games, Zoltar was 50-30 (~62% accuracy) against the spread.



This is what Zoltar looks like (2025 week #18 predictions)


Zoltar theoretically suggests betting when the Vegas line is “significantly” different from Zoltar’s prediction. For the beginning and end of the season, I use a relatively conservative threshold of 4.0 points difference. For the middle of the season, I use a relatively aggressive threshold of 3.0 points difference.

Theoretically, if you must bet $110 to win $100 (typical in Vegas), then you’ll make money if you predict at 53% accuracy or better (the exact break-even win rate is 110 / 210 = 52.4%). But realistically, you need to predict at 60% accuracy or better to take into account logistics and things like manual data entry errors.

Before the season started, I asked Zoltar to predict the number of wins each team would reach (out of the 17 regular season games). I figured I’d check those predictions against the actual number of wins by each of the 32 NFL teams:

              Predicted  Actual
      Team       Wins     Wins   Diff
---------------------------------------------------
      bears       5       11      +6
    bengals       8        6      -2
      bills      12       12       0
    broncos       9       14      +5

     browns       3        5      +2
 buccaneers      10        8      -2
  cardinals       7        3      -4
   chargers      11       11       0

     chiefs      13        6      -7
      colts       7        8      +1
    cowboys       7        7       0
   dolphins       7        7       0

     eagles      12       11      -1
    falcons       7        8      +1
fortyniners       7       12      +5
     giants       3        4      +1

    jaguars       5       13      +8
       jets       6        3      -3
      lions      13        9      -4
    packers      10        9      -1

   panthers       5        8      +3
   patriots       5       14      +9
    raiders       5        3      -2
       rams      10       12      +2

     ravens      11        9      -2
 commanders      11        5      -6
     saints       5        6      +1
   seahawks      10       14      +4

   steelers       9       10      +1
     texans       9       12      +3
     titans       3        3       0
    vikings      12        9      -3

Not too bad. Zoltar was correct within plus-or-minus 2 games for 18 of the 32 teams.

The two biggest positive surprises to Zoltar were the Jaguars (+8 wins) and Patriots (+9 wins).

The biggest negative surprises were the Chiefs (-7 wins) and Commanders (-6 wins).

Well, Zoltar is going into hibernation for the next seven months, until the 2026 season starts in September.



Left: My prediction system is named after the Zoltar fortune teller machine you can find in arcades. Zoltar machines often make appearances at conferences too.

Center and Right: I got a very cool mini-Zoltar machine for Christmas this year.


Posted in Zoltar

Matrix Pseudo-Inverse Using Transpose and Cholesky Inverse with C#

In ordinary arithmetic, the inverse of 5.0 is 0.20 because 5.0 * 0.20 = 1.0. In matrix algebra, if you have a matrix A, it has an inverse Ainv if A * Ainv = I, where * is matrix multiplication and I is the identity matrix (1.0s on the diagonal, 0.0s off the diagonal).

A matrix inverse exists only if the source matrix A is square (same number of rows and columns) and non-singular. But in machine learning, there are scenarios where you want to compute the inverse of a non-square matrix. The most common technique is called the Moore-Penrose pseudo-inverse.

The standard way to compute the pseudo-inverse uses singular value decomposition (SVD), an extremely complicated algorithm. I decided to explore an alternative technique that computes the pseudo-inverse using a matrix inverse based on Cholesky decomposition — still complicated, but significantly easier than SVD. The technique is usually called pseudo-inverse via normal equations.

If the source matrix is A, the pseudo-inverse can be computed as inv(At * A) * At, where At is the transpose of A, inv() is standard matrix inverse, and * is matrix multiplication. This technique works because At * A is a square matrix, and so standard matrix inverse can be applied.

The inv() part can be any one of over a dozen matrix inverse algorithms, each of which has many variations. But At * A has a special form: it is symmetric, and it becomes positive-definite if a small constant is added to its diagonal elements. For such a matrix, an algorithm called Cholesky decomposition can be used to compute the inverse. There are different ways to compute a Cholesky decomposition; I prefer the Banachiewicz algorithm.

To recap, one way to compute the pseudo-inverse of a matrix is the Moore-Penrose pseudo-inverse, which is usually computed using the blisteringly complex SVD algorithm. Another way to compute the pseudo-inverse of a matrix A is pseudo-inverse via normal equations, which can be computed as inv(At * A) * At, where the inv() can be a Cholesky inverse.

The disadvantage of the simpler normal equations approach is that At * A might be a huge matrix, in which case computing inv(At * A) might be too difficult.

I put together a demo using the C# language. The output of my demo is:

Begin pseudo-inverse using Cholesky

Source matrix:
  1.0  4.0  2.0
  6.0  0.0  3.0
  7.0  2.0  1.0
  5.0  9.0  8.0

Moore-Penrose pseudo-inverse:
   0.0047   0.0370   0.1331  -0.0317
   0.1306  -0.1946   0.1310   0.0239
  -0.1158   0.2113  -0.2393   0.1046

I verified the result using the Python language NumPy library linalg.pinv() function.
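
The NumPy check takes just a few lines. A sketch of what I mean (the matrix is the demo source matrix):

import numpy as np

A = np.array([[1.0, 4.0, 2.0],
              [6.0, 0.0, 3.0],
              [7.0, 2.0, 1.0],
              [5.0, 9.0, 8.0]])

print(np.linalg.pinv(A))             # reference Moore-Penrose pseudo-inverse

# same idea as the demo: pseudo-inverse via normal equations
print(np.linalg.inv(A.T @ A) @ A.T)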

The pseudo-inverse function code looks deceptively simple because there are lots of helper functions, and the MatInverseCholesky() function is especially complicated.

static double[][] MatPseudoInvCholesky(double[][] A)
{
  // inv(At * A) * At
  double[][] At = MatTranspose(A);
  double[][] AtA = MatProduct(At, A);  // is nxn
  for (int i = 0; i "lt" AtA.Length; ++i) // less-than
    AtA[i][i] += 1.0e-8; // condition the diagonal
  double[][] AtAinv = MatInverseCholesky(AtA);
  double[][] result = MatProduct(AtAinv, At);
  return result;

  // six nested helper functions go here
}

Good fun.



I wrote this blog post in December 2025, and then stashed it away until posting it. December 2025 was the 100th anniversary of one of the most famous chess tournaments in history — Moscow 1925.

The tournament had 21 players, including many of the most famous players of all time, who essentially created modern chess: current world champion J.R. Capablanca, former champion Emanuel Lasker, U.S. champion Frank J. Marshall, Richard Reti, Ernst Gruenfeld, Rudolf Spielmann, Akiba Rubinstein, and Friedrich Samish. The only top player missing was future world champion Alexander Alekhine, who fled the Soviet Union in 1921 and was persona non grata there.

The tournament was won by Soviet master Efim Bogoljubov, who was at the peak of his powers. Bogoljubov later played two matches for the World Chess Championship, losing to Alexander Alekhine in both 1929 and 1934.

A lot of the men are holding canes — in the 1920s, canes were both a fashion statement and an effective means of self-defense.


Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols (my blog editor often chokes on symbols).

using System;
using System.IO;
using System.Collections.Generic;

// Moore-Penrose pseudo-inverse using inv(At * A) * At
// where inv() is matrix inverse using Cholesky decomp

namespace MatrixPseudoInverseClosedCholesky
{
  internal class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin pseudo-inverse using " +
        "Cholesky ");

      double[][] A = new double[4][];
      A[0] = new double[] { 1.0, 4.0, 2.0 };
      A[1] = new double[] { 6.0, 0.0, 3.0 };
      A[2] = new double[] { 7.0, 2.0, 1.0 };
      A[3] = new double[] { 5.0, 9.0, 8.0 };

      Console.WriteLine("\nSource matrix: ");
      MatShow(A, 1, 5);

      double[][] Apinv = MatPseudoInvCholesky(A);
      Console.WriteLine("\nMoore-Penrose pseudo-inverse: ");
      MatShow(Apinv, 4, 9);

      Console.WriteLine("\nEnd demo ");
      Console.ReadLine();
    } // Main()

    static void MatShow(double[][] m, int dec, int wid)
    {
      for (int i = 0; i "lt" m.Length; ++i)
      {
        for (int j = 0; j "lt" m[0].Length; ++j)
        {
          double v = m[i][j];
          Console.Write(v.ToString("F" + dec).PadLeft(wid));
        }
        Console.WriteLine("");
      }
    }

    static double[][] MatPseudoInvCholesky(double[][] A)
    {
      // inv(At * A) * At
      double[][] At = MatTranspose(A);
      double[][] AtA = MatProduct(At, A);  // is nxn
      for (int i = 0; i "lt" AtA.Length; ++i)
        AtA[i][i] += 1.0e-8; // condition the diagonal
      double[][] AtAinv = MatInverseCholesky(AtA);
      double[][] result = MatProduct(AtAinv, At);
      return result;

      // nested helpers: MatMake(), MatTranspose(),
      // MatProduct(), MatDecompCholesky(),
      // MatInverseCholesky(), MatIdentity()

      static double[][] MatMake(int nr, int nc)
      {
        double[][] result = new double[nr][];
        for (int i = 0; i "lt" nr; ++i)
          result[i] = new double[nc];
        return result;
      }

      static double[][] MatTranspose(double[][] M)
      {
        int nr = M.Length; int nc = M[0].Length;
        double[][] result = MatMake(nc, nr);  // reversed
        for (int i = 0; i "lt" nr; ++i)
          for (int j = 0; j "lt" nc; ++j)
            result[j][i] = M[i][j];
        return result;
      }

      static double[][] MatProduct(double[][] A,
        double[][] B)
      {
        int aRows = A.Length;
        int aCols = A[0].Length;
        int bRows = B.Length;
        int bCols = B[0].Length;
        if (aCols != bRows)
          throw new Exception("Non-conformable matrices");

        double[][] result = MatMake(aRows, bCols);

        for (int i = 0; i "lt" aRows; ++i) // each row of A
          for (int j = 0; j "lt" bCols; ++j) // each col of B
            for (int k = 0; k "lt" aCols; ++k)
              result[i][j] += A[i][k] * B[k][j];

        return result;
      }

      static double[][] MatDecompCholesky(double[][] M)
      {
        // Cholesky decomposition (Banachiewicz algorithm)
        // M is nxn square, symmetric, positive definite
        int n = M.Length;
        double[][] result = MatMake(n, n);  // all 0.0
        for (int i = 0; i "lt" n; ++i)
        {
          for (int j = 0; j "lte" i; ++j)
          {
            double sum = 0.0;
            for (int k = 0; k "lt" j; ++k)
              sum += result[i][k] * result[j][k];
            if (i == j)
            {
              double tmp = M[i][i] - sum;
              if (tmp "lt" 0.0)
                throw new
                  Exception("MatDecompCholesky fatal");
              result[i][j] = Math.Sqrt(tmp);
            }
            else
            {
              if (result[j][j] == 0.0)
                throw new
                  Exception("MatDecompCholesky fatal ");
              result[i][j] =
                (1.0 / result[j][j] * (M[i][j] - sum));
            }
          } // j
        } // i
        return result;
      } // MatDecompCholesky

      static double[][] MatInverseCholesky(double[][] M)
      {
        double[][] L = MatDecompCholesky(M);  // lower tri
        int n = L.Length;
        double[][] result = MatIdentity(n);

        for (int k = 0; k "lt" n; ++k)
        {
          for (int j = 0; j "lt" n; j++)
          {
            for (int i = 0; i "lt" k; i++)
            {
              result[k][j] -= result[i][j] * L[k][i];
            }
            result[k][j] /= L[k][k];
          }
        }

        for (int k = n - 1; k "gte" 0; --k)
        {
          for (int j = 0; j "lt" n; j++)
          {
            for (int i = k + 1; i "lt" n; i++)
            {
              result[k][j] -= result[i][j] * L[i][k];
            }
            result[k][j] /= L[k][k];
          }
        }
        return result;
      }

      static double[][] MatIdentity(int n)
      {
        double[][] result = MatMake(n, n);
        for (int i = 0; i "lt" n; ++i)
          result[i][i] = 1.0;
        return result;
      }

    } // MatPseudoInvCholesky

  } // class Program

} // ns
Posted in Machine Learning

A Relatively Simple Approximation to Singular Value Decomposition Using QR Decomposition With C#

Before I start: The technique presented in this post is only an exploration of a way to compute SVD. The algorithm presented here is (possibly) not numerically stable, and is (probably) not suitable when high accuracy (beyond 8 decimals) is needed. And the algorithm is very slow compared to standard SVD algorithms. That said, for small matrices, the technique presented here is the simplest I’ve seen (but by no means simple) and appears to work well.

The problem is singular value decomposition (SVD). If you have a matrix A and decompose it using SVD, you get a matrix U, a vector s, and a matrix Vh, where A = U * S * Vh. The S is a square matrix with the s values on the diagonal. SVD has many important uses in machine learning, for example, computing a pseudo-inverse for training a linear model.

Unfortunately, accurately computing SVD is incredibly difficult. Many mathematicians worked for many years on SVD and there are dozens of solutions, each with pros and cons. Two common SVD algorithms are Jacobi and Golub-Kahan-Reinsch. Both algorithms are very complex. But it is possible to compute an approximation to SVD. Briefly, to approximate the SVD of a source matrix A, first compute the QR decomposition of A to get Q and R, and then compute the SVD of the R matrix.

This seems silly at first — using SVD to compute SVD — but the R result matrix of a QR decomposition is a square matrix, and it’s possible to find its SVD using QR in a relatively simple way. In high-level pseudo-code:

def my_svd(A):
  Q, R = my_qr(A, mode='reduced') # apply QR decomp
  U, s, Vh = my_svd_square(R) # apply QR decomp again
  U = np.matmul(Q, U)
  return U, s, Vh

The SVD function described above isn’t trivial, but it’s vastly less complicated than computing SVD using Jacobi or any other algorithm. The my_svd_square() helper function for a square matrix is also much simpler than computing SVD for a non-square matrix. In essence, the idea for approximating SVD is to break it down into two much simpler (but by no means easy) sub-problems that can be solved using QR decomposition.

A few days ago, I implemented the idea using Python. Just for fun, one rainy morning before work, I decided to port the demo code to the C# language. The output of a demo run:

Begin SVD using QR demo with C#

Source matrix A:
   1.0000   6.0000   8.0000   5.0000
   0.0000   2.0000   4.0000   6.0000
   9.0000   3.0000   7.0000   7.0000
  -1.0000  -3.0000  -5.0000  -7.0000
   7.0000   5.0000   3.0000   1.0000

Computing SVD using QR
Exited at iteration 20

U =
   0.490569   0.357946  -0.658897  -0.443928
   0.309393   0.418534   0.228973   0.339519
   0.618658  -0.402788   0.526177  -0.422092
  -0.402797  -0.389987  -0.198233  -0.465342
   0.344434  -0.618365  -0.444148   0.541249

s =
  21.095067   8.254694   4.718203   1.611445

Vh =
   0.420588   0.395767   0.594453   0.559554
  -0.872923  -0.017625   0.219636   0.435263
   0.247105  -0.750910  -0.214769   0.573539
   0.007029   0.528386  -0.743142   0.410485

Checking U * S * Vh:
   1.0000   6.0000   8.0000   5.0000
   0.0000   2.0000   4.0000   6.0000
   9.0000   3.0000   7.0000   7.0000
  -1.0000  -3.0000  -5.0000  -7.0000
   7.0000   5.0000   3.0000   1.0000

End demo

I verified my results using the NumPy library np.linalg.svd() function and got the same results — essentially. One of the hugely annoying details with SVD is that the signs of the values in the columns of U, and in the rows of Vh, are arbitrary. Singular values, by convention, are always positive or zero.
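
For reference, the NumPy verification looks something like the sketch below. The sign-flip code is one way (my choice, not the only one) to reconcile the arbitrary signs before comparing results:

import numpy as np

A = np.array([[ 1.0,  6.0,  8.0,  5.0],
              [ 0.0,  2.0,  4.0,  6.0],
              [ 9.0,  3.0,  7.0,  7.0],
              [-1.0, -3.0, -5.0, -7.0],
              [ 7.0,  5.0,  3.0,  1.0]])

U, s, Vh = np.linalg.svd(A, full_matrices=False)

# make the first entry of each column of U non-negative, and flip the
# corresponding row of Vh to compensate
signs = np.where(np.signbit(U[0, :]), -1.0, 1.0)
U = U * signs               # flips the chosen columns of U
Vh = Vh * signs[:, None]    # flips the matching rows of Vh

print(U); print(s); print(Vh)
print(U @ np.diag(s) @ Vh)  # reconstructs A regardless of the sign choice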

Well, all in all, this was a very challenging and very interesting exploration.



The history of machine learning is filled with incremental innovations. The history of pinball machines is filled with incremental innovations. A major advance occurred in 1953 when “Army Navy” by Williams introduced the first automatic reel scoring. Before this, fixed scores (1000 2000 etc.) were printed on the back glass and were illuminated as a player reached a point threshold. Somewhat surprisingly, the reel scoring feature did not catch on until a few years later when a high score reel was added.


Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols — my blog editor chokes on symbols.

using System;
using System.IO;
using System.Collections.Generic;

namespace Matrix_SVD_Using_QR
{
  internal class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin SVD using QR" +
        " demo with C# ");

      double[][] A = new double[5][];
      A[0] = new double[] { 1, 6, 8, 5 };
      A[1] = new double[] { 0, 2, 4, 6 };
      A[2] = new double[] { 9, 3, 7, 7 };
      A[3] = new double[] { -1, -3, -5, -7 };
      A[4] = new double[] { 7, 5, 3, 1 };

      Console.WriteLine("\nSource matrix A: ");
      MatShow(A, 4, 9);

      Console.WriteLine("\nComputing SVD using QR ");
      double[][] U; double[] s; double[][] Vh;
      MatSVD(A, out U, out s, out Vh);
      
      Console.WriteLine("\nU = ");
      MatShow(U, 6, 11);

      Console.WriteLine("\ns = ");
      VecShow(s, 6, 11);

      Console.WriteLine("\nVh = ");
      MatShow(Vh, 6, 11);

      Console.WriteLine("\nChecking U * S * Vh: ");
      double[][] S = MatMake(s.Length, s.Length);
      for (int i = 0; i "lt" s.Length; ++i)
        S[i][i] = s[i];
      double[][] C = MatProduct(U, MatProduct(S, Vh));
      MatShow(C, 4, 9);

      Console.WriteLine("\nEnd demo ");
      Console.ReadLine();
    }

    // ------------------------------------------------------

    private static void MatSVD(double[][] A,
      out double[][] U, out double[] s, out double[][] Vh)
    {
      // an approximation for SVD using QR
      // not practical for anything other than small matrices
      double[][] Q; double[][] R;
      MatDecomposeQR(A, out Q, out R, reduced:true);
      double[][] UU; double[] ss; double[][] VVh; // working
      MatSVD_Square(R, out UU, out ss, out VVh);
      UU = MatProduct(Q, UU);
      U = UU; s = ss; Vh = VVh;
    }

    // ------------------------------------------------------

    private static void MatSVD_Square(double[][] A,
      out double[][] U, out double[] s, out double[][] Vh)
    {
      // SVD for a square matrix using iterative QR
      int m = A.Length; int n = A[0].Length;
      if (m != n)
        throw new Exception("MatSVD_Square accepts" +
          " only square matrix ");

      int maxIter = 1000;
      double[][] R1 = MatCopy(A);
      double[][] UU = MatIdentity(m);
      double[][] VV = MatIdentity(m);

      double[][] Q1; double[][] Q2; double[][] R2;
      for (int i = 0; i "lt" maxIter; ++i)
      {
        MatDecomposeQR(R1, out Q1, out R1, reduced:true);
        MatDecomposeQR(MatTranspose(R1),
          out Q2, out R2, reduced:true);
        R1 = MatTranspose(R2);
        UU = MatProduct(UU, Q1);
        VV = MatProduct(VV, Q2);

        // exit when off-diag cells of R1 are all close to 0
        if (i % 10 == 0)
        {
          if (MatOffDiagAllZero(R1, 1.0e-8) == true)
          {
            Console.WriteLine("Exited at iteration " + i);
            break;
          }
        }
      } // i
      
      // deal with negative singular values (in R1)
      double[] ss = new double[m];
      for (int j = 0; j "lt" m; ++j)
        ss[j] = R1[j][j];
      double[][] VVh = MatTranspose(VV);
      for (int i = 0; i "lt" m; ++i)
      {
        if (ss[i] "lt" 0.0)
        {
          ss[i] = -ss[i]; // sing vals must be non-neg
          // adjust row of VVh
          for (int j = 0; j "lt" m; ++j)
            VVh[i][j] = -VVh[i][j];
        }
      }

      U = UU; s = ss; Vh = VVh;
    } // MatSVD_Square()

    // ------------------------------------------------------

    private static bool MatOffDiagAllZero(double[][] A,
      double tol)
    {
      int nr = A.Length; int nc = A[0].Length;
      for (int i = 0; i "lt" nr; ++i)
      {
        for (int j = 0; j "lt" nc; ++j)
        {
          if (i == j) continue; // no check diag
          if (Math.Abs(A[i][j]) "gt" tol)
            return false;
        }
      }
      return true;
    }

    // ------------------------------------------------------

    private static void MatDecomposeQR(double[][] M,
      out double[][] Q, out double[][] R, bool reduced)
    {
      // QR decomposition, Householder algorithm.
      // see rosettacode.org/wiki/QR_decomposition
      int m = M.Length;
      int n = M[0].Length;

      if (m "lt" n)
        throw new Exception("No rows less than cols");

      double[][] QQ = MatIdentity(m); // working Q
      double[][] RR = MatCopy(M); // working R

      int end;
      if (m == n) end = n - 1;
      else end = n;

      for (int i = 0; i "lt" end; ++i)
      {
        double[][] H = MatIdentity(m);
        double[] a = new double[m - i]; // corr
        int k = 0;
        for (int ii = i; ii "lt" m; ++ii) // corr
          a[k++] = RR[ii][i];

        double normA = VecNorm(a);
        if (a[0] "lt" 0.0 && normA "gt" 0.0) // corr
          normA = -normA;
        else if (a[0] "gt" 0.0 && normA "lt" 0.0)
          normA = -normA;

        double[] v = new double[a.Length];
        for (int j = 0; j "lt" v.Length; ++j)
          v[j] = a[j] / (a[0] + normA);
        v[0] = 1.0;

        // Householder algorithm
        double[][] h = MatIdentity(a.Length);
        double vvDot = VecDot(v, v);
        double[][] A = VecToMat(v, v.Length, 1);
        double[][] B = VecToMat(v, 1, v.Length);
        double[][] AB = MatProduct(A, B);

        for (int ii = 0; ii "lt" h.Length; ++ii)
          for (int jj = 0; jj "lt" h[0].Length; ++jj)
            h[ii][jj] -= (2.0 / vvDot) * AB[ii][jj];

        // copy h[][] into lower right corner of H[][]
        int d = m - h.Length; // corr
        for (int ii = 0; ii "lt" h.Length; ++ii)
          for (int jj = 0; jj "lt" h[0].Length; ++jj)
            H[ii + d][jj + d] = h[ii][jj];

        QQ = MatProduct(QQ, H);
        RR = MatProduct(H, RR);
      } // i

      if (reduced == false)
      {
        Q = QQ; // working results into the out params
        R = RR;
        return;
      }
      //else if (reduced == true)
      {
        int qRows = QQ.Length; int qCols = QQ[0].Length;
        int rRows = RR.Length; int rCols = RR[0].Length;
        // assumes m "gte" n !!

        // square-up R
        int dim = Math.Min(rRows, rCols);
        double[][] Rsquared = MatMake(dim, dim);
        for (int i = 0; i "lt" dim; ++i)
          for (int j = 0; j "lt" dim; ++j)
            Rsquared[i][j] = RR[i][j];

        // Q needs same number columns as R
        // so that inv(R) * trans(Q) works
        double[][] Qtrimmed = MatMake(qRows, dim);
        for (int i = 0; i "lt" qRows; ++i)
          for (int j = 0; j "lt" dim; ++j)
            Qtrimmed[i][j] = QQ[i][j];

        Q = Qtrimmed;
        R = Rsquared;
        return;
      }
    } // MatDecomposeQR()

     // ------------------------------------------------------

    private static double[][] MatCopy(double[][] M)
    {
      int nr = M.Length; int nc = M[0].Length;
      double[][] result = MatMake(nr, nc);
      for (int i = 0; i "lt" nr; ++i)
        for (int j = 0; j "lt" nc; ++j)
          result[i][j] = M[i][j];
      return result;
    }

    // ------------------------------------------------------

    private static double[][] MatTranspose(double[][] m)
    {
      int nr = m.Length;
      int nc = m[0].Length;
      double[][] result = MatMake(nc, nr);  // note
      for (int i = 0; i "lt" nr; ++i)
        for (int j = 0; j "lt" nc; ++j)
          result[j][i] = m[i][j];
      return result;
    }

    // ------------------------------------------------------

    private static double[][] MatMake(int nRows, int nCols)
    {
      double[][] result = new double[nRows][];
      for (int i = 0; i "lt" nRows; ++i)
        result[i] = new double[nCols];
      return result;
    }

    // ------------------------------------------------------

    private static double[][] MatIdentity(int n)
    {
      double[][] result = MatMake(n, n);
      for (int i = 0; i "lt" n; ++i)
        result[i][i] = 1.0;
      return result;
    }

    // ------------------------------------------------------

    private static double[][] MatProduct(double[][] matA,
      double[][] matB)
    {
      int aRows = matA.Length;
      int aCols = matA[0].Length;
      int bRows = matB.Length;
      int bCols = matB[0].Length;
      if (aCols != bRows)
        throw new Exception("Non-conformable matrices");

      double[][] result = MatMake(aRows, bCols);

      for (int i = 0; i "lt" aRows; ++i) // each row of A
        for (int j = 0; j "lt" bCols; ++j) // each col of B
          for (int k = 0; k "lt" aCols; ++k)
            result[i][j] += matA[i][k] * matB[k][j];

      return result;
    }

    // ------------------------------------------------------

    private static double VecNorm(double[] vec)
    {
      int n = vec.Length;
      double sum = 0.0;
      for (int i = 0; i "lt" n; ++i)
        sum += vec[i] * vec[i];
      return Math.Sqrt(sum);
    }

    // ------------------------------------------------------

    private static double[][] VecToMat(double[] vec,
        int nRows, int nCols)
    {
      double[][] result = MatMake(nRows, nCols);
      int k = 0;
      for (int i = 0; i "lt" nRows; ++i)
        for (int j = 0; j "lt" nCols; ++j)
          result[i][j] = vec[k++];
      return result;
    }

    // ------------------------------------------------------

    private static double VecDot(double[] v1, double[] v2)
    {
      double result = 0.0;
      int n = v1.Length;
      for (int i = 0; i "lt" n; ++i)
        result += v1[i] * v2[i];
      return result;
    }

    // ------------------------------------------------------

    private static void MatShow(double[][] M, int dec,
      int wid)
    {
      for (int i = 0; i "lt" M.Length; ++i)
      {
        for (int j = 0; j "lt" M[0].Length; ++j)
        {
          double v = M[i][j];
          Console.Write(v.ToString("F" + dec).
            PadLeft(wid));
        }
        Console.WriteLine("");
      }
    }

    // ------------------------------------------------------

    private static void VecShow(double[] vec,
      int dec, int wid)
    {
      for (int i = 0; i "lt" vec.Length; ++i)
        Console.Write(vec[i].ToString("F" + dec).
          PadLeft(wid));
      Console.WriteLine("");
    }

  } // Program
} // ns
Posted in Machine Learning

Example of Kernel Ridge Regression Using the scikit-learn Library

I was working on a project where I was implementing kernel ridge regression (KRR) from scratch, using the C# language. KRR is tricky so I figured I’d run my demo data through the scikit-learn KernelRidge module so I could verify my C# implementation results.

KRR is difficult to explain. KRR uses a kernel function that compares two vectors and gives a measure of similarity between 0.0 (no similarity) and 1.0 (the vectors are the same). The most common kernel function is the radial basis function (RBF). The RBF kernel requires a value for a parameter gamma (and confusingly, there’s a different version of RBF that requires a value for a parameter sigma instead). All KRR models, regardless of the kernel function used, require a small alpha constant that deters overfitting. The kernel gamma and the model alpha must be determined by trial and error.

A KRR model has one weight per training item. My demo data has 200 training items, so there are 200 weights. To make a prediction for an input vector x, KRR computes a weighted sum of the kernel function applied to x and each training item.
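
In code, the prediction mechanism is short. Here is a minimal sketch of the idea, assuming an RBF kernel (the gamma version) and weights that have already been computed by training (the function names here are mine, not scikit's):

import numpy as np

def rbf_kernel(x1, x2, gamma):
  # similarity between 0.0 and 1.0; equals 1.0 when x1 == x2
  return np.exp(-gamma * np.sum((x1 - x2) ** 2))

def krr_predict(x, train_X, weights, gamma):
  # weighted sum of kernel values between x and every training item
  return sum(w * rbf_kernel(x, xi, gamma)
    for (w, xi) in zip(weights, train_X))

I believe this corresponds to how the scikit KernelRidge module applies the dual_coef_ weights shown in the demo output below.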

My demo data is synthetic. It looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
. . .

The first five values on each line are predictors. The last value on each line is the target y value to predict. There are 200 training items and 40 test items.

The output of the scikit KRR demo is:

Begin kernel ridge regression using scikit

Loading train (200) and test (40) data

Creating KRR model RBF gamma = 0.3, alpha = 0.005
Done

Model weights:
[-2.0218 -1.1406  0.0758 -0.6265  0.5722 -0.9905  0.6912
  0.4807  0.6496 -0.7364  1.2163 -0.5387  1.4276  0.0517
  1.0412  0.3933  0.2223  0.0564  0.4282  0.2780  0.9860
 -0.5630 -1.4695  0.1453 -1.2380  0.0059 -1.3257 -0.8750
  0.3530 -0.3127  0.7831 -0.0176  3.1105 -0.3441 -0.1874
 -0.9381 -1.0774 -1.5558 -1.6556  0.6567 -0.8531  1.4415
 -1.8095 -1.7655  0.3510 -0.7398 -1.6442  0.3615 -1.6873
  0.5667 -0.8786 -0.1819 -0.1024 -0.5156  0.5875 -0.1200
  1.7779 -0.7825 -1.5794 -0.0976  0.7271 -0.1227 -0.7527
  0.2934 -0.5738 -1.2267 -1.0484  0.7964 -0.3300 -1.5897
 -0.3434 -0.5869  0.5613  1.8904 -0.1202  1.1940 -1.0744
  0.0246 -0.5136 -0.2046  0.1461  1.7474  0.6251  0.8854
  0.9305  1.7007 -0.0030 -1.8766  1.1218 -1.6448  1.2075
  2.8511  3.5556 -1.0008 -1.3832  0.0897 -0.8997 -0.4007
 -0.5953 -0.5032  0.8405 -1.5877  0.6787 -0.5144  0.0824
 -0.0346 -0.6607  0.0076  0.0406  3.6744 -0.1402 -0.5089
 -1.4806  0.3451 -0.2460 -0.9033 -0.1536  0.7088  0.2355
  0.3538 -0.8437  0.1532  0.8644  1.9891  2.3591 -0.8606
 -0.8545  1.2116  0.5442 -0.3250 -0.4847 -0.6029 -0.0872
  1.5692 -0.2556 -1.4743  1.0171  3.7093  0.4378 -0.8627
  1.8371  0.3855  0.0130 -2.0406  0.9217  0.1018 -2.5532
 -1.0969  1.2241  0.2609 -0.1100 -0.1464  1.6083 -1.2871
  0.8604  0.5754 -0.3649  1.2258  0.4378 -2.9902  0.0453
  0.3650 -0.5867 -1.8827  1.3220  0.6443 -0.6158  0.7165
 -0.1986  1.5692  1.1419  0.6714  0.4306 -0.3582  1.1262
 -0.1465  0.3204 -0.3066 -0.3255  1.9478  2.0118 -1.3593
 -0.7194  2.7442 -0.4100 -1.3653 -2.0545 -0.8568 -0.7523
  1.9512  0.3181 -0.1053 -0.8341  0.8603  0.4775 -0.2014
 -1.6270 -0.5825 -0.0487  1.2897]

Accuracy (within 0.10) train = 0.9950
Accuracy (within 0.10) test = 0.9500

MSE train = 0.0000
MSE test = 0.0002

Predicting for x =
[-0.1660  0.4406 -0.9998 -0.3953 -0.7065]
Predicted y = 0.4941

End demo

I implemented an accuracy() function that scores a prediction as correct if it’s within 10% of the true target value. The accuracy and MSE values for the KRR model are very good — among the best I’ve seen for the synthetic dataset I used.



Machine learning regression techniques are black-and-white in the sense that they’re objective. And I find great beauty in regression models. But I don’t understand subjective things like fashion photography. Here are three images from the Internet. They’re all the same model but I can’t evaluate if the photos are nice or not. By the way, I actually know the model who posed for these photos, and she knows quite a bit about machine learning.


Demo program. Replace “lt” (less than) in the accuracy() function with Boolean operator symbol.

# krr_scikit.py
# kernel ridge regression demo

import numpy as np
from sklearn.kernel_ridge import KernelRidge

np.set_printoptions(precision=4, suppress=True,
  floatmode='fixed', linewidth=60)

# -----------------------------------------------------------

def accuracy(model, data_X, data_y, pct_close):
  n = len(data_X)
  n_correct = 0; n_wrong = 0
  for i in range(n):
    x = data_X[i].reshape(1,-1)
    y = data_y[i]
    y_pred = model.predict(x)

    if np.abs(y - y_pred) "lt" np.abs(y * pct_close):
      n_correct += 1
    else: 
      n_wrong += 1
  # print("Correct = " + str(n_correct))
  # print("Wrong   = " + str(n_wrong))
  return n_correct / (n_correct + n_wrong)

def mse(model, data_X, data_y):
  n = len(data_X)
  sum = 0.0
  for i in range(n):
    actual_y = data_y[i]
    pred_y = model.predict(data_X[i].reshape(1, -1))
    diff = actual_y - pred_y
    sum += diff * diff
  return sum /n

# -----------------------------------------------------------

print("\nBegin kernel ridge regression using scikit ")

print("\nLoading train (200) and test (40) data ")
train_X = np.loadtxt(".\\Data\\synthetic_train_200.txt",
  usecols=[0,1,2,3,4], delimiter=",")
train_y = np.loadtxt(".\\Data\\synthetic_train_200.txt",
  usecols=5, delimiter=",")
test_X = np.loadtxt(".\\Data\\synthetic_test_40.txt",
  usecols=[0,1,2,3,4], delimiter=",")
test_y = np.loadtxt(".\\Data\\synthetic_test_40.txt",
  usecols=5, delimiter=",")

print("\nCreating KRR model RBF gamma = 0.3, alpha = 0.005 ")
krr = KernelRidge(alpha=0.005, kernel='rbf', gamma=0.3)
krr.fit(train_X, train_y)
print("Done ")

print("\nModel weights: ")
print(krr.dual_coef_)

acc_train = accuracy(krr, train_X, train_y, 0.10)
acc_test = accuracy(krr, test_X, test_y, 0.10)
print("\nAccuracy (withhin 0.10) train = %0.4f " % \
  acc_train)
print("Accuracy (withhin 0.10) test = %0.4f " % \
  acc_test)

mse_train = mse(krr, train_X, train_y)
mse_test = mse(krr, test_X, test_y)
print("\nMSE train = %0.4f " % mse_train)
print("MSE test = %0.4f " % mse_test)

x = train_X[0]
print("\nPredicting for x =")
print(x)
pred_y = krr.predict(x.reshape(1,-1))
print("Predicted y = %0.4f " % pred_y)

print("\nEnd demo ")

Training data:

# synthetic_train_200.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399
-0.4858, -0.7782, -0.6141, -0.0008,  0.4572,  0.4197
 0.7033, -0.1683,  0.2334, -0.5327, -0.7961,  0.1776
 0.0317, -0.0457, -0.6947,  0.2436,  0.0880,  0.3345
 0.5031, -0.5559,  0.0387,  0.5706, -0.9553,  0.3107
-0.3513,  0.7458,  0.6894,  0.0769,  0.7332,  0.3170
 0.2205,  0.5992, -0.9309,  0.5405,  0.4635,  0.3532
-0.4806, -0.4859,  0.2646, -0.3094,  0.5932,  0.3202
 0.9809, -0.3995, -0.7140,  0.8026,  0.0831,  0.1600
 0.9495,  0.2732,  0.9878,  0.0921,  0.0529,  0.1289
-0.9476, -0.6792,  0.4913, -0.9392, -0.2669,  0.5966
 0.7247,  0.3854,  0.3819, -0.6227, -0.1162,  0.1550
-0.5922, -0.5045, -0.4757,  0.5003, -0.0860,  0.5863
-0.8861,  0.0170, -0.5761,  0.5972, -0.4053,  0.7301
 0.6877, -0.2380,  0.4997,  0.0223,  0.0819,  0.1404
 0.9189,  0.6079, -0.9354,  0.4188, -0.0700,  0.1907
-0.1428, -0.7820,  0.2676,  0.6059,  0.3936,  0.2790
 0.5324, -0.3151,  0.6917, -0.1425,  0.6480,  0.1071
-0.8432, -0.9633, -0.8666, -0.0828, -0.7733,  0.7784
-0.9444,  0.5097, -0.2103,  0.4939, -0.0952,  0.6787
-0.0520,  0.6063, -0.1952,  0.8094, -0.9259,  0.4836
 0.5477, -0.7487,  0.2370, -0.9793,  0.0773,  0.1241
 0.2450,  0.8116,  0.9799,  0.4222,  0.4636,  0.2355
 0.8186, -0.1983, -0.5003, -0.6531, -0.7611,  0.1511
-0.4714,  0.6382, -0.3788,  0.9648, -0.4667,  0.5950
 0.0673, -0.3711,  0.8215, -0.2669, -0.1328,  0.2677
-0.9381,  0.4338,  0.7820, -0.9454,  0.0441,  0.5518
-0.3480,  0.7190,  0.1170,  0.3805, -0.0943,  0.4724
-0.9813,  0.1535, -0.3771,  0.0345,  0.8328,  0.5438
-0.1471, -0.5052, -0.2574,  0.8637,  0.8737,  0.3042
-0.5454, -0.3712, -0.6505,  0.2142, -0.1728,  0.5783
 0.6327, -0.6297,  0.4038, -0.5193,  0.1484,  0.1153
-0.5424,  0.3282, -0.0055,  0.0380, -0.6506,  0.6613
 0.1414,  0.9935,  0.6337,  0.1887,  0.9520,  0.2540
-0.9351, -0.8128, -0.8693, -0.0965, -0.2491,  0.7353
 0.9507, -0.6640,  0.9456,  0.5349,  0.6485,  0.1059
-0.0462, -0.9737, -0.2940, -0.0159,  0.4602,  0.2606
-0.0627, -0.0852, -0.7247, -0.9782,  0.5166,  0.2977
 0.0478,  0.5098, -0.0723, -0.7504, -0.3750,  0.3335
 0.0090,  0.3477,  0.5403, -0.7393, -0.9542,  0.4415
-0.9748,  0.3449,  0.3736, -0.1015,  0.8296,  0.4358
 0.2887, -0.9895, -0.0311,  0.7186,  0.6608,  0.2057
 0.1570, -0.4518,  0.1211,  0.3435, -0.2951,  0.3244
 0.7117, -0.6099,  0.4946, -0.4208,  0.5476,  0.1096
-0.2929, -0.5726,  0.5346, -0.3827,  0.4665,  0.2465
 0.4889, -0.5572, -0.5718, -0.6021, -0.7150,  0.2163
-0.7782,  0.3491,  0.5996, -0.8389, -0.5366,  0.6516
-0.5847,  0.8347,  0.4226,  0.1078, -0.3910,  0.6134
 0.8469,  0.4121, -0.0439, -0.7476,  0.9521,  0.1571
-0.6803, -0.5948, -0.1376, -0.1916, -0.7065,  0.7156
 0.2878,  0.5086, -0.5785,  0.2019,  0.4979,  0.2980
 0.2764,  0.1943, -0.4090,  0.4632,  0.8906,  0.2960
-0.8877,  0.6705, -0.6155, -0.2098, -0.3998,  0.7107
-0.8398,  0.8093, -0.2597,  0.0614, -0.0118,  0.6502
-0.8476,  0.0158, -0.4769, -0.2859, -0.7839,  0.7715
 0.5751, -0.7868,  0.9714, -0.6457,  0.1448,  0.1175
 0.4802, -0.7001,  0.1022, -0.5668,  0.5184,  0.1090
 0.4458, -0.6469,  0.7239, -0.9604,  0.7205,  0.0779
 0.5175,  0.4339,  0.9747, -0.4438, -0.9924,  0.2879
 0.8678,  0.7158,  0.4577,  0.0334,  0.4139,  0.1678
 0.5406,  0.5012,  0.2264, -0.1963,  0.3946,  0.2088
-0.9938,  0.5498,  0.7928, -0.5214, -0.7585,  0.7687
 0.7661,  0.0863, -0.4266, -0.7233, -0.4197,  0.1466
 0.2277, -0.3517, -0.0853, -0.1118,  0.6563,  0.1767
 0.3499, -0.5570, -0.0655, -0.3705,  0.2537,  0.1632
 0.7547, -0.1046,  0.5689, -0.0861,  0.3125,  0.1257
 0.8186,  0.2110,  0.5335,  0.0094, -0.0039,  0.1391
 0.6858, -0.8644,  0.1465,  0.8855,  0.0357,  0.1845
-0.4967,  0.4015,  0.0805,  0.8977,  0.2487,  0.4663
 0.6760, -0.9841,  0.9787, -0.8446, -0.3557,  0.1509
-0.1203, -0.4885,  0.6054, -0.0443, -0.7313,  0.4854
 0.8557,  0.7919, -0.0169,  0.7134, -0.1628,  0.2002
 0.0115, -0.6209,  0.9300, -0.4116, -0.7931,  0.4052
-0.7114, -0.9718,  0.4319,  0.1290,  0.5892,  0.3661
 0.3915,  0.5557, -0.1870,  0.2955, -0.6404,  0.2954
-0.3564, -0.6548, -0.1827, -0.5172, -0.1862,  0.4622
 0.2392, -0.4959,  0.5857, -0.1341, -0.2850,  0.2470
-0.3394,  0.3947, -0.4627,  0.6166, -0.4094,  0.5325
 0.7107,  0.7768, -0.6312,  0.1707,  0.7964,  0.2757
-0.1078,  0.8437, -0.4420,  0.2177,  0.3649,  0.4028
-0.3139,  0.5595, -0.6505, -0.3161, -0.7108,  0.5546
 0.4335,  0.3986,  0.3770, -0.4932,  0.3847,  0.1810
-0.2562, -0.2894, -0.8847,  0.2633,  0.4146,  0.4036
 0.2272,  0.2966, -0.6601, -0.7011,  0.0284,  0.2778
-0.0743, -0.1421, -0.0054, -0.6770, -0.3151,  0.3597
-0.4762,  0.6891,  0.6007, -0.1467,  0.2140,  0.4266
-0.4061,  0.7193,  0.3432,  0.2669, -0.7505,  0.6147
-0.0588,  0.9731,  0.8966,  0.2902, -0.6966,  0.4955
-0.0627, -0.1439,  0.1985,  0.6999,  0.5022,  0.3077
 0.1587,  0.8494, -0.8705,  0.9827, -0.8940,  0.4263
-0.7850,  0.2473, -0.9040, -0.4308, -0.8779,  0.7199
 0.4070,  0.3369, -0.2428, -0.6236,  0.4940,  0.2215
-0.0242,  0.0513, -0.9430,  0.2885, -0.2987,  0.3947
-0.5416, -0.1322, -0.2351, -0.0604,  0.9590,  0.3683
 0.1055,  0.7783, -0.2901, -0.5090,  0.8220,  0.2984
-0.9129,  0.9015,  0.1128, -0.2473,  0.9901,  0.4776
-0.9378,  0.1424, -0.6391,  0.2619,  0.9618,  0.5368
 0.7498, -0.0963,  0.4169,  0.5549, -0.0103,  0.1614
-0.2612, -0.7156,  0.4538, -0.0460, -0.1022,  0.3717
 0.7720,  0.0552, -0.1818, -0.4622, -0.8560,  0.1685
-0.4177,  0.0070,  0.9319, -0.7812,  0.3461,  0.3052
-0.0001,  0.5542, -0.7128, -0.8336, -0.2016,  0.3803
 0.5356, -0.4194, -0.5662, -0.9666, -0.2027,  0.1776
-0.2378,  0.3187, -0.8582, -0.6948, -0.9668,  0.5474
-0.1947, -0.3579,  0.1158,  0.9869,  0.6690,  0.2992
 0.3992,  0.8365, -0.9205, -0.8593, -0.0520,  0.3154
-0.0209,  0.0793,  0.7905, -0.1067,  0.7541,  0.1864
-0.4928, -0.4524, -0.3433,  0.0951, -0.5597,  0.6261
-0.8118,  0.7404, -0.5263, -0.2280,  0.1431,  0.6349
 0.0516, -0.8480,  0.7483,  0.9023,  0.6250,  0.1959
-0.3212,  0.1093,  0.9488, -0.3766,  0.3376,  0.2735
-0.3481,  0.5490, -0.3484,  0.7797,  0.5034,  0.4379
-0.5785, -0.9170, -0.3563, -0.9258,  0.3877,  0.4121
 0.3407, -0.1391,  0.5356,  0.0720, -0.9203,  0.3458
-0.3287, -0.8954,  0.2102,  0.0241,  0.2349,  0.3247
-0.1353,  0.6954, -0.0919, -0.9692,  0.7461,  0.3338
 0.9036, -0.8982, -0.5299, -0.8733, -0.1567,  0.1187
 0.7277, -0.8368, -0.0538, -0.7489,  0.5458,  0.0830
 0.9049,  0.8878,  0.2279,  0.9470, -0.3103,  0.2194
 0.7957, -0.1308, -0.5284,  0.8817,  0.3684,  0.2172
 0.4647, -0.4931,  0.2010,  0.6292, -0.8918,  0.3371
-0.7390,  0.6849,  0.2367,  0.0626, -0.5034,  0.7039
-0.1567, -0.8711,  0.7940, -0.5932,  0.6525,  0.1710
 0.7635, -0.0265,  0.1969,  0.0545,  0.2496,  0.1445
 0.7675,  0.1354, -0.7698, -0.5460,  0.1920,  0.1728
-0.5211, -0.7372, -0.6763,  0.6897,  0.2044,  0.5217
 0.1913,  0.1980,  0.2314, -0.8816,  0.5006,  0.1998
 0.8964,  0.0694, -0.6149,  0.5059, -0.9854,  0.1825
 0.1767,  0.7104,  0.2093,  0.6452,  0.7590,  0.2832
-0.3580, -0.7541,  0.4426, -0.1193, -0.7465,  0.5657
-0.5996,  0.5766, -0.9758, -0.3933, -0.9572,  0.6800
 0.9950,  0.1641, -0.4132,  0.8579,  0.0142,  0.2003
-0.4717, -0.3894, -0.2567, -0.5111,  0.1691,  0.4266
 0.3917, -0.8561,  0.9422,  0.5061,  0.6123,  0.1212
-0.0366, -0.1087,  0.3449, -0.1025,  0.4086,  0.2475
 0.3633,  0.3943,  0.2372, -0.6980,  0.5216,  0.1925
-0.5325, -0.6466, -0.2178, -0.3589,  0.6310,  0.3568
 0.2271,  0.5200, -0.1447, -0.8011, -0.7699,  0.3128
 0.6415,  0.1993,  0.3777, -0.0178, -0.8237,  0.2181
-0.5298, -0.0768, -0.6028, -0.9490,  0.4588,  0.4356
 0.6870, -0.1431,  0.7294,  0.3141,  0.1621,  0.1632
-0.5985,  0.0591,  0.7889, -0.3900,  0.7419,  0.2945
 0.3661,  0.7984, -0.8486,  0.7572, -0.6183,  0.3449
 0.6995,  0.3342, -0.3113, -0.6972,  0.2707,  0.1712
 0.2565,  0.9126,  0.1798, -0.6043, -0.1413,  0.2893
-0.3265,  0.9839, -0.2395,  0.9854,  0.0376,  0.4770
 0.2690, -0.1722,  0.9818,  0.8599, -0.7015,  0.3954
-0.2102, -0.0768,  0.1219,  0.5607, -0.0256,  0.3949
 0.8216, -0.9555,  0.6422, -0.6231,  0.3715,  0.0801
-0.2896,  0.9484, -0.7545, -0.6249,  0.7789,  0.4370
-0.9985, -0.5448, -0.7092, -0.5931,  0.7926,  0.5402

Test data:

# synthetic_test_40.txt
#
 0.7462,  0.4006, -0.0590,  0.6543, -0.0083,  0.1935
 0.8495, -0.2260, -0.0142, -0.4911,  0.7699,  0.1078
-0.2335, -0.4049,  0.4352, -0.6183, -0.7636,  0.5088
 0.1810, -0.5142,  0.2465,  0.2767, -0.3449,  0.3136
-0.8650,  0.7611, -0.0801,  0.5277, -0.4922,  0.7140
-0.2358, -0.7466, -0.5115, -0.8413, -0.3943,  0.4533
 0.4834,  0.2300,  0.3448, -0.9832,  0.3568,  0.1360
-0.6502, -0.6300,  0.6885,  0.9652,  0.8275,  0.3046
-0.3053,  0.5604,  0.0929,  0.6329, -0.0325,  0.4756
-0.7995,  0.0740, -0.2680,  0.2086,  0.9176,  0.4565
-0.2144, -0.2141,  0.5813,  0.2902, -0.2122,  0.4119
-0.7278, -0.0987, -0.3312, -0.5641,  0.8515,  0.4438
 0.3793,  0.1976,  0.4933,  0.0839,  0.4011,  0.1905
-0.8568,  0.9573, -0.5272,  0.3212, -0.8207,  0.7415
-0.5785,  0.0056, -0.7901, -0.2223,  0.0760,  0.5551
 0.0735, -0.2188,  0.3925,  0.3570,  0.3746,  0.2191
 0.1230, -0.2838,  0.2262,  0.8715,  0.1938,  0.2878
 0.4792, -0.9248,  0.5295,  0.0366, -0.9894,  0.3149
-0.4456,  0.0697,  0.5359, -0.8938,  0.0981,  0.3879
 0.8629, -0.8505, -0.4464,  0.8385,  0.5300,  0.1769
 0.1995,  0.6659,  0.7921,  0.9454,  0.9970,  0.2330
-0.0249, -0.3066, -0.2927, -0.4923,  0.8220,  0.2437
 0.4513, -0.9481, -0.0770, -0.4374, -0.9421,  0.2879
-0.3405,  0.5931, -0.3507, -0.3842,  0.8562,  0.3987
 0.9538,  0.0471,  0.9039,  0.7760,  0.0361,  0.1706
-0.0887,  0.2104,  0.9808,  0.5478, -0.3314,  0.4128
-0.8220, -0.6302,  0.0537, -0.1658,  0.6013,  0.4306
-0.4123, -0.2880,  0.9074, -0.0461, -0.4435,  0.5144
 0.0060,  0.2867, -0.7775,  0.5161,  0.7039,  0.3599
-0.7968, -0.5484,  0.9426, -0.4308,  0.8148,  0.2979
 0.7811,  0.8450, -0.6877,  0.7594,  0.2640,  0.2362
-0.6802, -0.1113, -0.8325, -0.6694, -0.6056,  0.6544
 0.3821,  0.1476,  0.7466, -0.5107,  0.2592,  0.1648
 0.7265,  0.9683, -0.9803, -0.4943, -0.5523,  0.2454
-0.9049, -0.9797, -0.0196, -0.9090, -0.4433,  0.6447
-0.4607,  0.1811, -0.2389,  0.4050, -0.0078,  0.5229
 0.2664, -0.2932, -0.4259, -0.7336,  0.8742,  0.1834
-0.4507,  0.1029, -0.6294, -0.1158, -0.6294,  0.6081
 0.8948, -0.0124,  0.9278,  0.2899, -0.0314,  0.1534
-0.1323, -0.8813, -0.0146, -0.0697,  0.6135,  0.2386
Posted in Machine Learning, Scikit

Gradient Boosting Regression from Scratch Using C# with List-Storage, No-Pointers, No-Recursion, MSE-Split Decision Trees

It’s not easy to explain the topic of this blog post. Bear with me.

The goal of a machine learning regression problem is to predict a single numeric value. For example, predicting the bank account balance of a person based on his annual income, age, years of education, and so on.

There are about a dozen common regression techniques. One of the most powerful, but most complicated, techniques is gradient boosting regression. Gradient boosting regression uses a collection of decision trees that are constructed sequentially. There are many ways to implement a decision tree: list storage vs. pointer storage, recursive construction vs. stack-based construction, maximizing variance reduction splits vs. minimizing quasi-MSE splits, and more.
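
Regardless of how the underlying decision trees are implemented, the gradient boosting training loop has the same high-level shape. Here is a rough Python sketch, using a hypothetical RegressionTree class (not the class names in my demo), with the usual convention of residual = target minus current prediction:

# high-level sketch of gradient boosting regression training
def gbr_train(train_X, train_y, num_trees, lrn_rate, max_depth):
  base = sum(train_y) / len(train_y)   # initial prediction = mean of targets
  preds = [base] * len(train_y)
  trees = []
  for _ in range(num_trees):
    residuals = [y - p for (y, p) in zip(train_y, preds)]
    tree = RegressionTree(max_depth)   # hypothetical tree class
    tree.fit(train_X, residuals)       # each new tree fits the residuals
    trees.append(tree)
    preds = [p + lrn_rate * tree.predict(x)
      for (p, x) in zip(preds, train_X)]
  return base, trees

def gbr_predict(x, base, trees, lrn_rate):
  return base + sum(lrn_rate * t.predict(x) for t in trees)

The demo output later in this post shows exactly this kind of accumulation: an initial prediction, followed by one small learning-rate-scaled adjustment per tree.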

One morning before work, I put together a demo of gradient boosting regression where the decision trees are implemented using list storage (no pointers), stack-based construction (no recursion), and MSE splits.

For my demo, I used one of my standard synthetic datasets. It looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
. . . 

The first five values on each line are predictors. The last value is the target y to predict. The data was generated by a 5-10-1 neural network with random weights and biases, so the data has an underlying but very complex structure. There are 200 training items and 40 test items.

The output of my demo is:

Begin C# Gradient Boosting regression demo
Using list tree

Loading synthetic train (200) and test (40) data
Done

First three train X:
 -0.1660  0.4406 -0.9998 -0.3953 -0.7065
  0.0776 -0.1616  0.3704 -0.5911  0.7562
 -0.9452  0.3409 -0.1654  0.1174 -0.7192

First three train y:
  0.4840
  0.1568
  0.8054

Setting numTrees = 100
Setting maxDepth = 3
Setting minSamples = 2
Setting minLeaf = 1
Setting lrnRate = 0.1000

Creating and training GradientBoostRegression model
Done

Evaluating model

Accuracy train (within 0.10) = 0.9750
Accuracy test (within 0.10) = 0.8000

MSE train = 0.0001
MSE test = 0.0010

R2 train = 0.9982
R2 test = 0.9629

Predicting for x =
  -0.1660   0.4406  -0.9998  -0.3953  -0.7065

Initial prediction: 0.3493
t =   0  pred_res = -0.0608  delta =  0.0061  pred =  0.3554
t =   1  pred_res = -0.0840  delta =  0.0084  pred =  0.3638
t =   2  pred_res = -0.0704  delta =  0.0070  pred =  0.3708
t =   3  pred_res = -0.1074  delta =  0.0107  pred =  0.3816
t =   4  pred_res = -0.0243  delta =  0.0024  pred =  0.3840
. . .
t =  98  pred_res = -0.0053  delta =  0.0005  pred =  0.4843
t =  99  pred_res =  0.0001  delta = -0.0000  pred =  0.4843
Predicted y = 0.4843

End demo

In my opinion, tuning the parameters of a gradient boosting regression model is more difficult than the tuning for any other type of regression. Changing either of the two GBR parameters (number of trees, learning rate) or any of the three tree parameters (max depth, min samples, min leaf) has a big effect on the GBR prediction model.

Anyway, good fun. But now it’s time for me to walk my two dogs, and then head off to work.



“Worlds Beyond: Science-Fantasy Fiction” magazine lasted only three issues, from December 1950 through February 1951. The January 1951 issue (in the center) featured a cover story with an alien tree where the decision to avoid it is clear.


Demo program. Very long, very complex. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols. (My lame blog editor consistently chokes on symbols).

using System;
using System.IO;
using System.Collections.Generic;

// internal tree: list storage (no pointers),
// stack construct, split minimize quasi-MSE

namespace GradientBoostRegression
{
  internal class GradientBoostRegressionProgram
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin C# Gradient Boosting" +
        " regression demo ");
      Console.WriteLine("Using list tree ");

      // 1. load data
      Console.WriteLine("\nLoading synthetic train" +
        " (200) and test (40) data");
      string trainFile =
        "..\\..\\..\\Data\\synthetic_train_200.txt";
      int[] colsX = new int[] { 0, 1, 2, 3, 4 };
      double[][] trainX =
        MatLoad(trainFile, colsX, ',', "#");
      double[] trainY =
        MatToVec(MatLoad(trainFile,
        [5], ',', "#"));

      string testFile =
        "..\\..\\..\\Data\\synthetic_test_40.txt";
      double[][] testX =
        MatLoad(testFile, colsX, ',', "#");
      double[] testY =
        MatToVec(MatLoad(testFile,
        [5], ',', "#"));
      Console.WriteLine("Done ");

      Console.WriteLine("\nFirst three train X: ");
      for (int i = 0; i "lt" 3; ++i)
        VecShow(trainX[i], 4, 8);

      Console.WriteLine("\nFirst three train y: ");
      for (int i = 0; i "lt" 3; ++i)
        Console.WriteLine(trainY[i].ToString("F4").
          PadLeft(8));

      // 2. create and train model
      int numTrees = 100;
      int maxDepth = 3;
      int minSamples = 2;
      int minLeaf = 1;
      double lrnRate = 0.10;  // 0.975 0.800

      Console.WriteLine("\nSetting numTrees = " +
        numTrees);
      Console.WriteLine("Setting maxDepth = " +
        maxDepth);
      Console.WriteLine("Setting minSamples = " +
        minSamples);
      Console.WriteLine("Setting minLeaf = " +
        minLeaf);
      Console.WriteLine("Setting lrnRate = " +
        lrnRate.ToString("F4"));

      Console.WriteLine("\nCreating and training" +
        " GradientBoostRegression model ");
      GradientBoostRegressor gbr =
        new GradientBoostRegressor(numTrees, maxDepth,
        minSamples, minLeaf, lrnRate);
      gbr.Train(trainX, trainY);
      Console.WriteLine("Done ");

      // 3. evaluate model
      Console.WriteLine("\nEvaluating model ");
      double accTrain = gbr.Accuracy(trainX, trainY, 0.10);
      Console.WriteLine("\nAccuracy train (within 0.10) = " +
        accTrain.ToString("F4"));
      double accTest = gbr.Accuracy(testX, testY, 0.10);
      Console.WriteLine("Accuracy test (within 0.10) = " +
        accTest.ToString("F4"));

      double mseTrain = gbr.MSE(trainX, trainY);
      Console.WriteLine("\nMSE train = " +
        mseTrain.ToString("F4"));
      double mseTest = gbr.MSE(testX, testY);
      Console.WriteLine("MSE test = " +
        mseTest.ToString("F4"));

      double r2Train = gbr.R2(trainX, trainY);
      Console.WriteLine("\nR2 train = " +
        r2Train.ToString("F4"));
      double r2Test = gbr.R2(testX, testY);
      Console.WriteLine("R2 test = " +
        r2Test.ToString("F4"));

      // 4. use model
      double[] x = trainX[0];
      Console.WriteLine("\nPredicting for x = ");
      VecShow(x, 4, 9);
      double predY = gbr.Predict(x, verbose: true);
      Console.WriteLine("Predicted y = " +
        predY.ToString("F4"));

      Console.WriteLine("\nEnd demo ");
      Console.ReadLine();
    } // Main()

    // ------------------------------------------------------
    // helpers for Main()
    // ------------------------------------------------------

    static double[][] MatLoad(string fn, int[] usecols,
      char sep, string comment)
    {
      List"lt"double[]"gt" result = 
        new List"lt"double[]"gt"();
      string line = "";
      FileStream ifs = new FileStream(fn, FileMode.Open);
      StreamReader sr = new StreamReader(ifs);
      while ((line = sr.ReadLine()) != null)
      {
        if (line.StartsWith(comment) == true)
          continue;
        string[] tokens = line.Split(sep);
        List"lt"double"gt" lst = 
          new List"lt"double"gt"();
        for (int j = 0; j "lt" usecols.Length; ++j)
          lst.Add(double.Parse(tokens[usecols[j]]));
        double[] row = lst.ToArray();
        result.Add(row);
      }
      sr.Close(); ifs.Close();
      return result.ToArray();
    }

    static double[] MatToVec(double[][] mat)
    {
      int nRows = mat.Length;
      int nCols = mat[0].Length;
      double[] result = new double[nRows * nCols];
      int k = 0;
      for (int i = 0; i "lt" nRows; ++i)
        for (int j = 0; j "lt" nCols; ++j)
          result[k++] = mat[i][j];
      return result;
    }

    static void VecShow(double[] vec, int dec, int wid)
    {
      for (int i = 0; i "lt" vec.Length; ++i)
        Console.Write(vec[i].ToString("F" + dec).
          PadLeft(wid));
      Console.WriteLine("");
    }

  } // class Program

  // ========================================================

  public class GradientBoostRegressor
  {
    public double lrnRate;
    public int nTrees;
    public int maxDepth;
    public int minSamples;
    public int minLeaf;
    public List"lt"DecisionTreeRegressor"gt" trees;
    public double pred0;  // initial prediction

    public GradientBoostRegressor(int nTrees, int maxDepth,
      int minSamples, int minLeaf, double lrnRate)
    {
      this.nTrees = nTrees;
      this.maxDepth = maxDepth;
      this.minSamples = minSamples;
      this.minLeaf = minLeaf;
      this.trees = new List"lt"DecisionTreeRegressor"gt"();
      this.lrnRate = lrnRate;
    }

    public void Train(double[][] trainX, double[] trainY)
    {
      int n = trainX.Length;
      this.pred0 = Mean(trainY);

      double[] preds = new double[n]; //each data item
      for (int i = 0; i "lt" n; ++i)
        preds[i] = this.pred0;

      for (int t = 0; t "lt" this.nTrees; ++t) // each tree
      {
        double[] residuals = new double[n]; // for curr tree
        for (int i = 0; i "lt" n; ++i)
          residuals[i] = preds[i] - trainY[i];

        DecisionTreeRegressor dtr =
          new DecisionTreeRegressor(this.maxDepth,
          this.minSamples, this.minLeaf);
        dtr.Train(trainX, residuals); // predict residuals

        for (int i = 0; i "lt" n; ++i)
        {
          double predResidual = dtr.Predict(trainX[i]);
          preds[i] -= this.lrnRate * predResidual;
        }
        this.trees.Add(dtr);
      }
    } // Train

    public double Predict(double[] x, bool verbose = false)
    {
      double result = this.pred0;
      if (verbose == true)
      {
        Console.WriteLine("\nInitial prediction: " +
        result.ToString("F4"));
      }
      for (int t = 0; t "lt" this.nTrees; ++t)
      {
        double predResidual = this.trees[t].Predict(x);
        double delta = -this.lrnRate * predResidual;
        result += delta;

        if (verbose == true)
        {
          if (t "gte" 0 && t "lte" 4 || 
            t "gte" this.nTrees - 2 &&
            t "lte" this.nTrees - 1)
          {
            Console.Write("t = " + t.ToString().PadLeft(3) +
              "  pred_res = " + predResidual.ToString("F4").
              PadLeft(7));
            Console.Write("  delta = " + delta.ToString("F4").
              PadLeft(7));
            Console.WriteLine("  pred = " +
              result.ToString("F4").PadLeft(7));
          }
          if (t == 5)
            Console.WriteLine(". . . ");
        }
      }
      return result;
    }

    public double Accuracy(double[][] dataX, double[] dataY,
      double pctClose)
    {
      int numCorrect = 0; int numWrong = 0;
      for (int i = 0; i "lt" dataX.Length; ++i)
      {
        double actualY = dataY[i];
        double predY = Predict(dataX[i]);
        if (Math.Abs(predY - actualY) "lt"
          Math.Abs(pctClose * actualY))  // Abs() so negative targets work
          ++numCorrect;
        else
          ++numWrong;
      }
      return (numCorrect * 1.0) / (numWrong + numCorrect);
    }

    public double MSE(double[][] dataX,
      double[] dataY)
    {
      int n = dataX.Length;
      double sum = 0.0;
      for (int i = 0; i "lt" n; ++i)
      {
        double actualY = dataY[i];
        double predY = this.Predict(dataX[i]);
        sum += (actualY - predY) * (actualY - predY);
      }
      return sum / n;
    }

    public double R2(double[][] dataX, double[] dataY)
    {
      // coefficient of determination
      int n = dataX.Length;
      double sum = 0.0;

      for (int i = 0; i "lt" n; ++i)
        sum += dataY[i];
      double meanActual = sum / n;

      double sumTop = 0.0;
      double sumBot = 0.0;
      for (int i = 0; i "lt" n; ++i)
      {
        double predY = this.Predict(dataX[i]);
        sumTop += (dataY[i] - predY) * (dataY[i] - predY);
        sumBot += (dataY[i] - meanActual) *
          (dataY[i] - meanActual);
      }
      return 1.0 - (sumTop / sumBot);
    }

    private static double Mean(double[] data)
    {
      int n = data.Length;
      double sum = 0.0;
      for (int i = 0; i "lt" n; ++i)
        sum += data[i];
      return sum / n;
    }

  } // class GradientBoostRegressor

  // ========================================================

  public class DecisionTreeRegressor
  {
    public int maxDepth;
    public int minSamples;  // aka min_samples_split
    public int minLeaf;  // min number of values in a leaf
    public int numSplitCols; // mostly for random forest
    public List"lt"Node"gt" tree = new List"lt"Node"gt"();
    public Random rnd;  // order in which cols are searched

    // ------------------------------------------------------

    public class Node
    {
      public int id;
      public int colIdx;  // aka featureIdx
      public double thresh;

      public int left;  // index into List
      public int right;

      public double value;
      public bool isLeaf;

      // ------------------------------------------

      public Node(int id, int colIdx,
        double thresh, int left, int right,
        double value, bool isLeaf)
      {
        this.id = id;
        this.colIdx = colIdx;
        this.thresh = thresh;
        this.left = left;
        this.right = right;
        this.value = value;
        this.isLeaf = isLeaf;
      }

    } // class Node

    // --------------------------------------------

    public class StackInfo  // used to build tree
    {
      // tree node + associated rows
      public Node node;
      public double[][] dataX;
      public double[] dataY;
      public int depth;

      public StackInfo(Node n, double[][] X,
        double[] y, int d)
      {
        this.node = n; this.dataX = X;
        this.dataY = y; this.depth = d;
      }
    }

    // --------------------------------------------

    public DecisionTreeRegressor(int maxDepth = 2,
      int minSamples = 2, int minLeaf = 1,
      int numSplitCols = -1, int seed = 0)
    {
      // if maxDepth = 0, tree has just a root node
      // if maxDepth = 1, at most 3 nodes (root, l, r)
      // if maxDepth = n, at most 2^(n+1) - 1 nodes
      this.maxDepth = maxDepth;
      this.minSamples = minSamples;
      this.minLeaf = minLeaf;
      this.numSplitCols = numSplitCols;  // for ran. forest
 
      // create full tree List with dummy nodes
      int numNodes = (int)Math.Pow(2, (maxDepth + 1)) - 1;
      for (int i = 0; i "lt" numNodes; ++i)
      {
        // this.tree.Add(null); // OK, but let's avoid ptrs
        Node n = new Node(i, -1, 0.0, -1, -1, 0.0, false);
        this.tree.Add(n); // dummy node
      }
      this.rnd = new Random(seed);
    }

    // ------------------------------------------------------

    private double[] BestSplit(double[][] dataX,
      double[] dataY)
    {
      // implicit params numSplitCols, minLeaf
      // result[0] = best col idx (as double)
      // result[1] = best split value

      int bestColIdx = -1;  // indicates bad split
      double bestThresh = 0.0;
      double bestMSE = double.MaxValue;  // smaller is better
      int nRows = dataX.Length;  // or dataY.Length
      int nCols = dataX[0].Length;

      if (nRows == 0)
      {
        throw new Exception("empty data in BestSplit()");
      }

      int[] colIndices = new int[nCols];
      for (int k = 0; k "lt" nCols; ++k)
        colIndices[k] = k;
      Shuffle(colIndices, this.rnd);

      if (this.numSplitCols != -1)  // use only some cols
      {
        int[] cpyIndices = new int[nCols];
        for (int k = 0; k "lt" nCols; ++k)
          cpyIndices[k] = colIndices[k];
        colIndices = new int[this.numSplitCols];
        for (int k = 0; k "lt" this.numSplitCols; ++k)
          colIndices[k] = cpyIndices[k];
      }

      for (int j = 0; j "lt" colIndices.Length; ++j)
      {
        int colIdx = colIndices[j];
        HashSet"lt"double"gt" examineds = 
          new HashSet"lt"double"gt"();

        for (int i = 0; i "lt" nRows; ++i) // each row
        {
          // if curr thresh has been seen, skip it
          double thresh = dataX[i][colIdx];
          if (examineds.Contains(thresh)) continue;

          examineds.Add(thresh);

          // get rows where x is lte, gt thresh
          List"lt"int"gt" leftIdxs = new List"lt"int"gt"();
          List"lt"int"gt" rightIdxs = new List"lt"int"gt"();
          for (int r = 0; r "lt" nRows; ++r)
          {
            if (dataX[r][colIdx] "lte" thresh)
              leftIdxs.Add(r);
            else
              rightIdxs.Add(r);
          }

          // Check if proposed split would lead to an empty
          // node, which would cause Mean() to fail
          // when building the tree.
          // But allow a node with a single value.
          // if (leftIdxs.Count == 0 ||
          //   rightIdxs.Count == 0)
          //   continue; // scikit "min_samples_leaf=1"
          if (leftIdxs.Count "lt" this.minLeaf ||
            rightIdxs.Count "lt" this.minLeaf)
            continue; // scikit "min_samples_leaf=1"

          // get the left y values and right y values
          List"lt"double"gt" leftValues = 
            new List"lt"double"gt"();
          for (int k = 0; k "lt" leftIdxs.Count; ++k)
            leftValues.Add(dataY[leftIdxs[k]]);

          List"lt"double"gt" rightValues = 
            new List"lt"double"gt"();
          for (int k = 0; k "lt" rightIdxs.Count; ++k)
            rightValues.Add(dataY[rightIdxs[k]]);

          // compute MSE, equivalent to variance reduction
          double mseLeft = MSE(leftValues);
          double mseRight = MSE(rightValues);
          double splitMSE = (leftValues.Count * mseLeft +
            rightValues.Count * mseRight) / nRows;

          if (splitMSE "lt" bestMSE)
          {
            bestColIdx = colIdx;
            bestThresh = thresh;
            bestMSE = splitMSE;
          }

        } // each row
      } // j each col

      double[] result = new double[2];
      result[0] = 1.0 * bestColIdx;
      result[1] = bestThresh;
      return result;

    } // BestSplit()

    // ------------------------------------------------------

    private static void Shuffle(int[] indices, Random rnd)
    {
      int n = indices.Length;
      for (int i = 0; i "lt" n; ++i)
      {
        int ri = rnd.Next(i, n);
        int tmp = indices[i];
        indices[i] = indices[ri];
        indices[ri] = tmp;
      }
    }

    // ------------------------------------------------------

    private static double Mean(List"lt"double"gt" values)
    {
      double sum = 0.0;
      for (int i = 0; i "lt" values.Count; ++i)
        sum += values[i];
      return sum / values.Count;
    }

    // ------------------------------------------------------

    private static double MSE(List"lt"double"gt" values)
    {
      double mean = Mean(values);
      double sum = 0.0;
      for (int i = 0; i "lt" values.Count; ++i)
        sum += (values[i] - mean) * (values[i] - mean);
      return sum / values.Count;
    }

    // ------------------------------------------------------

    private void MakeTree(double[][] dataX,
      double[] dataY, int depth = 0)
    {
      // no recursion, no pointers, List storage

      Node root = new Node(0, -1, 0.0, 1, 2,
        Mean(dataY.ToList()), false);
      
      Stack"lt"StackInfo"gt" stack = 
        new Stack"lt"StackInfo"gt"();

      stack.Push(new StackInfo(root, dataX, dataY, 0));
      while (stack.Count "gt" 0)
      {
        StackInfo info = stack.Pop();
        Node currNode = info.node;
        this.tree[currNode.id] = currNode;

        double[][] currX = info.dataX;
        double[] currY = info.dataY;
        int currDepth = info.depth;

        if (currDepth == this.maxDepth ||
          currY.Length "lt" this.minSamples) {
          currNode.value = Mean(currY.ToList());
          currNode.isLeaf = true;
          continue;
        }

        double[] splitInfo = this.BestSplit(currX, currY);
        int colIdx = (int)splitInfo[0];
        double thresh = splitInfo[1];

        if (colIdx == -1)  // unable to split
        {
          currNode.value = Mean(currY.ToList());
          currNode.isLeaf = true;
          continue;
        }

        // got a good split so at an internal, non-leaf node
        currNode.colIdx = colIdx;
        currNode.thresh = thresh;

        // make the data splits for children
        List"lt"int"gt" leftIdxs = new List"lt"int"gt"();
        List"lt"int"gt" rightIdxs = new List"lt"int"gt"();
        for (int r = 0; r "lt" currX.Length; ++r)
        {
          if (currX[r][colIdx] "lte" thresh)
            leftIdxs.Add(r);
          else
            rightIdxs.Add(r);
        }

        double[][] leftX = 
          this.ExtractRows(currX, leftIdxs);
        double[] leftY = 
          this.ExtractVals(currY, leftIdxs);
        double[][] rightX = 
          this.ExtractRows(currX, rightIdxs);
        double[] rightY = 
          this.ExtractVals(currY, rightIdxs);

        int leftID = currNode.id * 2 + 1;
        //Node currNodeLeft = new Node(leftID, -1, 0.0,
        //  2*leftID + 1, 2 * leftID + 2, 0.0, false);
        Node currNodeLeft = new Node(leftID, -1, 0.0,
          2*leftID + 1, 2 * leftID + 2,
          Mean(leftY.ToList()), false); // not necessary
        stack.Push(new StackInfo(currNodeLeft, leftX,
          leftY, currDepth + 1));

        int rightID = currNode.id * 2 + 2;
        //Node currNodeRight = new Node(rightID, -1, 0.0,
        //  2*rightID + 1, 2 * rightID + 2, 0.0, false);
        Node currNodeRight = new Node(rightID, -1, 0.0,
          2*rightID + 1, 2 * rightID + 2,
          Mean(rightY.ToList()), false);
        stack.Push(new StackInfo(currNodeRight, rightX,
          rightY, currDepth + 1));
 
      } // while

      return;

    } // MakeTree()

    // ------------------------------------------------------

    private double[][] ExtractRows(double[][] source,
      List"lt"int"gt" rowIdxs)
    {
      // surprisingly tricky
      int numSrcRows = source.Length;
      int numSrcCols = source[0].Length;
      int numDestRows = rowIdxs.Count;
      int numDestCols = numSrcCols;

      double[][] result = new double[numDestRows][];
      for (int i = 0; i "lt" numDestRows; ++i)
        result[i] = new double[numDestCols];

      for (int i = 0; i "lt" rowIdxs.Count; ++i)
      {
        int srcRow = rowIdxs[i];
        for (int j = 0; j "lt" numDestCols; ++j)
          result[i][j] = source[srcRow][j];
      }
      return result;
    }

    // ------------------------------------------------------

    private double[] ExtractVals(double[] source,
      List"lt"int"gt" idxs)
    {
      double[] result = new double[idxs.Count];
      for (int i = 0; i "lt" idxs.Count; ++i)
      {
        int srcIdx = idxs[i];
        result[i] = source[srcIdx];
      }
      return result;
    }

    // ------------------------------------------------------

    public void Train(double[][] trainX, double[] trainY)
    {
      this.MakeTree(trainX, trainY);
    }

    // ------------------------------------------------------

    public double Predict(double[] x)
    {
      int p = 0;
      Node currNode = this.tree[p];
      while (currNode.isLeaf == false &&
        p "lt" this.tree.Count)
      {
        if (x[currNode.colIdx] "lte" currNode.thresh)
          p = currNode.left;
        else
          p = currNode.right;
        currNode = this.tree[p];
      }
      return this.tree[p].value;

    }

    // ------------------------------------------------------

  } // class DecisionTreeRegressor

  // ========================================================

} // ns

Training data:

# synthetic_train_200.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399
-0.4858, -0.7782, -0.6141, -0.0008,  0.4572,  0.4197
 0.7033, -0.1683,  0.2334, -0.5327, -0.7961,  0.1776
 0.0317, -0.0457, -0.6947,  0.2436,  0.0880,  0.3345
 0.5031, -0.5559,  0.0387,  0.5706, -0.9553,  0.3107
-0.3513,  0.7458,  0.6894,  0.0769,  0.7332,  0.3170
 0.2205,  0.5992, -0.9309,  0.5405,  0.4635,  0.3532
-0.4806, -0.4859,  0.2646, -0.3094,  0.5932,  0.3202
 0.9809, -0.3995, -0.7140,  0.8026,  0.0831,  0.1600
 0.9495,  0.2732,  0.9878,  0.0921,  0.0529,  0.1289
-0.9476, -0.6792,  0.4913, -0.9392, -0.2669,  0.5966
 0.7247,  0.3854,  0.3819, -0.6227, -0.1162,  0.1550
-0.5922, -0.5045, -0.4757,  0.5003, -0.0860,  0.5863
-0.8861,  0.0170, -0.5761,  0.5972, -0.4053,  0.7301
 0.6877, -0.2380,  0.4997,  0.0223,  0.0819,  0.1404
 0.9189,  0.6079, -0.9354,  0.4188, -0.0700,  0.1907
-0.1428, -0.7820,  0.2676,  0.6059,  0.3936,  0.2790
 0.5324, -0.3151,  0.6917, -0.1425,  0.6480,  0.1071
-0.8432, -0.9633, -0.8666, -0.0828, -0.7733,  0.7784
-0.9444,  0.5097, -0.2103,  0.4939, -0.0952,  0.6787
-0.0520,  0.6063, -0.1952,  0.8094, -0.9259,  0.4836
 0.5477, -0.7487,  0.2370, -0.9793,  0.0773,  0.1241
 0.2450,  0.8116,  0.9799,  0.4222,  0.4636,  0.2355
 0.8186, -0.1983, -0.5003, -0.6531, -0.7611,  0.1511
-0.4714,  0.6382, -0.3788,  0.9648, -0.4667,  0.5950
 0.0673, -0.3711,  0.8215, -0.2669, -0.1328,  0.2677
-0.9381,  0.4338,  0.7820, -0.9454,  0.0441,  0.5518
-0.3480,  0.7190,  0.1170,  0.3805, -0.0943,  0.4724
-0.9813,  0.1535, -0.3771,  0.0345,  0.8328,  0.5438
-0.1471, -0.5052, -0.2574,  0.8637,  0.8737,  0.3042
-0.5454, -0.3712, -0.6505,  0.2142, -0.1728,  0.5783
 0.6327, -0.6297,  0.4038, -0.5193,  0.1484,  0.1153
-0.5424,  0.3282, -0.0055,  0.0380, -0.6506,  0.6613
 0.1414,  0.9935,  0.6337,  0.1887,  0.9520,  0.2540
-0.9351, -0.8128, -0.8693, -0.0965, -0.2491,  0.7353
 0.9507, -0.6640,  0.9456,  0.5349,  0.6485,  0.1059
-0.0462, -0.9737, -0.2940, -0.0159,  0.4602,  0.2606
-0.0627, -0.0852, -0.7247, -0.9782,  0.5166,  0.2977
 0.0478,  0.5098, -0.0723, -0.7504, -0.3750,  0.3335
 0.0090,  0.3477,  0.5403, -0.7393, -0.9542,  0.4415
-0.9748,  0.3449,  0.3736, -0.1015,  0.8296,  0.4358
 0.2887, -0.9895, -0.0311,  0.7186,  0.6608,  0.2057
 0.1570, -0.4518,  0.1211,  0.3435, -0.2951,  0.3244
 0.7117, -0.6099,  0.4946, -0.4208,  0.5476,  0.1096
-0.2929, -0.5726,  0.5346, -0.3827,  0.4665,  0.2465
 0.4889, -0.5572, -0.5718, -0.6021, -0.7150,  0.2163
-0.7782,  0.3491,  0.5996, -0.8389, -0.5366,  0.6516
-0.5847,  0.8347,  0.4226,  0.1078, -0.3910,  0.6134
 0.8469,  0.4121, -0.0439, -0.7476,  0.9521,  0.1571
-0.6803, -0.5948, -0.1376, -0.1916, -0.7065,  0.7156
 0.2878,  0.5086, -0.5785,  0.2019,  0.4979,  0.2980
 0.2764,  0.1943, -0.4090,  0.4632,  0.8906,  0.2960
-0.8877,  0.6705, -0.6155, -0.2098, -0.3998,  0.7107
-0.8398,  0.8093, -0.2597,  0.0614, -0.0118,  0.6502
-0.8476,  0.0158, -0.4769, -0.2859, -0.7839,  0.7715
 0.5751, -0.7868,  0.9714, -0.6457,  0.1448,  0.1175
 0.4802, -0.7001,  0.1022, -0.5668,  0.5184,  0.1090
 0.4458, -0.6469,  0.7239, -0.9604,  0.7205,  0.0779
 0.5175,  0.4339,  0.9747, -0.4438, -0.9924,  0.2879
 0.8678,  0.7158,  0.4577,  0.0334,  0.4139,  0.1678
 0.5406,  0.5012,  0.2264, -0.1963,  0.3946,  0.2088
-0.9938,  0.5498,  0.7928, -0.5214, -0.7585,  0.7687
 0.7661,  0.0863, -0.4266, -0.7233, -0.4197,  0.1466
 0.2277, -0.3517, -0.0853, -0.1118,  0.6563,  0.1767
 0.3499, -0.5570, -0.0655, -0.3705,  0.2537,  0.1632
 0.7547, -0.1046,  0.5689, -0.0861,  0.3125,  0.1257
 0.8186,  0.2110,  0.5335,  0.0094, -0.0039,  0.1391
 0.6858, -0.8644,  0.1465,  0.8855,  0.0357,  0.1845
-0.4967,  0.4015,  0.0805,  0.8977,  0.2487,  0.4663
 0.6760, -0.9841,  0.9787, -0.8446, -0.3557,  0.1509
-0.1203, -0.4885,  0.6054, -0.0443, -0.7313,  0.4854
 0.8557,  0.7919, -0.0169,  0.7134, -0.1628,  0.2002
 0.0115, -0.6209,  0.9300, -0.4116, -0.7931,  0.4052
-0.7114, -0.9718,  0.4319,  0.1290,  0.5892,  0.3661
 0.3915,  0.5557, -0.1870,  0.2955, -0.6404,  0.2954
-0.3564, -0.6548, -0.1827, -0.5172, -0.1862,  0.4622
 0.2392, -0.4959,  0.5857, -0.1341, -0.2850,  0.2470
-0.3394,  0.3947, -0.4627,  0.6166, -0.4094,  0.5325
 0.7107,  0.7768, -0.6312,  0.1707,  0.7964,  0.2757
-0.1078,  0.8437, -0.4420,  0.2177,  0.3649,  0.4028
-0.3139,  0.5595, -0.6505, -0.3161, -0.7108,  0.5546
 0.4335,  0.3986,  0.3770, -0.4932,  0.3847,  0.1810
-0.2562, -0.2894, -0.8847,  0.2633,  0.4146,  0.4036
 0.2272,  0.2966, -0.6601, -0.7011,  0.0284,  0.2778
-0.0743, -0.1421, -0.0054, -0.6770, -0.3151,  0.3597
-0.4762,  0.6891,  0.6007, -0.1467,  0.2140,  0.4266
-0.4061,  0.7193,  0.3432,  0.2669, -0.7505,  0.6147
-0.0588,  0.9731,  0.8966,  0.2902, -0.6966,  0.4955
-0.0627, -0.1439,  0.1985,  0.6999,  0.5022,  0.3077
 0.1587,  0.8494, -0.8705,  0.9827, -0.8940,  0.4263
-0.7850,  0.2473, -0.9040, -0.4308, -0.8779,  0.7199
 0.4070,  0.3369, -0.2428, -0.6236,  0.4940,  0.2215
-0.0242,  0.0513, -0.9430,  0.2885, -0.2987,  0.3947
-0.5416, -0.1322, -0.2351, -0.0604,  0.9590,  0.3683
 0.1055,  0.7783, -0.2901, -0.5090,  0.8220,  0.2984
-0.9129,  0.9015,  0.1128, -0.2473,  0.9901,  0.4776
-0.9378,  0.1424, -0.6391,  0.2619,  0.9618,  0.5368
 0.7498, -0.0963,  0.4169,  0.5549, -0.0103,  0.1614
-0.2612, -0.7156,  0.4538, -0.0460, -0.1022,  0.3717
 0.7720,  0.0552, -0.1818, -0.4622, -0.8560,  0.1685
-0.4177,  0.0070,  0.9319, -0.7812,  0.3461,  0.3052
-0.0001,  0.5542, -0.7128, -0.8336, -0.2016,  0.3803
 0.5356, -0.4194, -0.5662, -0.9666, -0.2027,  0.1776
-0.2378,  0.3187, -0.8582, -0.6948, -0.9668,  0.5474
-0.1947, -0.3579,  0.1158,  0.9869,  0.6690,  0.2992
 0.3992,  0.8365, -0.9205, -0.8593, -0.0520,  0.3154
-0.0209,  0.0793,  0.7905, -0.1067,  0.7541,  0.1864
-0.4928, -0.4524, -0.3433,  0.0951, -0.5597,  0.6261
-0.8118,  0.7404, -0.5263, -0.2280,  0.1431,  0.6349
 0.0516, -0.8480,  0.7483,  0.9023,  0.6250,  0.1959
-0.3212,  0.1093,  0.9488, -0.3766,  0.3376,  0.2735
-0.3481,  0.5490, -0.3484,  0.7797,  0.5034,  0.4379
-0.5785, -0.9170, -0.3563, -0.9258,  0.3877,  0.4121
 0.3407, -0.1391,  0.5356,  0.0720, -0.9203,  0.3458
-0.3287, -0.8954,  0.2102,  0.0241,  0.2349,  0.3247
-0.1353,  0.6954, -0.0919, -0.9692,  0.7461,  0.3338
 0.9036, -0.8982, -0.5299, -0.8733, -0.1567,  0.1187
 0.7277, -0.8368, -0.0538, -0.7489,  0.5458,  0.0830
 0.9049,  0.8878,  0.2279,  0.9470, -0.3103,  0.2194
 0.7957, -0.1308, -0.5284,  0.8817,  0.3684,  0.2172
 0.4647, -0.4931,  0.2010,  0.6292, -0.8918,  0.3371
-0.7390,  0.6849,  0.2367,  0.0626, -0.5034,  0.7039
-0.1567, -0.8711,  0.7940, -0.5932,  0.6525,  0.1710
 0.7635, -0.0265,  0.1969,  0.0545,  0.2496,  0.1445
 0.7675,  0.1354, -0.7698, -0.5460,  0.1920,  0.1728
-0.5211, -0.7372, -0.6763,  0.6897,  0.2044,  0.5217
 0.1913,  0.1980,  0.2314, -0.8816,  0.5006,  0.1998
 0.8964,  0.0694, -0.6149,  0.5059, -0.9854,  0.1825
 0.1767,  0.7104,  0.2093,  0.6452,  0.7590,  0.2832
-0.3580, -0.7541,  0.4426, -0.1193, -0.7465,  0.5657
-0.5996,  0.5766, -0.9758, -0.3933, -0.9572,  0.6800
 0.9950,  0.1641, -0.4132,  0.8579,  0.0142,  0.2003
-0.4717, -0.3894, -0.2567, -0.5111,  0.1691,  0.4266
 0.3917, -0.8561,  0.9422,  0.5061,  0.6123,  0.1212
-0.0366, -0.1087,  0.3449, -0.1025,  0.4086,  0.2475
 0.3633,  0.3943,  0.2372, -0.6980,  0.5216,  0.1925
-0.5325, -0.6466, -0.2178, -0.3589,  0.6310,  0.3568
 0.2271,  0.5200, -0.1447, -0.8011, -0.7699,  0.3128
 0.6415,  0.1993,  0.3777, -0.0178, -0.8237,  0.2181
-0.5298, -0.0768, -0.6028, -0.9490,  0.4588,  0.4356
 0.6870, -0.1431,  0.7294,  0.3141,  0.1621,  0.1632
-0.5985,  0.0591,  0.7889, -0.3900,  0.7419,  0.2945
 0.3661,  0.7984, -0.8486,  0.7572, -0.6183,  0.3449
 0.6995,  0.3342, -0.3113, -0.6972,  0.2707,  0.1712
 0.2565,  0.9126,  0.1798, -0.6043, -0.1413,  0.2893
-0.3265,  0.9839, -0.2395,  0.9854,  0.0376,  0.4770
 0.2690, -0.1722,  0.9818,  0.8599, -0.7015,  0.3954
-0.2102, -0.0768,  0.1219,  0.5607, -0.0256,  0.3949
 0.8216, -0.9555,  0.6422, -0.6231,  0.3715,  0.0801
-0.2896,  0.9484, -0.7545, -0.6249,  0.7789,  0.4370
-0.9985, -0.5448, -0.7092, -0.5931,  0.7926,  0.5402

Test data:

# synthetic_test_40.txt
#
 0.7462,  0.4006, -0.0590,  0.6543, -0.0083,  0.1935
 0.8495, -0.2260, -0.0142, -0.4911,  0.7699,  0.1078
-0.2335, -0.4049,  0.4352, -0.6183, -0.7636,  0.5088
 0.1810, -0.5142,  0.2465,  0.2767, -0.3449,  0.3136
-0.8650,  0.7611, -0.0801,  0.5277, -0.4922,  0.7140
-0.2358, -0.7466, -0.5115, -0.8413, -0.3943,  0.4533
 0.4834,  0.2300,  0.3448, -0.9832,  0.3568,  0.1360
-0.6502, -0.6300,  0.6885,  0.9652,  0.8275,  0.3046
-0.3053,  0.5604,  0.0929,  0.6329, -0.0325,  0.4756
-0.7995,  0.0740, -0.2680,  0.2086,  0.9176,  0.4565
-0.2144, -0.2141,  0.5813,  0.2902, -0.2122,  0.4119
-0.7278, -0.0987, -0.3312, -0.5641,  0.8515,  0.4438
 0.3793,  0.1976,  0.4933,  0.0839,  0.4011,  0.1905
-0.8568,  0.9573, -0.5272,  0.3212, -0.8207,  0.7415
-0.5785,  0.0056, -0.7901, -0.2223,  0.0760,  0.5551
 0.0735, -0.2188,  0.3925,  0.3570,  0.3746,  0.2191
 0.1230, -0.2838,  0.2262,  0.8715,  0.1938,  0.2878
 0.4792, -0.9248,  0.5295,  0.0366, -0.9894,  0.3149
-0.4456,  0.0697,  0.5359, -0.8938,  0.0981,  0.3879
 0.8629, -0.8505, -0.4464,  0.8385,  0.5300,  0.1769
 0.1995,  0.6659,  0.7921,  0.9454,  0.9970,  0.2330
-0.0249, -0.3066, -0.2927, -0.4923,  0.8220,  0.2437
 0.4513, -0.9481, -0.0770, -0.4374, -0.9421,  0.2879
-0.3405,  0.5931, -0.3507, -0.3842,  0.8562,  0.3987
 0.9538,  0.0471,  0.9039,  0.7760,  0.0361,  0.1706
-0.0887,  0.2104,  0.9808,  0.5478, -0.3314,  0.4128
-0.8220, -0.6302,  0.0537, -0.1658,  0.6013,  0.4306
-0.4123, -0.2880,  0.9074, -0.0461, -0.4435,  0.5144
 0.0060,  0.2867, -0.7775,  0.5161,  0.7039,  0.3599
-0.7968, -0.5484,  0.9426, -0.4308,  0.8148,  0.2979
 0.7811,  0.8450, -0.6877,  0.7594,  0.2640,  0.2362
-0.6802, -0.1113, -0.8325, -0.6694, -0.6056,  0.6544
 0.3821,  0.1476,  0.7466, -0.5107,  0.2592,  0.1648
 0.7265,  0.9683, -0.9803, -0.4943, -0.5523,  0.2454
-0.9049, -0.9797, -0.0196, -0.9090, -0.4433,  0.6447
-0.4607,  0.1811, -0.2389,  0.4050, -0.0078,  0.5229
 0.2664, -0.2932, -0.4259, -0.7336,  0.8742,  0.1834
-0.4507,  0.1029, -0.6294, -0.1158, -0.6294,  0.6081
 0.8948, -0.0124,  0.9278,  0.2899, -0.0314,  0.1534
-0.1323, -0.8813, -0.0146, -0.0697,  0.6135,  0.2386
Posted in Miscellaneous | Leave a comment

NFL 2025 Season Super Bowl LX – Zoltar’s Party Advice

Zoltar is my NFL football prediction computer program. It uses a neural network and a type of reinforcement learning. The Super Bowl is coming up this weekend, on Sunday, February 8, 2026. The Seattle Seahawks are playing the New England Patriots. For many years, a tradition of mine has been to go over to the huge house of my good friend Ken L to watch the game with about 20-30 other friends and family.

I always run the Super Bowl party contest where people guess the score at the end of each quarter, and at the end of the game. I create a 10-by-10 grid with rows numbered 0 to 9, and columns numbered 0 to 9. The rows represent the last digit of the score of one team (let’s say the Seahawks), and the columns represent the last digit of the score of the other team (say, the Patriots).



Click to enlarge

If there are 30 people at the party, every person gets to pick 3 cells (leaving 10 unused cells) and write their name in the 3 cells. Now, suppose the score at the end of the third quarter is Seahawks 23 – Patriots 17. The person who owns cell (3,7) wins the third quarter prize. There are prizes for the score at the end of the first quarter ($20), at halftime ($40), at the end of the third quarter ($40), and at the end of the game ($100).

If nobody owns the cell that corresponds to a score, there are several ways to determine a winner. I use a spiral approach where I move one cell left, then up, then right, then down; then upper left, upper right, lower right, lower left, continuing until an occupied cell is found.
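
Here is a minimal Python sketch of one way to implement that spiral tie-break. The grid representation (a dictionary that maps each occupied (row, col) cell to its owner's name) and the decision to simply skip positions that fall outside the 10-by-10 grid are my assumptions; the party rules don't specify those details.

# One interpretation of the spiral tie-break. grid maps occupied
# (row, col) cells to owner names; unowned and out-of-range cells
# are simply absent from the dictionary, so they are skipped.
def find_winner(grid, row, col):
  if (row, col) in grid:
    return grid[(row, col)]
  # directions in the order described: left, up, right, down,
  # then upper-left, upper-right, lower-right, lower-left
  dirs = [(0, -1), (-1, 0), (0, 1), (1, 0),
          (-1, -1), (-1, 1), (1, 1), (1, -1)]
  for radius in range(1, 10):      # move one cell farther out each pass
    for (dr, dc) in dirs:
      cell = (row + dr * radius, col + dc * radius)
      if cell in grid:
        return grid[cell]
  return None  # no cells are owned

# example: third quarter score Seahawks 23 - Patriots 17 maps to cell (3, 7)
# winner = find_winner(grid, 23 % 10, 17 % 10)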

Now, at Ken’s Super Bowl party, there are usually only two math experts — Ken and me. Everybody else, except us two, just picks random cells in the game matrix. But Ken and I are well aware that some scores are much more common than others.

I did some analysis to look at the most common scores. There were 272 games played during the 2025 regular season: there are 32 teams and each team played 17 games, so (32 * 17) / 2 = 272. And there were 12 playoff games, for a total of 284 games played so far in the 2025 season.

The six most common scores are:

Score Count
-----------
 20    42
 27    39
 24    35

 10    27
 17    26
 31    26

The least common scores are:

Score Count
-----------
 11     1
 47     1
 52     1

  1     0
  2     0
  4     0
  5     0

So, the nine best cells to pick in the game matrix are (0,0), (0,4), (0,7), (4,0), (4,4), (4,7), (7,0), (7,4), (7,7). On the other hand, another good strategy is to pick cells that aren't near anyone else's.

In some years, we have a second Super Bowl party game where you have to predict the final score, for example, Seahawks 21 – Patriots 13. Based on the most common scores, and the fact that the Seahawks are favored by roughly 3 points, a good prediction for this Super Bowl is Seahawks over the Patriots 27-24, or 24-20.

It's an interesting problem to figure out how to determine which final-score prediction is best. What if someone predicts very close scores, but an incorrect winner? What happens if one person is only off by 1 point on each of the two scores, but another person is off by 2 points on one score and gets the other score exactly right?



My system is named after the Zoltar fortune teller machine you can find in arcades. Fortune teller machines have been around for over 100 years. Here are three fortune tellers from the late 1800s made by the Roovers Brothers company (interestingly, the company is still in existence).

Left: The “Mademoiselle Zita” machine from about 1898. She moves, fetches a fortune card, drops it into her golden vase, and the card drops to the user. Beautiful.

Center: “Donkey Wonder” spins a numbered ship’s wheel, and the fortune is read from the list on the wall. Strange.

Right: “Jumbo” the elephant flips through a book of fortunes and stops at one for you to read. Interesting.


Posted in Zoltar | Leave a comment

The Origin and History of scikit-learn Diabetes Dataset

Bottom line: The scikit Diabetes Dataset was compiled by Stanford professor Brad Efron while he was helping Dr. T. McLaughlin analyze data for a medical study that was later published in “Differentiation Between Obesity and Insulin Resistance in the Association with C-Reactive Protein” (2002).

The scikit-learn library has a nice collection of datasets for experiments. One of the most popular datasets for regression is the Diabetes Dataset. See scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html.

Note: The scikit Diabetes Dataset is different from the Pima Indians Diabetes Dataset.

The Diabetes Dataset has 442 items. Each item represents a patient and has 10 predictor values followed by a target value to predict. The data looks like:

59, 2, 32.1, 101.00, 157,  93.2, 38, 4.00, 4.8598, 87, 151
48, 1, 21.6,  87.00, 183, 103.2, 70, 3.00, 3.8918, 69,  75
72, 2, 30.5,  93.00, 156,  93.6, 41, 4.00, 4.6728, 85, 141
. . . 

The 10 predictor variables are age, sex, body mass index, blood pressure, serum cholesterol, low-density lipoproteins, high-density lipoproteins, total cholesterol, triglycerides, blood sugar. The target value in the last column is a measure of diabetes.

Note: The sex encoding isn't explained, but I suspect male = 1, female = 2 (there are 235 1-values and 206 2-values).
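
The dataset can be fetched directly from the scikit library. A minimal sketch (I'm assuming a fairly recent scikit version; the scaled=False argument returns the raw values shown above rather than the library's default mean-centered and scaled version):

# Minimal sketch: load the 442-item Diabetes Dataset from scikit-learn.
# scaled=False (newer scikit versions) returns the raw values shown above.
from sklearn.datasets import load_diabetes

ds = load_diabetes(scaled=False)
X = ds.data     # 442 x 10 predictors: age, sex, bmi, bp, s1..s6
y = ds.target   # diabetes progression score (last column above)
print(X[0])     # should match the first row shown above: 59, 2, 32.1, ...
print(y[0])     # 151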

The scikit page points to a Web page by Dennis Boos (NCSU) that has a link to the raw data. Boos' page states that the data came from a Web page owned by Trevor Hastie (Stanford), but that link is dead. Boos' page also states that the Diabetes Data was first referenced in the research paper "Least Angle Regression" (2004), by B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani.

I enjoy tracking down original sources. So I wrote an email message to Efron, Hastie, Johnstone and Tibshirani (all are at Stanford) asking for information. Brad Efron graciously replied very quickly. He explained that the Diabetes Dataset came from data analysis that he was helping Dr. T. McLaughlin (Stanford) with. The results appeared in research paper “Differentiation Between Obesity and Insulin Resistance in the Association with C-Reactive Protein” (2002), by T. McLaughlin, F. Abbasi, C. Lamendola, L. Liang, G. Reaven, P. Schaaf, P. Reaven. That paper has the comment, “The authors express their appreciation to Bradley Efron, PhD, for statistical assistance with the manuscript.”

To summarize, the scikit Diabetes Dataset was originally generated by Dr. T. McLaughlin and colleagues. Then B. Efron, who was helping with the analysis, compiled 442 data items (it's not clear whether this was a subset or the entire research dataset). The raw data was posted on a Web site by Efron's colleague T. Hastie, and shortly after, the data was also posted on a Web site by D. Boos. The Hastie page vanished at some point, leaving the Boos page as the primary source of the data. Later, the scikit library fetched the diabetes data from Boos' page (most likely in 2010), and the data is now widely available.

An interesting investigation!

Note: I discovered that the default target to predict, the diabetes score in the last column of the dataset, cannot be predicted with meaningful accuracy. But the variables in columns [4], [5], [6], [7], and [8] can be predicted nicely.



I enjoy history of all kinds, but especially the history of computer science and the history of early science fiction movies. Here are three of my favorite science fiction movies from the 1950s that feature alien flying saucers. All three movies have fascinating histories.

Left: “The Atomic Submarine” (1959) – In the near future, an alien flying saucer is under the sea in the Arctic, destroying cargo submarines. The USS Tigerfish submarine manages to destroy the evil alien saucer. Very scary scenes inside the saucer. Innovative electronic sound effects.

Center: “Earth vs. the Flying Saucers” (1956) – The title pretty much says it all. Impressive stop-motion animation of the flying saucers by the famous Ray Harryhausen. I could never quite figure out if the aliens only retaliated after being attacked by the Earth military, or if they were evil from the beginning.

Right: “Invaders from Mars” (1953) – A young boy thinks he sees a flying saucer land in the sand pit in a field behind his house. This movie scared the heck out of me and I had nightmares about the path up the hill for years. Brilliantly directed by William Cameron Menzies.


Posted in Machine Learning | Leave a comment

Testing From-Scratch Pseudo-Inverse via SVD-Jacobi versus via QR-Householder Using Python

Computing the pseudo-inverse of a matrix is difficult. The pseudo-inverse is used in several machine learning algorithms, such as closed form training for linear models.

There are several types of pseudo-inverse. The most common is Moore-Penrose. There are several algorithms for Moore-Penrose. Two common techniques are via SVD and via QR. There are several algorithms for SVD (Jacobi, Golub-Reinsch, others), and there are several algorithms for QR (Householder, Gram-Schmidt, others). In short, there are dozens of algorithms and variations that can be used to compute a pseudo-inverse.

I implemented the pseudo-inverse from scratch in Python using SVD-Jacobi and using QR-Householder. Let me note that both algorithms are extremely complex, especially SVD-Jacobi. I ran some tests with the two from-scratch implementations and compared them with the built-in numpy.linalg.pinv() function. The results:

Begin pseudo-inverse SVD vs. QR experiment

Start testing my_pinv_svd()
trial 40 FAIL (93, 88)
trial 84 FAIL (98, 94)
trial 104 FAIL (94, 82)
trial 145 FAIL (91, 90)
trial 154 FAIL (92, 92)
trial 178 FAIL (99, 93)
trial 180 FAIL (96, 87)
trial 185 FAIL (99, 96)
trial 191 FAIL (86, 85)
Num pass = 191
Num fail = 9

Start testing my_pinv_qr()
Num pass = 200
Num fail = 0

Start testing np.linalg.pinv()
Num pass = 200
Num fail = 0

End experiment

C:\Data\Junk\CSharp\PseudoInvSVDvsPseudoInvQR>

For each technique, I ran 200 tests. Each test was a matrix with a random size between 2×2 and 100×100, and with random values between -10.0 and +10.0. The from-scratch QR-Householder and built-in linalg.pinv() versions passed all 200 tests, but the SVD-Jacobi version failed for 9 large (about 90×90) matrices.
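
To be clear about what a pass means: each trial checks the defining Moore-Penrose property A * pinv(A) * A = A, using numpy's default allclose() tolerances. A stripped-down version of one trial:

# One trial, stripped down. A candidate pseudo-inverse "passes" if it
# satisfies A * pinv(A) * A = A to numpy's default allclose() tolerances.
import numpy as np

rnd = np.random.RandomState(0)
A = 20.0 * rnd.rand(8, 5) - 10.0   # random matrix, values in [-10, +10)
Apinv = np.linalg.pinv(A)          # or my_pinv_svd(A), or my_pinv_qr(A)
C = np.matmul(A, np.matmul(Apinv, A))
print("pass" if np.allclose(A, C) else "FAIL")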

Bottom line: Use the built-in linalg.pinv() function if possible. If you must implement from scratch, the QR-Householder algorithm is preferable to the SVD-Jacobi algorithm. If you must implement from scratch using SVD-Jacobi, apply it only to small matrices (say 50×50 or smaller).



I’m not sure if I gained a love of matrix algebra from my love of chess, or vice versa. One of the ways I gauge AI image generation progress is by asking for images of people playing chess. The most recent images are much better than they were a few months ago, but AI still has a way to go.

Left: This was an interesting result. The chess board only has seven ranks but otherwise the set is pretty good. I’m puzzled by the meaning of the chess player’s welcoming posture.

Center: Not very good. All the pieces are black (no white pieces), and the pieces don't make sense (two black queens, for example). And I don't like the fuzziness of the background.

Right: Apparently some sort of fusion of 1980s hairstyles, handcuff chess, and 1980s neon color schemes.


Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols. (My blog editor chokes on symbols).

# pinv_svd_vs_pinv_qr.py

import numpy as np

np.set_printoptions(precision=6, suppress=True, floatmode='fixed')

rnd = np.random.RandomState(1)

# -----------------------------------------------------------

def my_qr(A, mode):
  # mode = 'complete' or 'reduced'
  # this version defined only for m gte n
  m, n = A.shape
  k = np.minimum(m,n)
  R = A.copy()
  Q = np.eye(m)

  if m == n: end = n-1
  else: end = n
  for i in range(end):
    H = np.eye(m)

    # Householder routine follows
    a = R[i:, i]
    norm_a = np.copysign(np.linalg.norm(a), a[0])
    v = a / (a[0] + norm_a)
    v[0] = 1.0
    h = np.eye(len(a))
    V = np.matmul(v[:, None], v[None, :])
    h -= (2.0 / np.dot(v, v)) * V

    H[i:, i:] = h  # copy h into lower right
    Q = np.matmul(Q, H)
    R = np.matmul(H, R)

  # 'reduced' : returns Q, R with dims (m, k), (k, n)
  if mode == 'complete':
    return Q, R
  elif mode == 'reduced':
    return Q[0:m,0:k], R[0:k,0:n]

  return Q, R

# -----------------------------------------------------------

def mat_inv_upper_tri(U):
  n = len(U)
  result = np.eye(n)
  for k in range(n):
    for j in range(n):
      for i in range(k):
        result[j][k] -= result[j][i] * U[i][k]
      result[j][k] /= U[k][k]
  return result

# -----------------------------------------------------------

def my_pinv_qr(A):
  # A = Q*R, pinv(A) = inv(R) * trans(Q)
  Q, R = my_qr(A, 'reduced') 
  result = np.matmul(mat_inv_upper_tri(R), Q.T)
  return result

# -----------------------------------------------------------
# -----------------------------------------------------------

def my_svd(M):
  # Jacobi algorithm
  DBL_EPSILON = 1.0e-15  # approximately
  A = np.copy(M)  # working copy U
  m = len(A)
  n = len(A[0])

  Q = np.eye(n)  # working copy V
  t = np.zeros(n)  # working copy s

  # init counters
  count = 1
  sweep = 0
  sweep_max = max(5 * n, 12)  # heuristic

  tolerance = 10 * m * DBL_EPSILON  # heuristic
  # store the column error estimates in t
  for j in range(n):
    cj = A[:, j]  # get col j
    sj = np.linalg.norm(cj)
    t[j] = DBL_EPSILON * sj

  # orthogonalize A by plane rotations
  while (count "gt" 0 and sweep "lte" sweep_max):
    # initialize rotation counter
    count = n * (n - 1) / 2
    for j in range(n-1):
      for k in range(j+1, n):
        cj = A[:, j]
        ck = A[:, k]
        p = 2 * np.dot(cj, ck)
        a = np.linalg.norm(cj)
        b = np.linalg.norm(ck)

        # test for columns j,k orthogonal,
        # or dominant errors 
        abserr_a = t[j]
        abserr_b = t[k]

        q = (a * a) - (b * b)
        v = np.sqrt(p**2 + q**2)  # hypot()
 
        sorted = (a "gte" b)
        orthog = (abs(p) "lte" tolerance * (a*b))
        noisya = (a "lt" abserr_a)
        noisyb = (b "lt" abserr_b)

        if sorted and (orthog or \
          noisya or noisyb):
          count -= 1
          continue

        # calculate rotation angles
        if v == 0 or sorted == False:
          cosine = 0.0
          sine = 1.0
        else:
          cosine = np.sqrt((v + q) / (2.0 * v))
          sine = p / (2.0 * v * cosine)

        # apply rotation to A (U)
        for i in range(m):
          Aik = A[i][k]
          Aij = A[i][j]
          A[i][j] = Aij * cosine + Aik * sine
          A[i][k] = -Aij * sine + Aik * cosine

        # update singular values
        t[j] = abs(cosine) * abserr_a + \
          abs(sine) * abserr_b
        t[k] = abs(sine) * abserr_a + \
          abs(cosine) * abserr_b

        # apply rotation to Q (V)
        for i in range(n):
          Qij = Q[i][j]
          Qik = Q[i][k]
          Q[i][j] = Qij * cosine + Qik * sine
          Q[i][k] = -Qij * sine + Qik * cosine

    sweep += 1
  # while

  # compute singular values
  prev_norm = -1.0
  for j in range(n):
    column = A[:, j]  # by ref
    norm = np.linalg.norm(column)
    # determine if singular value is zero
    if norm == 0.0 or prev_norm == 0.0 or \
      (j "gt" 0 and norm "lte" tolerance * prev_norm):
      t[j] = 0.0
      for i in range(len(column)):
        column[i] = 0.0  # updates A indirectly
      prev_norm = 0.0
    else:
      t[j] = norm
      for i in range(len(column)):
        column[i] = column[i] * (1.0 / norm)
      prev_norm = norm

  if count "gt" 0:
    print("Jacobi iterations no converge")

  U = A  # mxn
  s = t
  Vh = np.transpose(Q)

  if m "lt" n:
    U = U[:, 0:m]
    s = t[0:m]
    Vh = Vh[0:m, :]

  return U, s, Vh

# -----------------------------------------------------------

def my_pinv_svd(A):
  # U,s,Vh = svd(); pinv = V * Sinv * Uh
  U, s, Vh = my_svd(A)
  Sinv = np.diag(s)
  for i in range(len(s)):
    Sinv[i][i] = 1.0 / Sinv[i][i]
  V = Vh.T
  Uh = U.T
  return np.matmul(V, np.matmul(Sinv, Uh))

# -----------------------------------------------------------
# -----------------------------------------------------------

def make_matrix(max_rows, max_cols, min_x, max_x, rnd):
  nr = rnd.randint(2, max_rows)
  nc = rnd.randint(2, max_cols)
  if nr "lt" nc:
    (nr, nc) = (nc, nr)
  A = (max_x - min_x) * rnd.rand(nr,nc) + min_x
  return A

# -----------------------------------------------------------

print("\nBegin pseudo-inverse SVD vs. QR experiment ")

print("\nStart testing my_pinv_svd() ")
n_pass = 0; n_fail = 0
rnd.seed(0)
for i in range(200):
  A = make_matrix(100, 100, -10.0, 10.0, rnd)
  Apinv = my_pinv_svd(A)
  C = np.matmul(A, np.matmul(Apinv, A))
  if np.allclose(A, C) == True:
    n_pass += 1
  else:
    print("trial " + str(i), end="")
    print(" FAIL ", end="")
    print(A.shape)
    n_fail += 1
print("Num pass = " + str(n_pass))
print("Num fail = " + str(n_fail))

print("\nStart testing my_pinv_qr() ")
n_pass = 0; n_fail = 0
rnd.seed(0)
for i in range(200):
  A = make_matrix(100, 100, -10.0, 10.0, rnd)
  Apinv = my_pinv_qr(A)
  C = np.matmul(A, np.matmul(Apinv, A))
  if np.allclose(A, C) == True:
    n_pass += 1
  else:
    print("trial " + str(i), end="")
    print(" FAIL ", end="")
    print(A.shape)
    n_fail += 1
print("Num pass = " + str(n_pass))
print("Num fail = " + str(n_fail))

print("\nStart testing np.linalg.pinv() ")
n_pass = 0; n_fail = 0
rnd.seed(0)
for i in range(200):
  A = make_matrix(100, 100, -10.0, 10.0, rnd)
  Apinv = np.linalg.pinv(A)
  C = np.matmul(A, np.matmul(Apinv, A))
  if np.allclose(A, C) == True:
    n_pass += 1
  else:
    print("trial " + str(i), end="")
    print(" FAIL ", end="")
    print(A.shape)
    n_fail += 1
print("Num pass = " + str(n_pass))
print("Num fail = " + str(n_fail))

print("\nEnd experiment ")
Posted in Machine Learning | Leave a comment

Linear Regression with Pseudo-Inverse (SVD-Jacobi) Training Using JavaScript

The goal of a machine learning regression problem is to predict a single numeric value. For example, you might want to predict the bank account balance of a person based on his annual income, age, years of education, and so on.

Linear regression is the simplest form of machine learning regression. The prediction equation has the form y’ = (w0 * x0) + (w1 * x1) + . . + b, where the wi are weight values, the xi are predictor values, and b is the bias. For example, y’ = (0.56 * income) + (-0.37 * age) + (0.04 * education) + 1.66.

Training is the process of finding the values of the weights and the bias so that predicted y values closely match the known correct y values in a set of training data.

There are four main linear regression training techniques (and dozens of less-common techniques): stochastic gradient descent, relaxed Moore-Penrose pseudo-inverse, closed-form pseudo-inverse via normal equations, and L-BFGS. Each of these four techniques has dozens of significant variations. For example, MP pseudo-inverse training can use singular value decomposition (SVD — many algorithms such as Jacobi, GKR, etc.) or QR decomposition (many algorithms such as Householder, Gram-Schmidt, Givens, etc.). In short, there are a bewildering number of different ways to train a linear regression prediction model.

Each training technique has pros and cons related to ease/difficulty of implementation and customization, ability to deal with large or small datasets, accuracy, performance, sensitivity to data normalization, ability to handle categorical data, and more. If there was one best training technique, it would be the only technique used.

One evening, I was sitting in my living room, waiting for the rain to stop (it never did) so I could walk my two dogs. While I was waiting, I figured I’d implement linear regression, using relaxed MP pseudo-inverse (via SVD-Jacobi) training, using JavaScript.

For my demo, I used one of my standard synthetic datasets. The data looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
. . . 

The first five values on each line are the predictors (“features”) and the last value is the target y to predict. The data was generated by a 5-10-1 neural network with random weights and biases. There are 200 training items and 40 test items.

The equation for training is:

w = pinv(DX) * y

The w is a vector that holds the weights and the bias you want, pinv() is the relaxed Moore-Penrose pseudo-inverse function, DX is the design matrix derived from the X training data, and y is a vector of the known target values to predict.

The design matrix is the X predictor values with a leading column of 1.0s added. The leading column of 1.0s deals with the bias term.
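
Expressed in numpy terms (just an illustration of the training equation, not the JavaScript demo code below):

# Illustration of w = pinv(DX) * y using numpy; not the JS demo program.
# Run on the full 200-item training matrix, this reproduces the bias and
# weights shown in the demo output below.
import numpy as np

X = np.array([[-0.1660,  0.4406, -0.9998, -0.3953, -0.7065],
              [ 0.0776, -0.1616,  0.3704, -0.5911,  0.7562]])  # etc.
y = np.array([0.4840, 0.1568])                                 # etc.

DX = np.column_stack([np.ones(len(X)), X])  # design matrix: prepend 1.0s
w = np.matmul(np.linalg.pinv(DX), y)        # w[0] = bias, w[1:] = weights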

The output of my demo:

Linear regression with pseudo-inverse (SVD-Jacobi)
 training using JavaScript

Loading synthetic train (200) and 
 test (40) data from file

First three train X:
 -0.1660   0.4406  -0.9998  -0.3953  -0.7065
  0.0776  -0.1616   0.3704  -0.5911   0.7562
 -0.9452   0.3409  -0.1654   0.1174  -0.7192

First three train y:
   0.4840
   0.1568
   0.8054

Creating and training model
Done

Model weights/coefficients:
 -0.2656   0.0333  -0.0454   0.0358  -0.1146
Model bias/intercept: 0.3619

Evaluating model

Train acc (within 0.10) = 0.4600
Test acc (within 0.10) = 0.6500

Train MSE = 0.0026
Test MSE = 0.0020

Predicting for x =
  -0.1660   0.4406  -0.9998 -0.3953 -0.7065
Predicted y = 0.5329

End demo

The demo output matches the output from several other training techniques, which validates that the results are correct. Good fun on a rainy evening in the Pacific Northwest.



I've always been fascinated by pinball machines. The "Saratoga" (1948) by Williams featured the first electrically-powered "pop" bumpers. Before this innovation, bumpers were spring-powered and barely moved the ball when struck. Notice that the two flippers are reversed from the modern geometry (which appeared two years later in a pair of machines by Gottlieb). Also, there is no reel scoring, no in-lanes, and no drop targets; those features appeared in later years.


Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols (my blog editor chokes on symbols).

// linear_regression_pinverse_svd.js
// linear regression, pseudo-inverse SVD-Jacobi training
// node.js

let FS = require("fs")  // for loadTxt()

// ----------------------------------------------------------

class LinearRegressor
{
  constructor(seed)
  {
    this.seed = seed + 0.5;  // virtual RNG, not used
    this.weights = null;     // allocated in train()
    this.bias = 0.0;         // supplied in train()
  }

  // --------------------------------------------------------
  // methods: predict(), train(), accuracy(), mse()
  // --------------------------------------------------------

  predict(x)
  {
    let sum = 0.0;
    for (let i = 0; i "lt" x.length; ++i) {
      sum += x[i] * this.weights[i];
    }
    sum += this.bias;
    return sum;
  }

  // --------------------------------------------------------

  train(trainX, trainY)
  {
    // w = pinv(DX) * y
    let dim = trainX[0].length;         // number predictors
    this.weights = this.vecMake(dim, 0.0);  // allocate wts

    let DX = this.matToDesign(trainX);  // design matrix
    let Xpinv = this.matPinv(DX);       // pinv of design X
    let biasAndWts = this.matVecProd(Xpinv, trainY);

    this.bias = biasAndWts[0];
    for (let i = 1; i "lt" biasAndWts.length; ++i)
      this.weights[i-1] = biasAndWts[i];
    return;
  }

  // --------------------------------------------------------

  accuracy(dataX, dataY, pctClose)
  {
    let nCorrect = 0; let nWrong = 0;
    let N = dataX.length;
    
    for (let i = 0; i "lt" N; ++i) {
      let x = dataX[i];
      let actualY = dataY[i];
      let predY = this.predict(x);
      if (Math.abs(predY - actualY) "lt" 
        Math.abs(pctClose * actualY)) {
        ++nCorrect;
      }
      else {
        ++nWrong;
      }
    }
    return (nCorrect * 1.0) / (nCorrect + nWrong);
  }

  // --------------------------------------------------------

  mse(dataX, dataY)
  {
    let N = dataX.length;
    let sum = 0.0;
    for (let i = 0; i "lt" N; ++i) {
      let x = dataX[i];
      let actualY = dataY[i];
      let predY = this.predict(x);
      sum += (actualY - predY) * (actualY - predY);
    }
    return sum / N;
  }

  // --------------------------------------------------------
  // primary helpers for train()
  // matVecProd, matPinv, matToDesign, svdJacobi
  // --------------------------------------------------------

  matVecProd(M, v)
  {
    // return a regular 1D vector
    let nRows = M.length;
    let nCols = M[0].length;
    let n = v.length;  // assume nCols = n

    let result = this.vecMake(nRows, 0.0);
    for (let i = 0; i "lt" nRows; ++i)
      for (let k = 0; k "lt" nCols; ++k)
        result[i] += M[i][k] * v[k];

    return result;
  }

  // --------------------------------------------------------

  matToDesign(M)
  {
    // add a leading column of 1.0s to M
    let nRows = M.length;
    let nCols = M[0].length;
    let result = this.matMake(nRows, nCols+1, 1.0);

    for (let i = 0; i "lt" nRows; ++i) {
      for (let j = 1; j "lt" nCols+1; ++j) {  // note 1s
        result[i][j] = M[i][j-1];
      }
    }
    return result;
  }

  // --------------------------------------------------------

  matPinv(M)  // pseudo-inverse of matrix M, using SVD-Jacobi
  {
    let decomp = this.svdJacobi(M); // SVD, Jacobi algorithm
    let U = decomp[0];
    let s = decomp[1]; // singular values
    let Vh = decomp[2];

    let Sinv = this.matMake(s.length, s.length, 0.0);
    for (let i = 0; i "lt" Sinv.length; ++i)
      Sinv[i][i] = 1.0 / s[i];  // SVD fails if any s is 0.0

    let V = this.matTranspose(Vh);
    let Uh = this.matTranspose(U);
    let tmp = this.matMult(Sinv, Uh);
    let result = this.matMult(V, tmp);
    return result;  
  }

  // --------------------------------------------------------

  svdJacobi(M)
  {
    // A = U * S * Vh
    // returns an array where result[0] = U matrix,
    //  result[1] = s vector, result[2] = Vh matrix

    // see github.com/ampl/gsl/blob/master/linalg/svd.c
    // numpy svd() full_matrices = False version

    let DBL_EPSILON = 1.0e-15;

    let A = this.matCopy(M); // working U
    let m = A.length;
    let n = A[0].length;
    let Q = this.matIdentity(n); // working V
    let t = this.vecMake(n, 0.0);  // working s

    // initialize the rotation and sweep counters
    let count = 1;
    let sweep = 0;

    let tolerance = 10 * m * DBL_EPSILON; // heuristic

    // always do at least 12 sweeps
    let sweepmax = Math.max(5 * n, 12); // heuristic

    // store the column error estimates in t, for use
    // during orthogonalization

    for (let j = 0; j "lt" n; ++j) {
      let cj = this.matGetColumn(A, j);
      let sj = this.vecNorm(cj);
      t[j] = DBL_EPSILON * sj;
    }

    // orthogonalize A by plane rotations
    while (count "gt" 0 && sweep "lte" sweepmax) {
      // initialize rotation counter
      count = n * (n - 1) / 2;

      for (let j = 0; j "lt" n - 1; ++j) {
        for (let k = j + 1; k "lt" n; ++k) {
          let cosine = 0.0; let sine = 0.0;

          let cj = this.matGetColumn(A, j);
          let ck = this.matGetColumn(A, k);

          let p = 2.0 * this.vecDot(cj, ck);
          let a = this.vecNorm(cj);
          let b = this.vecNorm(ck);

          let q = a * a - b * b;
          let v = this.hypot(p, q);

          // test for columns j,k orthogonal,
          // or dominant errors 
          let abserr_a = t[j];
          let abserr_b = t[k];

          let sorted = (a "gte" b);
          let orthog = (Math.abs(p) "lte" 
            tolerance * (a * b));
          let noisya = (a "lt" abserr_a);
          let noisyb = (b "lt" abserr_b);

          if (sorted && (orthog || noisya ||
            noisyb)) {
            --count;
            continue;
          }

          // calculate rotation angles
          if (v == 0 || !sorted) {
            cosine = 0.0;
            sine = 1.0;
          }
          else {
            cosine = Math.sqrt((v + q) / (2.0 * v));
            sine = p / (2.0 * v * cosine);
          }

          // apply rotation to A (U)
          for (let i = 0; i "lt" m; ++i) {
            let Aik = A[i][k];
            let Aij = A[i][j];
            A[i][j] = Aij * cosine + Aik * sine;
            A[i][k] = -Aij * sine + Aik * cosine;
          }

          // update singular values
          t[j] = Math.abs(cosine) * abserr_a +
            Math.abs(sine) * abserr_b;
          t[k] = Math.abs(sine) * abserr_a +
            Math.abs(cosine) * abserr_b;

          // apply rotation to Q (Vh)
          for (let i = 0; i "lt" n; ++i) {
            let Qij = Q[i][j];
            let Qik = Q[i][k];
            Q[i][j] = Qij * cosine + Qik * sine;
            Q[i][k] = -Qij * sine + Qik * cosine;
          } // i
        } // k
      } // j

      ++sweep;
    } // while

    //  compute singular values
    let prevNorm = -1.0;

    for (let j = 0; j "lt" n; ++j) {
      let column = this.matGetColumn(A, j);
      let norm = this.vecNorm(column);

      // determine if singular value is zero
      if (norm == 0.0 || prevNorm == 0.0
        || (j "gt" 0 && norm "lte" tolerance * prevNorm)) {
        t[j] = 0.0;
        for (let i = 0; i "lt" m; ++i)
          A[i][j] = 0.0;
        prevNorm = 0.0;
      }
      else {
        t[j] = norm;
        for (let i = 0; i "lt" m; ++i)
          A[i][j] = A[i][j] * 1.0 / norm;
        prevNorm = norm;
      }
    }

    if (count "gt" 0) {
      console.log("Jacobi iterations did not" +
        " converge");
    }

    let U = A;
    let s = t;
    let Vh = this.matTranspose(Q);

    if (m "lt" n) {
      U = this.matExtractFirstColumns(U, m);
      s = this.vecExtractFirst(s, m);
      Vh = this.matExtractFirstRows(Vh, m);
    } 

    let result = [];
    result[0] = U;
    result[1] = s;
    result[2] = Vh;

    return result;
  } // svdJacobi()

  // === minor helper functions =============================
  //
  // matMake, matCopy, matIdentity, matGetColumn,
  // matExtractFirstColumns, matExtractFirstRows,
  // matTranspose, matMult, vecMake, vecNorm, vecDot,
  // hypot, matShow, vecShow, matAreEqual, matLoad
  //
  // all of these could be declared 'static'
  // ========================================================

  matMake(nRows, nCols, val)
  {
    let result = [];
    for (let i = 0; i "lt" nRows; ++i) {
      result[i] = [];
      for (let j = 0; j "lt" nCols; ++j) {
        result[i][j] = val;
      }
    }
    return result;
  }

  // --------------------------------------------------------

  matCopy(M)
  {
    let nRows = M.length;
    let nCols = M[0].length;
    let result = this.matMake(nRows, nCols, 0.0);
    for (let i = 0; i "lt" nRows; ++i)
      for (let j = 0; j "lt" nCols; ++j)
        result[i][j] = M[i][j];
    return result;
  }

  // --------------------------------------------------------

  matIdentity(n)
  {
    let result = this.matMake(n, n, 0.0);
    for (let i = 0; i "lt" n; ++i)
      result[i][i] = 1.0;
    return result;
  }

  // --------------------------------------------------------

  matGetColumn(M, j)
  {
    let nRows = M.length;
    let result = this.vecMake(nRows, 0.0);
    for (let i = 0; i "lt" nRows; ++i)
      result[i] = M[i][j];
    return result;
  }

  // --------------------------------------------------------

  matExtractFirstColumns(M, nc)
  {
    let nRows = M.length;
    let result = this.matMake(nRows, nc, 0.0);
    for (let i = 0; i "lt" nRows; ++i)
      for (let j = 0; j "lt" nc; ++j)
        result[i][j] = M[i][j];
    return result;
  }

  // --------------------------------------------------------

  matExtractFirstRows(M, nr)
  {
    let nCols = M[0].length;
    let result = this.matMake(nr, nCols, 0.0);
    for (let i = 0; i "lt" nr; ++i)
      for (let j = 0; j "lt" nCols; ++j)
        result[i][j] = M[i][j];
    return result;
  }

  // --------------------------------------------------------

  matTranspose(M)
  {
    let nRows = M.length;
    let nCols = M[0].length;
    let result = this.matMake(nCols, nRows, 0.0);
    for (let i = 0; i "lt" nRows; ++i)
      for (let j = 0; j "lt" nCols; ++j)
        result[j][i] = M[i][j];
    return result;  
  }

  // --------------------------------------------------------

  matMult(matA, matB)
  {
    let aRows = matA.length;
    let aCols = matA[0].length;
    let bRows = matB.length;
    let bCols = matB[0].length;

    let result = this.matMake(aRows, bCols, 0.0);

    for (let i = 0; i "lt" aRows; ++i)
      for (let j = 0; j "lt" bCols; ++j)
        for (let k = 0; k "lt" aCols; ++k) 
          result[i][j] += matA[i][k] * matB[k][j];

    return result;
  }

  // --------------------------------------------------------

  vecMake(n, val)
  {
    let result = [];
    for (let i = 0; i "lt" n; ++i) {
      result[i] = val;
    }
    return result;
  }

  // --------------------------------------------------------

  vecNorm(vec)
  {
    let sum = 0.0;
    let n = vec.length;
    for (let i = 0; i "lt" n; ++i)
      sum += vec[i] * vec[i];
    return Math.sqrt(sum);
  }

  // --------------------------------------------------------

  vecDot(v1, v2)
  {
    let n = v1.length;  // assume len(v1) == len(v2)
    let sum = 0.0;
    for (let i = 0; i "lt" n; ++i)
      sum += v1[i] * v2[i];
    return sum;
  }

  // --------------------------------------------------------

  hypot(a, b)
  {
    // wacky sqrt(a^2 + b^2) used by numerical implementations
    let xabs = Math.abs(a); let yabs = Math.abs(b);
    let min, max;

    if (xabs "lt" yabs) {
      min = xabs; max = yabs;
    }
    else {
      min = yabs; max = xabs;
    }

    if (min == 0)
      return max;
    else {
      let u = min / max;
      return max * Math.sqrt(1 + u * u);
    }
  }

  // --------------------------------------------------------
  // helpers for debugging: matShow, vecShow
  // --------------------------------------------------------

  matShow(M, dec, wid) // for debugging
  {
    let small = 1.0 / Math.pow(10, dec);
    let nr = M.length;
    let nc = M[0].length;
    for (let i = 0; i "lt" nr; ++i) {
      for (let j = 0; j "lt" nc; ++j) {
        let x = M[i][j];
        if (Math.abs(x) "lt" small) x = 0.0;
        let xx = x.toFixed(dec);
        let s = xx.toString().padStart(wid, ' ');
        process.stdout.write(s);
        process.stdout.write(" ");
      }
      process.stdout.write("\n");
    }
  }

  // --------------------------------------------------------

  vecShow(vec, dec, wid)  // for debugging
  {
    let small = 1.0 / Math.pow(10, dec);
    for (let i = 0; i "lt" vec.length; ++i) {
      let x = vec[i];
      if (Math.abs(x) "lt" small) x = 0.0;  // avoid -0.00
      let xx = x.toFixed(dec);
      let s = xx.toString().padStart(wid, ' ');
      process.stdout.write(s);
      process.stdout.write(" ");
    }
    process.stdout.write("\n");
  }

} // end class LinearRegressor

// ----------------------------------------------------------
// helper functions for main()
// matShow, vecShow, matLoad, matToVec
// ----------------------------------------------------------

function vecShow(vec, dec, wid, nl)
{
  let small = 1.0 / Math.pow(10, dec);
  for (let i = 0; i "lt" vec.length; ++i) {
    let x = vec[i];
    if (Math.abs(x) "lt" small) x = 0.0;  // avoid -0.00
    let xx = x.toFixed(dec);
    let s = xx.toString().padStart(wid, ' ');
    process.stdout.write(s);
    process.stdout.write(" ");
  }

  if (nl == true)
    process.stdout.write("\n");
}

// ----------------------------------------------------------

function matShow(A, dec, wid)
{
  let small = 1.0 / Math.pow(10, dec);
  let nr = A.length;
  let nc = A[0].length;
  for (let i = 0; i "lt" nr; ++i) {
    for (let j = 0; j "lt" nc; ++j) {
      let x = A[i][j];
      if (Math.abs(x) "lt" small) x = 0.0;
      let xx = x.toFixed(dec);
      let s = xx.toString().padStart(wid, ' ');
      process.stdout.write(s);
      process.stdout.write(" ");
    }
    process.stdout.write("\n");
  }
}

// ----------------------------------------------------------

function matToVec(M)
{
  let nr = M.length;
  let nc = M[0].length;
  let result = [];
  for (let i = 0; i "lt" nr*nc; ++i) {  // vecMake(r*c, 0.0);
    result[i] = 0.0;
  }
  let k = 0;
  for (let i = 0; i "lt" nr; ++i) {
    for (let j = 0; j "lt" nc; ++j) {
      result[k++] = M[i][j];
    }
  }
  return result;
}

// ----------------------------------------------------------

function matLoad(fn, delimit, usecols, comment)
{
  // efficient but mildly complicated
  let all = FS.readFileSync(fn, "utf8");  // giant string
  all = all.trim();  // strip final crlf in file
  let lines = all.split("\n");  // array of lines

  // count number non-comment lines
  let nRows = 0;
  for (let i = 0; i "lt" lines.length; ++i) {
    if (!lines[i].startsWith(comment))
      ++nRows;
  }
  let nCols = usecols.length;
  // let result = matMake(nRows, nCols, 0.0); 
  let result = [];
  for (let i = 0; i "lt" nRows; ++i) {
    result[i] = [];
    for (let j = 0; j "lt" nCols; ++j) {
      result[i][j] = 0.0;
    }
  }
 
  let r = 0;  // into lines
  let i = 0;  // into result[][]
  while (r "lt" lines.length) {
    if (lines[r].startsWith(comment)) {
      ++r;  // next row
    }
    else {
      let tokens = lines[r].split(delimit);
      for (let j = 0; j "lt" nCols; ++j) {
        result[i][j] = parseFloat(tokens[usecols[j]]);
      }
      ++r;
      ++i;
    }
  }

  return result;
}

// ----------------------------------------------------------

function main()
{
  console.log("\nLinear regression with " +
    "pseudo-inverse (SVD-Jacobi) using JavaScript ");

  // 1. load data
  console.log("\nLoading synthetic train (200) and" +
    " test (40) data from file ");

  let trainFile = ".\\Data\\synthetic_train_200.txt";
  let trainX = matLoad(trainFile, ",", [0,1,2,3,4], "#");
  let trainY = matLoad(trainFile, ",", [5], "#");
  trainY = matToVec(trainY);
  
  let testFile = ".\\Data\\synthetic_test_40.txt";
  let testX = matLoad(testFile, ",", [0,1,2,3,4], "#");
  let testY = matLoad(testFile, ",", [5], "#");
  testY = matToVec(testY);

  console.log("\nFirst three train X: ");
  for (let i = 0; i "lt" 3; ++i)
    vecShow(trainX[i], 4, 8, true);

  console.log("\nFirst three train y: ");
  for (let i = 0; i "lt" 3; ++i)
    console.log(trainY[i].toFixed(4).toString().
    padStart(9, ' '));

  // 2. create and train linear regression model
  let seed = 0;  // not used this version
  console.log("\nCreating and training model ");
  let model = new LinearRegressor(seed);
  model.train(trainX, trainY);
  console.log("Done ");

  console.log("\nModel weights/coefficients: ");
  vecShow(model.weights, 4, 8, true);
  console.log("Model bias/intercept: " + 
    model.bias.toFixed(4).toString());

  // 3. evaluate
  console.log("\nEvaluating model ");
  let trainAcc = model.accuracy(trainX, trainY, 0.10);
  let testAcc = model.accuracy(testX, testY, 0.10);

  console.log("\nTrain acc (within 0.10) = " +
    trainAcc.toFixed(4).toString());
  console.log("Test acc (within 0.10) = " +
    testAcc.toFixed(4).toString());

  let trainMSE = model.mse(trainX, trainY);
  let testMSE = model.mse(testX, testY);

  console.log("\nTrain MSE = " +
    trainMSE.toFixed(4).toString());
  console.log("Test MSE = " +
    testMSE.toFixed(4).toString());

  // 4. use model
  let x = trainX[0];
  console.log("\nPredicting for x = ");
  vecShow(x, 4, 9, true);  // add newline

  let predY = model.predict(x);
  console.log("Predicted y = " + 
    predY.toFixed(4).toString());

  console.log("\nEnd demo");
}

main();
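
To run the demo, main() expects the training and test data listed below to be saved as synthetic_train_200.txt and synthetic_test_40.txt in a Data subdirectory of the current working directory (those are the hard-coded, Windows-style relative paths). With node.js installed, the program is launched in the usual way:

node linear_regression_pinverse_svd.js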

Training data:

# synthetic_train_200.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399
-0.4858, -0.7782, -0.6141, -0.0008,  0.4572,  0.4197
 0.7033, -0.1683,  0.2334, -0.5327, -0.7961,  0.1776
 0.0317, -0.0457, -0.6947,  0.2436,  0.0880,  0.3345
 0.5031, -0.5559,  0.0387,  0.5706, -0.9553,  0.3107
-0.3513,  0.7458,  0.6894,  0.0769,  0.7332,  0.3170
 0.2205,  0.5992, -0.9309,  0.5405,  0.4635,  0.3532
-0.4806, -0.4859,  0.2646, -0.3094,  0.5932,  0.3202
 0.9809, -0.3995, -0.7140,  0.8026,  0.0831,  0.1600
 0.9495,  0.2732,  0.9878,  0.0921,  0.0529,  0.1289
-0.9476, -0.6792,  0.4913, -0.9392, -0.2669,  0.5966
 0.7247,  0.3854,  0.3819, -0.6227, -0.1162,  0.1550
-0.5922, -0.5045, -0.4757,  0.5003, -0.0860,  0.5863
-0.8861,  0.0170, -0.5761,  0.5972, -0.4053,  0.7301
 0.6877, -0.2380,  0.4997,  0.0223,  0.0819,  0.1404
 0.9189,  0.6079, -0.9354,  0.4188, -0.0700,  0.1907
-0.1428, -0.7820,  0.2676,  0.6059,  0.3936,  0.2790
 0.5324, -0.3151,  0.6917, -0.1425,  0.6480,  0.1071
-0.8432, -0.9633, -0.8666, -0.0828, -0.7733,  0.7784
-0.9444,  0.5097, -0.2103,  0.4939, -0.0952,  0.6787
-0.0520,  0.6063, -0.1952,  0.8094, -0.9259,  0.4836
 0.5477, -0.7487,  0.2370, -0.9793,  0.0773,  0.1241
 0.2450,  0.8116,  0.9799,  0.4222,  0.4636,  0.2355
 0.8186, -0.1983, -0.5003, -0.6531, -0.7611,  0.1511
-0.4714,  0.6382, -0.3788,  0.9648, -0.4667,  0.5950
 0.0673, -0.3711,  0.8215, -0.2669, -0.1328,  0.2677
-0.9381,  0.4338,  0.7820, -0.9454,  0.0441,  0.5518
-0.3480,  0.7190,  0.1170,  0.3805, -0.0943,  0.4724
-0.9813,  0.1535, -0.3771,  0.0345,  0.8328,  0.5438
-0.1471, -0.5052, -0.2574,  0.8637,  0.8737,  0.3042
-0.5454, -0.3712, -0.6505,  0.2142, -0.1728,  0.5783
 0.6327, -0.6297,  0.4038, -0.5193,  0.1484,  0.1153
-0.5424,  0.3282, -0.0055,  0.0380, -0.6506,  0.6613
 0.1414,  0.9935,  0.6337,  0.1887,  0.9520,  0.2540
-0.9351, -0.8128, -0.8693, -0.0965, -0.2491,  0.7353
 0.9507, -0.6640,  0.9456,  0.5349,  0.6485,  0.1059
-0.0462, -0.9737, -0.2940, -0.0159,  0.4602,  0.2606
-0.0627, -0.0852, -0.7247, -0.9782,  0.5166,  0.2977
 0.0478,  0.5098, -0.0723, -0.7504, -0.3750,  0.3335
 0.0090,  0.3477,  0.5403, -0.7393, -0.9542,  0.4415
-0.9748,  0.3449,  0.3736, -0.1015,  0.8296,  0.4358
 0.2887, -0.9895, -0.0311,  0.7186,  0.6608,  0.2057
 0.1570, -0.4518,  0.1211,  0.3435, -0.2951,  0.3244
 0.7117, -0.6099,  0.4946, -0.4208,  0.5476,  0.1096
-0.2929, -0.5726,  0.5346, -0.3827,  0.4665,  0.2465
 0.4889, -0.5572, -0.5718, -0.6021, -0.7150,  0.2163
-0.7782,  0.3491,  0.5996, -0.8389, -0.5366,  0.6516
-0.5847,  0.8347,  0.4226,  0.1078, -0.3910,  0.6134
 0.8469,  0.4121, -0.0439, -0.7476,  0.9521,  0.1571
-0.6803, -0.5948, -0.1376, -0.1916, -0.7065,  0.7156
 0.2878,  0.5086, -0.5785,  0.2019,  0.4979,  0.2980
 0.2764,  0.1943, -0.4090,  0.4632,  0.8906,  0.2960
-0.8877,  0.6705, -0.6155, -0.2098, -0.3998,  0.7107
-0.8398,  0.8093, -0.2597,  0.0614, -0.0118,  0.6502
-0.8476,  0.0158, -0.4769, -0.2859, -0.7839,  0.7715
 0.5751, -0.7868,  0.9714, -0.6457,  0.1448,  0.1175
 0.4802, -0.7001,  0.1022, -0.5668,  0.5184,  0.1090
 0.4458, -0.6469,  0.7239, -0.9604,  0.7205,  0.0779
 0.5175,  0.4339,  0.9747, -0.4438, -0.9924,  0.2879
 0.8678,  0.7158,  0.4577,  0.0334,  0.4139,  0.1678
 0.5406,  0.5012,  0.2264, -0.1963,  0.3946,  0.2088
-0.9938,  0.5498,  0.7928, -0.5214, -0.7585,  0.7687
 0.7661,  0.0863, -0.4266, -0.7233, -0.4197,  0.1466
 0.2277, -0.3517, -0.0853, -0.1118,  0.6563,  0.1767
 0.3499, -0.5570, -0.0655, -0.3705,  0.2537,  0.1632
 0.7547, -0.1046,  0.5689, -0.0861,  0.3125,  0.1257
 0.8186,  0.2110,  0.5335,  0.0094, -0.0039,  0.1391
 0.6858, -0.8644,  0.1465,  0.8855,  0.0357,  0.1845
-0.4967,  0.4015,  0.0805,  0.8977,  0.2487,  0.4663
 0.6760, -0.9841,  0.9787, -0.8446, -0.3557,  0.1509
-0.1203, -0.4885,  0.6054, -0.0443, -0.7313,  0.4854
 0.8557,  0.7919, -0.0169,  0.7134, -0.1628,  0.2002
 0.0115, -0.6209,  0.9300, -0.4116, -0.7931,  0.4052
-0.7114, -0.9718,  0.4319,  0.1290,  0.5892,  0.3661
 0.3915,  0.5557, -0.1870,  0.2955, -0.6404,  0.2954
-0.3564, -0.6548, -0.1827, -0.5172, -0.1862,  0.4622
 0.2392, -0.4959,  0.5857, -0.1341, -0.2850,  0.2470
-0.3394,  0.3947, -0.4627,  0.6166, -0.4094,  0.5325
 0.7107,  0.7768, -0.6312,  0.1707,  0.7964,  0.2757
-0.1078,  0.8437, -0.4420,  0.2177,  0.3649,  0.4028
-0.3139,  0.5595, -0.6505, -0.3161, -0.7108,  0.5546
 0.4335,  0.3986,  0.3770, -0.4932,  0.3847,  0.1810
-0.2562, -0.2894, -0.8847,  0.2633,  0.4146,  0.4036
 0.2272,  0.2966, -0.6601, -0.7011,  0.0284,  0.2778
-0.0743, -0.1421, -0.0054, -0.6770, -0.3151,  0.3597
-0.4762,  0.6891,  0.6007, -0.1467,  0.2140,  0.4266
-0.4061,  0.7193,  0.3432,  0.2669, -0.7505,  0.6147
-0.0588,  0.9731,  0.8966,  0.2902, -0.6966,  0.4955
-0.0627, -0.1439,  0.1985,  0.6999,  0.5022,  0.3077
 0.1587,  0.8494, -0.8705,  0.9827, -0.8940,  0.4263
-0.7850,  0.2473, -0.9040, -0.4308, -0.8779,  0.7199
 0.4070,  0.3369, -0.2428, -0.6236,  0.4940,  0.2215
-0.0242,  0.0513, -0.9430,  0.2885, -0.2987,  0.3947
-0.5416, -0.1322, -0.2351, -0.0604,  0.9590,  0.3683
 0.1055,  0.7783, -0.2901, -0.5090,  0.8220,  0.2984
-0.9129,  0.9015,  0.1128, -0.2473,  0.9901,  0.4776
-0.9378,  0.1424, -0.6391,  0.2619,  0.9618,  0.5368
 0.7498, -0.0963,  0.4169,  0.5549, -0.0103,  0.1614
-0.2612, -0.7156,  0.4538, -0.0460, -0.1022,  0.3717
 0.7720,  0.0552, -0.1818, -0.4622, -0.8560,  0.1685
-0.4177,  0.0070,  0.9319, -0.7812,  0.3461,  0.3052
-0.0001,  0.5542, -0.7128, -0.8336, -0.2016,  0.3803
 0.5356, -0.4194, -0.5662, -0.9666, -0.2027,  0.1776
-0.2378,  0.3187, -0.8582, -0.6948, -0.9668,  0.5474
-0.1947, -0.3579,  0.1158,  0.9869,  0.6690,  0.2992
 0.3992,  0.8365, -0.9205, -0.8593, -0.0520,  0.3154
-0.0209,  0.0793,  0.7905, -0.1067,  0.7541,  0.1864
-0.4928, -0.4524, -0.3433,  0.0951, -0.5597,  0.6261
-0.8118,  0.7404, -0.5263, -0.2280,  0.1431,  0.6349
 0.0516, -0.8480,  0.7483,  0.9023,  0.6250,  0.1959
-0.3212,  0.1093,  0.9488, -0.3766,  0.3376,  0.2735
-0.3481,  0.5490, -0.3484,  0.7797,  0.5034,  0.4379
-0.5785, -0.9170, -0.3563, -0.9258,  0.3877,  0.4121
 0.3407, -0.1391,  0.5356,  0.0720, -0.9203,  0.3458
-0.3287, -0.8954,  0.2102,  0.0241,  0.2349,  0.3247
-0.1353,  0.6954, -0.0919, -0.9692,  0.7461,  0.3338
 0.9036, -0.8982, -0.5299, -0.8733, -0.1567,  0.1187
 0.7277, -0.8368, -0.0538, -0.7489,  0.5458,  0.0830
 0.9049,  0.8878,  0.2279,  0.9470, -0.3103,  0.2194
 0.7957, -0.1308, -0.5284,  0.8817,  0.3684,  0.2172
 0.4647, -0.4931,  0.2010,  0.6292, -0.8918,  0.3371
-0.7390,  0.6849,  0.2367,  0.0626, -0.5034,  0.7039
-0.1567, -0.8711,  0.7940, -0.5932,  0.6525,  0.1710
 0.7635, -0.0265,  0.1969,  0.0545,  0.2496,  0.1445
 0.7675,  0.1354, -0.7698, -0.5460,  0.1920,  0.1728
-0.5211, -0.7372, -0.6763,  0.6897,  0.2044,  0.5217
 0.1913,  0.1980,  0.2314, -0.8816,  0.5006,  0.1998
 0.8964,  0.0694, -0.6149,  0.5059, -0.9854,  0.1825
 0.1767,  0.7104,  0.2093,  0.6452,  0.7590,  0.2832
-0.3580, -0.7541,  0.4426, -0.1193, -0.7465,  0.5657
-0.5996,  0.5766, -0.9758, -0.3933, -0.9572,  0.6800
 0.9950,  0.1641, -0.4132,  0.8579,  0.0142,  0.2003
-0.4717, -0.3894, -0.2567, -0.5111,  0.1691,  0.4266
 0.3917, -0.8561,  0.9422,  0.5061,  0.6123,  0.1212
-0.0366, -0.1087,  0.3449, -0.1025,  0.4086,  0.2475
 0.3633,  0.3943,  0.2372, -0.6980,  0.5216,  0.1925
-0.5325, -0.6466, -0.2178, -0.3589,  0.6310,  0.3568
 0.2271,  0.5200, -0.1447, -0.8011, -0.7699,  0.3128
 0.6415,  0.1993,  0.3777, -0.0178, -0.8237,  0.2181
-0.5298, -0.0768, -0.6028, -0.9490,  0.4588,  0.4356
 0.6870, -0.1431,  0.7294,  0.3141,  0.1621,  0.1632
-0.5985,  0.0591,  0.7889, -0.3900,  0.7419,  0.2945
 0.3661,  0.7984, -0.8486,  0.7572, -0.6183,  0.3449
 0.6995,  0.3342, -0.3113, -0.6972,  0.2707,  0.1712
 0.2565,  0.9126,  0.1798, -0.6043, -0.1413,  0.2893
-0.3265,  0.9839, -0.2395,  0.9854,  0.0376,  0.4770
 0.2690, -0.1722,  0.9818,  0.8599, -0.7015,  0.3954
-0.2102, -0.0768,  0.1219,  0.5607, -0.0256,  0.3949
 0.8216, -0.9555,  0.6422, -0.6231,  0.3715,  0.0801
-0.2896,  0.9484, -0.7545, -0.6249,  0.7789,  0.4370
-0.9985, -0.5448, -0.7092, -0.5931,  0.7926,  0.5402

Test data:

# synthetic_test_40.txt
#
 0.7462,  0.4006, -0.0590,  0.6543, -0.0083,  0.1935
 0.8495, -0.2260, -0.0142, -0.4911,  0.7699,  0.1078
-0.2335, -0.4049,  0.4352, -0.6183, -0.7636,  0.5088
 0.1810, -0.5142,  0.2465,  0.2767, -0.3449,  0.3136
-0.8650,  0.7611, -0.0801,  0.5277, -0.4922,  0.7140
-0.2358, -0.7466, -0.5115, -0.8413, -0.3943,  0.4533
 0.4834,  0.2300,  0.3448, -0.9832,  0.3568,  0.1360
-0.6502, -0.6300,  0.6885,  0.9652,  0.8275,  0.3046
-0.3053,  0.5604,  0.0929,  0.6329, -0.0325,  0.4756
-0.7995,  0.0740, -0.2680,  0.2086,  0.9176,  0.4565
-0.2144, -0.2141,  0.5813,  0.2902, -0.2122,  0.4119
-0.7278, -0.0987, -0.3312, -0.5641,  0.8515,  0.4438
 0.3793,  0.1976,  0.4933,  0.0839,  0.4011,  0.1905
-0.8568,  0.9573, -0.5272,  0.3212, -0.8207,  0.7415
-0.5785,  0.0056, -0.7901, -0.2223,  0.0760,  0.5551
 0.0735, -0.2188,  0.3925,  0.3570,  0.3746,  0.2191
 0.1230, -0.2838,  0.2262,  0.8715,  0.1938,  0.2878
 0.4792, -0.9248,  0.5295,  0.0366, -0.9894,  0.3149
-0.4456,  0.0697,  0.5359, -0.8938,  0.0981,  0.3879
 0.8629, -0.8505, -0.4464,  0.8385,  0.5300,  0.1769
 0.1995,  0.6659,  0.7921,  0.9454,  0.9970,  0.2330
-0.0249, -0.3066, -0.2927, -0.4923,  0.8220,  0.2437
 0.4513, -0.9481, -0.0770, -0.4374, -0.9421,  0.2879
-0.3405,  0.5931, -0.3507, -0.3842,  0.8562,  0.3987
 0.9538,  0.0471,  0.9039,  0.7760,  0.0361,  0.1706
-0.0887,  0.2104,  0.9808,  0.5478, -0.3314,  0.4128
-0.8220, -0.6302,  0.0537, -0.1658,  0.6013,  0.4306
-0.4123, -0.2880,  0.9074, -0.0461, -0.4435,  0.5144
 0.0060,  0.2867, -0.7775,  0.5161,  0.7039,  0.3599
-0.7968, -0.5484,  0.9426, -0.4308,  0.8148,  0.2979
 0.7811,  0.8450, -0.6877,  0.7594,  0.2640,  0.2362
-0.6802, -0.1113, -0.8325, -0.6694, -0.6056,  0.6544
 0.3821,  0.1476,  0.7466, -0.5107,  0.2592,  0.1648
 0.7265,  0.9683, -0.9803, -0.4943, -0.5523,  0.2454
-0.9049, -0.9797, -0.0196, -0.9090, -0.4433,  0.6447
-0.4607,  0.1811, -0.2389,  0.4050, -0.0078,  0.5229
 0.2664, -0.2932, -0.4259, -0.7336,  0.8742,  0.1834
-0.4507,  0.1029, -0.6294, -0.1158, -0.6294,  0.6081
 0.8948, -0.0124,  0.9278,  0.2899, -0.0314,  0.1534
-0.1323, -0.8813, -0.0146, -0.0697,  0.6135,  0.2386