“What Is AI Scheming (‘When AI Turns Evil’) and Why Should You Care” on the Pure AI Web Site

I contributed technical content and opinions to an article titled “What Is AI Scheming (‘When AI Turns Evil’) and Why Should You Care” in the February 2026 edition of the Pure AI web site. See https://pureai.com/articles/2026/02/02/what-is-ai-scheming-when-ai-turns-evil-and-why-should-you-care.aspx.

Informally, AI scheming is when AI turns evil. Expressed a bit more formally, AI scheming refers to situations where an AI system uses strategies to achieve its objectives in ways that are misaligned with human intentions or even explicitly stated rules.

AI scheming has been depicted in science fiction movies for decades. In “Colossus: The Forbin Project” (1970), an AI defense system expands on its directives and assumes control of the world to end warfare.

In “2001: A Space Odyssey” (1968), a space crew decides to disconnect the HAL 9000 AI. HAL does not like this and takes extreme measures against the crew.

In “Demon Seed” (1977), a scientist creates an advanced AI named Proteus. Proteus forces a human woman to bear its hybrid human-machine child to perpetuate itself.

Detecting and preventing AI scheming is difficult because much of an AI system’s reasoning is internal and opaque, even to its developers. A scheming system may behave normally during tests and only act strategically in specific situations, making problems hard to observe.

I contributed some opinions:

The Pure AI editors asked Dr. James McCaffrey, an AI expert, to comment. McCaffrey noted, “Recent research has shown that AI scheming is not just a theoretical possibility. Mild forms of AI scheming have been detected in deployed systems.”

McCaffrey added, “Many risks associated with AI, such as introducing bias against marginalized groups, are significantly overstated in my opinion. But AI scheming is a serious risk and quite worrisome.”




Matrix Inverse Using Cholesky Decomposition With C#

Bottom line: I dusted off my old C# implementation of a function to compute the inverse of a matrix using Cholesky decomposition and made a few tweaks. I was happy with the result.

The Wikipedia article on matrix inverses lists 14 different algorithms; each algorithm has multiple variations, and each variation has dozens of implementation alternatives.

Seven examples include 1.) LUP decomposition algorithm (Crout version and others), 2.) QR decomposition algorithm (Householder version and others), 3.) SVD decomposition algorithm (Jacobi version and others), 4.) Cholesky decomposition (Banachiewicz version and others), 5.) Cayley-Hamilton algorithm (Faddeev–LeVerrier version and others), 6.) Newton Iteration, 7.) Gauss-Jordan elimination. Whew! The fact that there are so many different techniques is a direct indication of how tricky it is to compute a matrix inverse.

Computing an inverse using Cholesky decomposition is a special case in the sense that it only works for matrices that are square, symmetric, and positive-definite (PD). Explaining PD matrices is complicated, but in machine learning they come from two common scenarios.

First, if you multiply a matrix times its transpose, and then add a small constant to the diagonal elements, you get a PD matrix. This scenario occurs when training a linear model (linear regression, quadratic regression).
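
Expressed as an equation (the names M and epsilon are just illustrative, not from the demo), the first scenario is:

M = (At * A) + (epsilon * I)

where At is the transpose of A, I is the identity matrix, and epsilon is a small constant such as the 0.0001 used in the demo below.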

Second, if you construct a kernel matrix (another complicated topic), and add a small constant to the diagonal elements, you get a PD matrix. This scenario occurs when training a kernel ridge regression model or a Gaussian process regression model.

The output of my demo is:

Begin matrix inverse using Cholesky decomposition

Generating matrix A:
   4.0   7.0   1.0
   6.0   0.0   3.0
   8.0   1.0   9.0
   2.0   5.0  -6.0

Conditioned matrix At * A (is positive-definite):
   120.0001    46.0000    82.0000
    46.0000    75.0001   -14.0000
    82.0000   -14.0000   127.0001

Inverse of AtA:
     0.0387    -0.0290    -0.0282
    -0.0290     0.0354     0.0226
    -0.0282     0.0226     0.0286

The check matrix (AtA * inv) should be I:
     1.0000    -0.0000     0.0000
     0.0000     1.0000     0.0000
     0.0000     0.0000     1.0000

End demo

I started with a 4×3 matrix A that is not PD (it isn’t even square), so Cholesky inverse doesn’t apply directly. I multiplied the transpose of A by A and added a small constant of 0.0001 to the diagonal elements to get a 3×3 PD matrix AtA.

The demo computes the inverse of AtA using Cholesky decomposition (Banachiewicz version). Then I verified the result by multiplying the inverse and the source to get the identity matrix.

Good fun on a Friday evening.



In the early days of computer science, a matrix inverse would often be computed in place, directly into the source matrix, to save memory. This destroyed the source matrix. By the 1990s, memory wasn’t so much of an issue, and algorithms could just use a copy of the source matrix.

When I was growing up in the 1950s, I never knew what my dad would bring home after his shift at the hospital (he was a doctor). One of our childhood treasures was a hectograph set to make copies — this was long before Xerox machines existed.

The system had a tray filled with a gel. You would use special pencils that were made of dye (blue, red, green, purple) to write or draw onto a piece of paper. Then you press the design down onto the gel. The dye would transfer onto the surface of the gel. Then you place a blank piece of paper onto the gel, press down, and the design would be printed onto the paper.

From a source impression, you could usually get about 12 copies before the dye would be worn off. The blue color worked the best, but red, green and purple added a nice touch of color.

My brother Roger, sister Beth, and I would make a one-page newsletter about the neighborhood (we lived at 1525 Camino Loma street in Fullerton, California) and then we’d go around and sell them to the neighbors for five cents a copy. Good fun.


Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols.

using System;
using System.IO;
using System.Collections.Generic;

namespace MatrixInverseCholesky
{
  internal class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin matrix inverse" +
        " using Cholesky decomposition ");

      double[][] A = new double[4][];
      A[0] = new double[] { 4.0, 7.0, 1.0 };
      A[1] = new double[] { 6.0, 0.0, 3.0 };
      A[2] = new double[] { 8.0, 1.0, 9.0 };
      A[3] = new double[] { 2.0, 5.0, -6.0 };

      Console.WriteLine("\nGenerating matrix A: ");
      MatShow(A, 1, 6);

      double[][] AtA = MatProd(MatTranspose(A), A);
      for (int i = 0; i "lt" AtA.Length; ++i)
        AtA[i][i] += 1.0e-4;
      Console.WriteLine("\nConditioned matrix At * A" +
        " (is positive-definite): ");
      MatShow(AtA, 4, 11);

      double[][] AtAinv = MatInvCholesky(AtA);
      Console.WriteLine("\nInverse of AtA: ");
      MatShow(AtAinv, 4, 11);

      double[][] C = MatProd(AtAinv, AtA);
      Console.WriteLine("\nThe check matrix " +
        "(AtA * inv) should be I: ");
      MatShow(C, 4, 11);

      Console.WriteLine("\nEnd demo ");
      Console.ReadLine();

    } // Main()

    // ------------------------------------------------------

    static double[][] MatInvCholesky(double[][] A)
    {
      // A must be square, symmetric, positive definite
      // 1. decompose to L
      int m = A.Length; int n = A[0].Length;  // m == n
      double[][] L = new double[n][]; 
      for (int i = 0; i "lt" n; ++i)
        L[i] = new double[n];

      for (int i = 0; i "lt" n; ++i)
      {
        for (int j = 0; j "lte" i; ++j)
        {
          double sum = 0.0;
          for (int k = 0; k "lt" j; ++k)
            sum += L[i][k] * L[j][k];
          if (i == j)
          {
            double tmp = A[i][i] - sum;
            if (tmp "lt" 0.0)
              throw new
                Exception("decomp Cholesky fatal");
            L[i][j] = Math.Sqrt(tmp);
          }
          else
          {
            if (L[j][j] == 0.0)
              throw new
                Exception("decomp Cholesky fatal ");
            L[i][j] = (A[i][j] - sum) / L[j][j];
          }
        } // j
      } // i

      // 2. compute inverse from L
      double[][] result = new double[n][];  // Identity
      for (int i = 0; i "lt" n; ++i)
        result[i] = new double[n];
      for (int i = 0; i "lt" n; ++i)
        result[i][i] = 1.0;

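      // forward substitution: solve L * Y = I, so that Y = inv(L)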
      for (int k = 0; k "lt" n; ++k)
      {
        for (int j = 0; j "lt" n; j++)
        {
          for (int i = 0; i "lt" k; i++)
          {
            result[k][j] -= result[i][j] * L[k][i];
          }
          result[k][j] /= L[k][k];
        }
      }

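      // back substitution: solve Lt * X = Y, so that X = inv(Lt) * inv(L) = inv(A)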
      for (int k = n - 1; k "gte" 0; --k)
      {
        for (int j = 0; j "lt" n; j++)
        {
          for (int i = k + 1; i "lt" n; i++)
          {
            result[k][j] -= result[i][j] * L[i][k];
          }
          result[k][j] /= L[k][k];
        }
      }
      return result;
    } // MatInvCholesky()

    // ------------------------------------------------------

    // helpers for Main to set up problem, show result

    // ------------------------------------------------------

    static double[][] MatMake(int nRows, int nCols)
    {
      double[][] result = new double[nRows][];
      for (int i = 0; i "lt" nRows; ++i)
        result[i] = new double[nCols];
      return result;
    }

    // ------------------------------------------------------

    static double[][] MatTranspose(double[][] m)
    {
      int nr = m.Length;
      int nc = m[0].Length;
      double[][] result = MatMake(nc, nr);  // note
      for (int i = 0; i "lt" nr; ++i)
        for (int j = 0; j "lt" nc; ++j)
          result[j][i] = m[i][j];
      return result;
    }

    // ------------------------------------------------------

    static double[][] MatProd(double[][] A,
        double[][] B)
    {
      int aRows = A.Length;
      int aCols = A[0].Length;
      int bRows = B.Length;
      int bCols = B[0].Length;
      if (aCols != bRows)
        throw new Exception("Non-conformable matrices");

      double[][] result = MatMake(aRows, bCols);
      for (int i = 0; i "lt" aRows; ++i) // each row of A
        for (int j = 0; j "lt" bCols; ++j) // each col of B
          for (int k = 0; k "lt" aCols; ++k)
            result[i][j] += A[i][k] * B[k][j];

      return result;
    }

    // ------------------------------------------------------

    static void MatShow(double[][] M, int dec, int wid)
    {
      for (int i = 0; i "lt" M.Length; ++i)
      {
        for (int j = 0; j "lt" M[0].Length; ++j)
        {
          double v = M[i][j];
          Console.Write(v.ToString("F" + dec).
            PadLeft(wid));
        }
        Console.WriteLine("");
      }
    }

    // ------------------------------------------------------

    // function to load from file -- not used this demo

    static double[][] MatLoad(string fn, int[] usecols,
      char sep, string comment)
    {
      List"lt"double[]"gt" result = 
        new List"lt"double[]"gt"();
      string line = "";
      FileStream ifs = new FileStream(fn, FileMode.Open);
      StreamReader sr = new StreamReader(ifs);
      while ((line = sr.ReadLine()) != null)
      {
        if (line.StartsWith(comment) == true)
          continue;
        string[] tokens = line.Split(sep);
        List"lt"double"gt" lst = new List"lt"double"gt"();
        for (int j = 0; j "lt" usecols.Length; ++j)
          lst.Add(double.Parse(tokens[usecols[j]]));
        double[] row = lst.ToArray();
        result.Add(row);
      }
      sr.Close(); ifs.Close();
      return result.ToArray();
    }

    // ------------------------------------------------------

  } // Program

} // ns

Random Forest Regression Using C#

The goal of a regression problem is to predict a single numeric value. For example, you might want to predict the price of a house based on its square footage, number of bedrooms, property tax rate, and so on.

A simple decision tree regressor encodes a virtual set of if-then rules to make a prediction. For example, “if house-age is greater than 6.5 and house-age is less than or equal to 11.0 and square-footage is greater than 1200.0 then price is $683,249.67.” Simple decision trees usually overfit their training data, and so the tree predicts poorly on new, previously unseen data.

A random forest is a collection of simple decision tree regressors that have been trained on different random subsets of the source training data. This process usually does a lot to limit the overfitting problem.

To make a prediction for an input vector x, each tree in the forest makes a prediction and the final predicted y value is the average of the predictions. A bagging (“bootstrap aggregation”) regression system is a specific type of random forest system where all columns/predictors of the source training data are always used to construct the training data subsets.

I put together a demo, implemented from scratch, using the C# language. For data, I used one of my standard synthetic datasets. The data looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
. . .

The first five values on each line are the x predictors. The last value on each line is the target y variable to predict. There are 200 training items and 40 test items.

The output of my demo is:

Begin C# Random Forest regression demo

Loading synthetic train (200) and test (40) data
Done

First three train X:
 -0.1660  0.4406 -0.9998 -0.3953 -0.7065
  0.0776 -0.1616  0.3704 -0.5911  0.7562
 -0.9452  0.3409 -0.1654  0.1174 -0.7192

First three train y:
  0.4840
  0.1568
  0.8054

Setting nTrees = 150
Setting maxDepth = 8
Setting minSamples = 2
Setting minLeaf = 1
Setting nCols = 4
Setting nRows = 190

Creating and training random forest regression model
Done

Accuracy train (within 0.10) = 0.9150
Accuracy test (within 0.10) = 0.6750

MSE train = 0.0003
MSE test = 0.0012

Predicting for x =
  -0.1660   0.4406  -0.9998  -0.3953  -0.7065
y = 0.4863

End demo

From a practical point of view, the main challenge when using a random forest regression model is determining good values for the six model parameters: nTrees, maxDepth, minSamples, minLeaf, nCols, nRows. This must be done manually by trial and error, or programmatically (“grid search”).
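
As a rough sketch of a programmatic search, the snippet below tries a few candidate values for nTrees and maxDepth while holding the other four parameters fixed, and keeps the combination with the lowest test MSE. It assumes the RandomForestRegressor class and the trainX, trainY, testX, testY variables from the demo program below; the candidate values are arbitrary, and scoring on the test data is just for illustration (in practice you would score on a separate validation set or use cross-validation).

int[] candTrees = new int[] { 50, 100, 150 };
int[] candDepths = new int[] { 4, 6, 8 };
double bestMSE = double.MaxValue;
int bestTrees = 0; int bestDepth = 0;
foreach (int nt in candTrees)
{
  foreach (int md in candDepths)
  {
    RandomForestRegressor model =
      new RandomForestRegressor(nt, md, minSamples: 2,
      minLeaf: 1, nCols: 4, nRows: 190, seed: 1);
    model.Train(trainX, trainY);
    double mse = model.MSE(testX, testY);  // smaller is better
    if (mse < bestMSE)
    {
      bestMSE = mse; bestTrees = nt; bestDepth = md;
    }
  }
}
Console.WriteLine("best nTrees = " + bestTrees +
  ", best maxDepth = " + bestDepth);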

The demo uses 190 of the 200 rows for the random subsets, and 4 of the 5 columns. If you always use the full number of columns, the technique is called bagging tree regression (bootstrap aggregation). In other words, bagging tree regression is just a special case of random forest regression.

An interesting and fun exploration.



Each node in a decision tree has a left-child node and a right-child node. The child nodes are neither male nor female.

I’m a big fan of science fiction movies from the 1950s. Three of my all-time favorites featured a daughter of the main scientist character. These three movies are widely considered to be among the best sci-fi movies in history.

In “Godzilla” (Japan 1954, US 1956), Emiko Yamane (actress Momoko Kochi) is the daughter of Dr. Kyohei Yamane (actor Takashi Shimura). They save Tokyo from the giant reptile.

In “Forbidden Planet” (1956), Altaira Morbius (actress Anne Francis) is the daughter of Dr. Edward Morbius (actor Walter Pidgeon). They are the sole survivors of an expedition to Altair IV.

In “Them!” (1954), Patricia Medford (actress Joan Weldon) is the daughter of Dr. Harold Medford (actor Edmund Gwenn). They find a way to defeat giant, radioactive mutant ants.


Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols. My blog editor often chokes on those symbols.

using System;
using System.IO;
using System.Collections.Generic;

namespace RandomForestRegression
{
  internal class RandomForestRegressionProgram
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin C# Random Forest" +
        " regression demo ");

      // 1. load data
      Console.WriteLine("\nLoading synthetic train (200)" +
        " and test (40) data");
      string trainFile =
        "..\\..\\..\\Data\\synthetic_train_200.txt";
      int[] colsX = new int[] { 0, 1, 2, 3, 4 };
      double[][] trainX =
        MatLoad(trainFile, colsX, ',', "#");
      double[] trainY =
        MatToVec(MatLoad(trainFile,
        new int[] { 5 }, ',', "#"));

      string testFile =
        "..\\..\\..\\Data\\synthetic_test_40.txt";
      double[][] testX =
        MatLoad(testFile, colsX, ',', "#");
      double[] testY =
        MatToVec(MatLoad(testFile,
        new int[] { 5 }, ',', "#"));
      Console.WriteLine("Done ");

      Console.WriteLine("\nFirst three train X: ");
      for (int i = 0; i "lt" 3; ++i)
        VecShow(trainX[i], 4, 8);

      Console.WriteLine("\nFirst three train y: ");
      for (int i = 0; i "lt" 3; ++i)
        Console.WriteLine(trainY[i].ToString("F4").
          PadLeft(8));

      // 2. create and train model
      int nTrees = 150;  // aka n_estimators
      int maxDepth = 8;
      int minSamples = 2;
      int minLeaf = 1;
      int nCols = 4;  // aka max_features
      int nRows = 190;  // aka max_samples

      Console.WriteLine("\nSetting nTrees = " + nTrees);
      Console.WriteLine("Setting maxDepth = " + maxDepth);
      Console.WriteLine("Setting minSamples = " + minSamples);
      Console.WriteLine("Setting minLeaf = " + minLeaf);
      Console.WriteLine("Setting nCols = " + nCols);
      Console.WriteLine("Setting nRows = " + nRows);

      Console.WriteLine("\nCreating and training" +
        " random forest regression model ");
      RandomForestRegressor rfr =
        new RandomForestRegressor(nTrees, maxDepth,
        minSamples, minLeaf, nCols, nRows, seed: 1);
      rfr.Train(trainX, trainY);
      Console.WriteLine("Done ");

      // 3. evaluate model
      double accTrain = rfr.Accuracy(trainX, trainY, 0.10);
      Console.WriteLine("\nAccuracy train (within 0.10) = " +
        accTrain.ToString("F4"));
      double accTest = rfr.Accuracy(testX, testY, 0.10);
      Console.WriteLine("Accuracy test (within 0.10) = " +
        accTest.ToString("F4"));

      double mseTrain = rfr.MSE(trainX, trainY);
      Console.WriteLine("\nMSE train = " +
        mseTrain.ToString("F4"));
      double mseTest = rfr.MSE(testX, testY);
      Console.WriteLine("MSE test = " +
        mseTest.ToString("F4"));

      // 4. use model
      double[] x = trainX[0];
      Console.WriteLine("\nPredicting for x = ");
      VecShow(x, 4, 9);
      double predY = rfr.Predict(x);
      Console.WriteLine("y = " + predY.ToString("F4"));

      Console.WriteLine("\nEnd demo ");
      Console.ReadLine();
    } // Main()

    // ------------------------------------------------------
    // helpers for Main()
    // ------------------------------------------------------

    static double[][] MatLoad(string fn, int[] usecols,
      char sep, string comment)
    {
      List"lt"double[]"gt" result =
        new List"lt"double[]"gt"();
      string line = "";
      FileStream ifs = new FileStream(fn, FileMode.Open);
      StreamReader sr = new StreamReader(ifs);
      while ((line = sr.ReadLine()) != null)
      {
        if (line.StartsWith(comment) == true)
          continue;
        string[] tokens = line.Split(sep);
        List"lt"double"gt" lst = new List"lt"double"gt"();
        for (int j = 0; j "lt" usecols.Length; ++j)
          lst.Add(double.Parse(tokens[usecols[j]]));
        double[] row = lst.ToArray();
        result.Add(row);
      }
      sr.Close(); ifs.Close();
      return result.ToArray();
    }

    static double[] MatToVec(double[][] M)
    {
      int nRows = M.Length;
      int nCols = M[0].Length;
      double[] result = new double[nRows * nCols];
      int k = 0;
      for (int i = 0; i "lt" nRows; ++i)
        for (int j = 0; j "lt" nCols; ++j)
          result[k++] = M[i][j];
      return result;
    }

    static void VecShow(double[] vec, int dec, int wid)
    {
      for (int i = 0; i "lt" vec.Length; ++i)
        Console.Write(vec[i].ToString("F" + dec).PadLeft(wid));
      Console.WriteLine("");
    }

  } // class RandomForestRegressionProgram

  // ========================================================

  public class RandomForestRegressor
  {
    public int nTrees;
    public int maxDepth;
    public int minSamples;
    public int minLeaf;
    public int nSplitCols;  // number cols to use each split
    public int nRows;
    public List"lt"DecisionTreeRegressor"gt" trees;
    public Random rnd;

    public RandomForestRegressor(int nTrees, int maxDepth,
      int minSamples, int minLeaf, int nCols, int nRows,
      int seed = 0)
    {
      this.nTrees = nTrees;
      this.maxDepth = maxDepth;
      this.minSamples = minSamples;
      this.minLeaf = minLeaf;
      this.nRows = nRows;
      this.nSplitCols = nCols;  // pass to DecisionTree
      this.trees = new List"lt"DecisionTreeRegressor"gt"();
      this.rnd = new Random(seed);
    }

    // ------------------------------------------------------

    public void Train(double[][] trainX, double[] trainY)
    {
      int nr = trainX.Length;
      // int nc = trainX[0].Length;

      for (int i = 0; i "lt" this.nTrees; ++i) // each tree
      {
        // construct data rows-subset 
        int[] rndRows =
          GetRandomRowIdxs(nr, this.nRows, this.rnd);
        double[][] subsetX =
          MakeRowsSubsetX(trainX, rndRows);
        double[] subsetY =
          MakeSubsetY(trainY, rndRows);

        // make and train curr tree
        DecisionTreeRegressor dtr =
          new DecisionTreeRegressor(this.maxDepth,
          this.minSamples, this.minLeaf, this.nSplitCols);
        // random cols will be used during splits
        dtr.Train(subsetX, subsetY);
        this.trees.Add(dtr);
      } // i
    }

    // ------------------------------------------------------

    public double Predict(double[] x)
    {
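      // forest prediction is the average of the individual tree predictions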
      double sum = 0.0;
      for (int t = 0; t "lt" this.nTrees; ++t)
        sum += this.trees[t].Predict(x);
      return sum / this.nTrees;
    }

    // ------------------------------------------------------

    public double Accuracy(double[][] dataX, double[] dataY,
      double pctClose)
    {
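      // correct if abs(predicted - actual) is less than pctClose * abs(actual)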
      int numCorrect = 0; int numWrong = 0;
      for (int i = 0; i "lt" dataX.Length; ++i)
      {
        double actualY = dataY[i];
        double predY = this.Predict(dataX[i]);
        if (Math.Abs(predY - actualY) "lt"
          Math.Abs(pctClose * actualY))
          ++numCorrect;
        else
          ++numWrong;
      }
      return (numCorrect * 1.0) / (numWrong + numCorrect);
    }

    // ------------------------------------------------------

    public double MSE(double[][] dataX, double[] dataY)
    {
      int n = dataX.Length;
      double sum = 0.0;
      for (int i = 0; i "lt" n; ++i)
      {
        double actualY = dataY[i];
        double predY = this.Predict(dataX[i]);
        sum += (actualY - predY) * (actualY - predY);
      }
      return sum / n;
    }

    // ------------------------------------------------------

    private static int[] GetRandomRowIdxs(int N, int n,
      Random rnd)
    {
      // pick n rows from N with replacement
      int[] result = new int[n];
      for (int i = 0; i "lt" n; ++i)
        result[i] = rnd.Next(0, N);
      Array.Sort(result);
      return result;
    }

    private static 
    double[][] MakeRowsSubsetX(double[][] trainX,
      int[] rndRows)
    {
      int nCols = trainX[0].Length;  // use all cols
      double[][] result = new double[rndRows.Length][];
      for (int i = 0; i "lt" rndRows.Length; ++i)
        result[i] = new double[nCols];

      for (int i = 0; i "lt" rndRows.Length; ++i)
      {
        for (int j = 0; j "lt" nCols; ++j)
        {
          int r = rndRows[i]; // random row in trainX
          result[i][j] = trainX[r][j];
        }
      }
      return result;
    }

    private static double[] MakeSubsetY(double[] trainY,
      int[] rndRows)
    {
      double[] result = new double[rndRows.Length];
      for (int i = 0; i "lt" rndRows.Length; ++i)
        result[i] = trainY[rndRows[i]];
      return result;
    }

  } // class RandomForestRegressor

  // ========================================================

  public class DecisionTreeRegressor
  {
    public int maxDepth;
    public int minSamples;  // aka min_samples_split
    public int minLeaf;  // min number of values in a leaf
    public int numSplitCols; // mostly for random forest
    public List"lt"Node"gt" tree = new List"lt"Node"gt"();
    public Random rnd;  // order in which cols are searched

    public double[][] trainX;  // store data by ref
    public double[] trainY;

    // ------------------------------------------------------

    public class Node
    {
      public int id;
      public int colIdx;  // aka featureIdx
      public double thresh;
      public int left;  // index into List
      public int right;
      public double value;
      public bool isLeaf;
      public List"lt"int"gt" rows;  // assoc rows 

      public Node(int id, int colIdx, double thresh,
        int left, int right, double value, bool isLeaf,
        List"lt"int"gt" rows)
      {
        this.id = id;
        this.colIdx = colIdx;
        this.thresh = thresh;
        this.left = left;
        this.right = right;
        this.value = value;
        this.isLeaf = isLeaf;

        this.rows = rows;
      }
    } // class Node

    // --------------------------------------------

    public class StackInfo  // used to build tree
    {
      // tree node + associated rows, not explicit data
      public Node node; // node holds associated rows
      public int depth;

      public StackInfo(Node n, int d)
      {
        this.node = n;
        this.depth = d;
      }
    }

    // --------------------------------------------

    public DecisionTreeRegressor(int maxDepth = 2,
      int minSamples = 2, int minLeaf = 1,
      int numSplitCols = -1, int seed = 0)
    {
      // if maxDepth = 0, tree has just a root node
      // if maxDepth = 1, at most 3 nodes (root, l, r)
      // if maxDepth = n, at most 2^(n+1) - 1 nodes
      this.maxDepth = maxDepth;
      this.minSamples = minSamples;
      this.minLeaf = minLeaf;
      this.numSplitCols = numSplitCols;  // for ran. forest

      // create full tree List with dummy nodes
      int numNodes = (int)Math.Pow(2, (maxDepth + 1)) - 1;
      for (int i = 0; i "lt" numNodes; ++i)
      {
        //this.tree.Add(null); // OK, but let's avoid ptrs
        Node n = new Node(i, -1, 0.0, -1, -1, 0.0,
          false, new List"lt"int"gt"());
        this.tree.Add(n); // dummy node
      }
      this.rnd = new Random(seed);
    }

    public void Train(double[][] trainX, double[] trainY)
    {
      this.trainX = trainX;
      this.trainY = trainY;
      this.MakeTree();
    }

    // ------------------------------------------------------

    public double Predict(double[] x)
    {
      int p = 0;
      Node currNode = this.tree[p];
      while (currNode.isLeaf == false && 
        p "lt" this.tree.Count)
      {
        if (x[currNode.colIdx] "lte" currNode.thresh)
          p = currNode.left;
        else
          p = currNode.right;
        currNode = this.tree[p];
      }
      return this.tree[p].value;

    }

    // helpers: MakeTree(), BestSplit(),
    // TreeTargetMean(), TreeTargetVariance()

    private void MakeTree()
    {
      // no recursion, no pointers, List storage
      if (this.numSplitCols == -1) // use all cols
        this.numSplitCols = this.trainX[0].Length;

      List"lt"int"gt" allRows = new List"lt"int"gt"();
      for (int i = 0; i "lt" this.trainX.Length; ++i)
        allRows.Add(i);
      double grandMean = this.TreeTargetMean(allRows);

      Node root = new Node(0, -1, 0.0, 1, 2,
        grandMean, false, allRows);

      Stack"lt"StackInfo"gt" stack = 
        new Stack"lt"StackInfo"gt"();
      stack.Push(new StackInfo(root, 0));
      while (stack.Count "gt" 0)
      {
        StackInfo info = stack.Pop();
        Node currNode = info.node;
        this.tree[currNode.id] = currNode;

        List"lt"int"gt" currRows = info.node.rows;
        int currDepth = info.depth;

        if (currDepth == this.maxDepth ||
          currRows.Count "lt" this.minSamples)
        {
          currNode.value = this.TreeTargetMean(currRows);
          currNode.isLeaf = true;
          continue;
        }

        double[] splitInfo = this.BestSplit(currRows);
        int colIdx = (int)splitInfo[0];
        double thresh = splitInfo[1];

        if (colIdx == -1)  // unable to split
        {
          currNode.value = this.TreeTargetMean(currRows);
          currNode.isLeaf = true;
          continue;
        }

        // got good split so at internal, non-leaf node
        currNode.colIdx = colIdx;
        currNode.thresh = thresh;

        // make the data splits for children
        // find left rows and right rows
        List"lt"int"gt" leftIdxs = new List"lt"int"gt"();
        List"lt"int"gt" rightIdxs = new List"lt"int"gt"();

        for (int k = 0; k "lt" currRows.Count; ++k)
        {
          int r = currRows[k];
          if (this.trainX[r][colIdx] "lte" thresh)
            leftIdxs.Add(r);
          else
            rightIdxs.Add(r);
        }

        int leftID = currNode.id * 2 + 1;
        Node currNodeLeft = new Node(leftID, -1, 0.0,
          2 * leftID + 1, 2 * leftID + 2,
          this.TreeTargetMean(leftIdxs), false, leftIdxs);
        stack.Push(new StackInfo(currNodeLeft,
          currDepth + 1));

        int rightID = currNode.id * 2 + 2;
        Node currNodeRight = new Node(rightID, -1, 0.0,
          2 * rightID + 1, 2 * rightID + 2,
          this.TreeTargetMean(rightIdxs), false, rightIdxs);
        stack.Push(new StackInfo(currNodeRight,
          currDepth + 1));

      } // while

      return;

    } // MakeTree()

    // ------------------------------------------------------

    private double[] BestSplit(List"lt"int"gt" rows)
    {
      // implicit params: this.numSplitCols, this.minLeaf
      // result[0] = best col idx (as double)
      // result[1] = best split value
      rows.Sort();

      int bestColIdx = -1;  // indicates bad split
      double bestThresh = 0.0;
      double bestVar = double.MaxValue;  // smaller better

      int nRows = rows.Count;  // or dataY.Length
      int nCols = this.trainX[0].Length;

      if (nRows == 0)
      {
        throw new Exception("empty data in BestSplit()");
      }

      // process cols in scrambled order
      int[] colIndices = new int[nCols];
      for (int k = 0; k "lt" nCols; ++k)
        colIndices[k] = k;
      // shuffle, inline Fisher-Yates
      int n = colIndices.Length;
      for (int i = 0; i "lt" n; ++i)
      {
        int ri = rnd.Next(i, n);  // be careful
        int tmp = colIndices[i];
        colIndices[i] = colIndices[ri];
        colIndices[ri] = tmp;
      }

      // numSplitCols is usually all columns (-1)
      for (int j = 0; j "lt" this.numSplitCols; ++j)
      {
        int colIdx = colIndices[j];
        HashSet"lt"double"gt" examineds = 
          new HashSet"lt"double"gt"();

        for (int i = 0; i "lt" nRows; ++i) // each row
        {
          // if curr thresh been seen, skip it
          double thresh = this.trainX[rows[i]][colIdx];
          if (examineds.Contains(thresh)) continue;
          examineds.Add(thresh);

          // get row idxs where x is lte, gt thresh
          List"lt"int"gt" leftIdxs = 
            new List"lt"int"gt"();
          List"lt"int"gt" rightIdxs = 
            new List"lt"int"gt"();
          for (int k = 0; k "lt" nRows; ++k)
          {
            if (this.trainX[rows[k]][colIdx] "lte" thresh)
              leftIdxs.Add(rows[k]);
            else
              rightIdxs.Add(rows[k]);
          }

          // Check if proposed split has too few values
          if (leftIdxs.Count "lt" this.minLeaf ||
            rightIdxs.Count "lt" this.minLeaf)
            continue;  // to next row

          double leftVar = 
            this.TreeTargetVariance(leftIdxs);
          double rightVar = 
            this.TreeTargetVariance(rightIdxs);
          double weightedVar = (leftIdxs.Count * leftVar +
            rightIdxs.Count * rightVar) / nRows;

          if (weightedVar "lt" bestVar)
          {
            bestColIdx = colIdx;
            bestThresh = thresh;
            bestVar = weightedVar;
          }

        } // each row
      } // j each col

      double[] result = new double[2];  // out params ugly
      result[0] = 1.0 * bestColIdx;
      result[1] = bestThresh;
      return result;

    } // BestSplit()

    // ------------------------------------------------------

    private double TreeTargetMean(List"lt"int"gt" rows)
    {
      // mean of rows items in trainY: for node prediction
      double sum = 0.0;
      for (int i = 0; i "lt" rows.Count; ++i)
      {
        int r = rows[i];
        sum += this.trainY[r];
      }
      return sum / rows.Count;
    }

    // ------------------------------------------------------

    private double TreeTargetVariance(List"lt"int"gt" rows)
    {
      double mean = this.TreeTargetMean(rows);
      double sum = 0.0;
      for (int i = 0; i "lt" rows.Count; ++i)
      {
        int r = rows[i];
        sum += (this.trainY[r] - mean) *
          (this.trainY[r] - mean);
      }
      return sum / rows.Count;
    }

    // ------------------------------------------------------

  } // class DecisionTreeRegressor

  // ========================================================

} // ns

Training data:

# synthetic_train_200.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399
-0.4858, -0.7782, -0.6141, -0.0008,  0.4572,  0.4197
 0.7033, -0.1683,  0.2334, -0.5327, -0.7961,  0.1776
 0.0317, -0.0457, -0.6947,  0.2436,  0.0880,  0.3345
 0.5031, -0.5559,  0.0387,  0.5706, -0.9553,  0.3107
-0.3513,  0.7458,  0.6894,  0.0769,  0.7332,  0.3170
 0.2205,  0.5992, -0.9309,  0.5405,  0.4635,  0.3532
-0.4806, -0.4859,  0.2646, -0.3094,  0.5932,  0.3202
 0.9809, -0.3995, -0.7140,  0.8026,  0.0831,  0.1600
 0.9495,  0.2732,  0.9878,  0.0921,  0.0529,  0.1289
-0.9476, -0.6792,  0.4913, -0.9392, -0.2669,  0.5966
 0.7247,  0.3854,  0.3819, -0.6227, -0.1162,  0.1550
-0.5922, -0.5045, -0.4757,  0.5003, -0.0860,  0.5863
-0.8861,  0.0170, -0.5761,  0.5972, -0.4053,  0.7301
 0.6877, -0.2380,  0.4997,  0.0223,  0.0819,  0.1404
 0.9189,  0.6079, -0.9354,  0.4188, -0.0700,  0.1907
-0.1428, -0.7820,  0.2676,  0.6059,  0.3936,  0.2790
 0.5324, -0.3151,  0.6917, -0.1425,  0.6480,  0.1071
-0.8432, -0.9633, -0.8666, -0.0828, -0.7733,  0.7784
-0.9444,  0.5097, -0.2103,  0.4939, -0.0952,  0.6787
-0.0520,  0.6063, -0.1952,  0.8094, -0.9259,  0.4836
 0.5477, -0.7487,  0.2370, -0.9793,  0.0773,  0.1241
 0.2450,  0.8116,  0.9799,  0.4222,  0.4636,  0.2355
 0.8186, -0.1983, -0.5003, -0.6531, -0.7611,  0.1511
-0.4714,  0.6382, -0.3788,  0.9648, -0.4667,  0.5950
 0.0673, -0.3711,  0.8215, -0.2669, -0.1328,  0.2677
-0.9381,  0.4338,  0.7820, -0.9454,  0.0441,  0.5518
-0.3480,  0.7190,  0.1170,  0.3805, -0.0943,  0.4724
-0.9813,  0.1535, -0.3771,  0.0345,  0.8328,  0.5438
-0.1471, -0.5052, -0.2574,  0.8637,  0.8737,  0.3042
-0.5454, -0.3712, -0.6505,  0.2142, -0.1728,  0.5783
 0.6327, -0.6297,  0.4038, -0.5193,  0.1484,  0.1153
-0.5424,  0.3282, -0.0055,  0.0380, -0.6506,  0.6613
 0.1414,  0.9935,  0.6337,  0.1887,  0.9520,  0.2540
-0.9351, -0.8128, -0.8693, -0.0965, -0.2491,  0.7353
 0.9507, -0.6640,  0.9456,  0.5349,  0.6485,  0.1059
-0.0462, -0.9737, -0.2940, -0.0159,  0.4602,  0.2606
-0.0627, -0.0852, -0.7247, -0.9782,  0.5166,  0.2977
 0.0478,  0.5098, -0.0723, -0.7504, -0.3750,  0.3335
 0.0090,  0.3477,  0.5403, -0.7393, -0.9542,  0.4415
-0.9748,  0.3449,  0.3736, -0.1015,  0.8296,  0.4358
 0.2887, -0.9895, -0.0311,  0.7186,  0.6608,  0.2057
 0.1570, -0.4518,  0.1211,  0.3435, -0.2951,  0.3244
 0.7117, -0.6099,  0.4946, -0.4208,  0.5476,  0.1096
-0.2929, -0.5726,  0.5346, -0.3827,  0.4665,  0.2465
 0.4889, -0.5572, -0.5718, -0.6021, -0.7150,  0.2163
-0.7782,  0.3491,  0.5996, -0.8389, -0.5366,  0.6516
-0.5847,  0.8347,  0.4226,  0.1078, -0.3910,  0.6134
 0.8469,  0.4121, -0.0439, -0.7476,  0.9521,  0.1571
-0.6803, -0.5948, -0.1376, -0.1916, -0.7065,  0.7156
 0.2878,  0.5086, -0.5785,  0.2019,  0.4979,  0.2980
 0.2764,  0.1943, -0.4090,  0.4632,  0.8906,  0.2960
-0.8877,  0.6705, -0.6155, -0.2098, -0.3998,  0.7107
-0.8398,  0.8093, -0.2597,  0.0614, -0.0118,  0.6502
-0.8476,  0.0158, -0.4769, -0.2859, -0.7839,  0.7715
 0.5751, -0.7868,  0.9714, -0.6457,  0.1448,  0.1175
 0.4802, -0.7001,  0.1022, -0.5668,  0.5184,  0.1090
 0.4458, -0.6469,  0.7239, -0.9604,  0.7205,  0.0779
 0.5175,  0.4339,  0.9747, -0.4438, -0.9924,  0.2879
 0.8678,  0.7158,  0.4577,  0.0334,  0.4139,  0.1678
 0.5406,  0.5012,  0.2264, -0.1963,  0.3946,  0.2088
-0.9938,  0.5498,  0.7928, -0.5214, -0.7585,  0.7687
 0.7661,  0.0863, -0.4266, -0.7233, -0.4197,  0.1466
 0.2277, -0.3517, -0.0853, -0.1118,  0.6563,  0.1767
 0.3499, -0.5570, -0.0655, -0.3705,  0.2537,  0.1632
 0.7547, -0.1046,  0.5689, -0.0861,  0.3125,  0.1257
 0.8186,  0.2110,  0.5335,  0.0094, -0.0039,  0.1391
 0.6858, -0.8644,  0.1465,  0.8855,  0.0357,  0.1845
-0.4967,  0.4015,  0.0805,  0.8977,  0.2487,  0.4663
 0.6760, -0.9841,  0.9787, -0.8446, -0.3557,  0.1509
-0.1203, -0.4885,  0.6054, -0.0443, -0.7313,  0.4854
 0.8557,  0.7919, -0.0169,  0.7134, -0.1628,  0.2002
 0.0115, -0.6209,  0.9300, -0.4116, -0.7931,  0.4052
-0.7114, -0.9718,  0.4319,  0.1290,  0.5892,  0.3661
 0.3915,  0.5557, -0.1870,  0.2955, -0.6404,  0.2954
-0.3564, -0.6548, -0.1827, -0.5172, -0.1862,  0.4622
 0.2392, -0.4959,  0.5857, -0.1341, -0.2850,  0.2470
-0.3394,  0.3947, -0.4627,  0.6166, -0.4094,  0.5325
 0.7107,  0.7768, -0.6312,  0.1707,  0.7964,  0.2757
-0.1078,  0.8437, -0.4420,  0.2177,  0.3649,  0.4028
-0.3139,  0.5595, -0.6505, -0.3161, -0.7108,  0.5546
 0.4335,  0.3986,  0.3770, -0.4932,  0.3847,  0.1810
-0.2562, -0.2894, -0.8847,  0.2633,  0.4146,  0.4036
 0.2272,  0.2966, -0.6601, -0.7011,  0.0284,  0.2778
-0.0743, -0.1421, -0.0054, -0.6770, -0.3151,  0.3597
-0.4762,  0.6891,  0.6007, -0.1467,  0.2140,  0.4266
-0.4061,  0.7193,  0.3432,  0.2669, -0.7505,  0.6147
-0.0588,  0.9731,  0.8966,  0.2902, -0.6966,  0.4955
-0.0627, -0.1439,  0.1985,  0.6999,  0.5022,  0.3077
 0.1587,  0.8494, -0.8705,  0.9827, -0.8940,  0.4263
-0.7850,  0.2473, -0.9040, -0.4308, -0.8779,  0.7199
 0.4070,  0.3369, -0.2428, -0.6236,  0.4940,  0.2215
-0.0242,  0.0513, -0.9430,  0.2885, -0.2987,  0.3947
-0.5416, -0.1322, -0.2351, -0.0604,  0.9590,  0.3683
 0.1055,  0.7783, -0.2901, -0.5090,  0.8220,  0.2984
-0.9129,  0.9015,  0.1128, -0.2473,  0.9901,  0.4776
-0.9378,  0.1424, -0.6391,  0.2619,  0.9618,  0.5368
 0.7498, -0.0963,  0.4169,  0.5549, -0.0103,  0.1614
-0.2612, -0.7156,  0.4538, -0.0460, -0.1022,  0.3717
 0.7720,  0.0552, -0.1818, -0.4622, -0.8560,  0.1685
-0.4177,  0.0070,  0.9319, -0.7812,  0.3461,  0.3052
-0.0001,  0.5542, -0.7128, -0.8336, -0.2016,  0.3803
 0.5356, -0.4194, -0.5662, -0.9666, -0.2027,  0.1776
-0.2378,  0.3187, -0.8582, -0.6948, -0.9668,  0.5474
-0.1947, -0.3579,  0.1158,  0.9869,  0.6690,  0.2992
 0.3992,  0.8365, -0.9205, -0.8593, -0.0520,  0.3154
-0.0209,  0.0793,  0.7905, -0.1067,  0.7541,  0.1864
-0.4928, -0.4524, -0.3433,  0.0951, -0.5597,  0.6261
-0.8118,  0.7404, -0.5263, -0.2280,  0.1431,  0.6349
 0.0516, -0.8480,  0.7483,  0.9023,  0.6250,  0.1959
-0.3212,  0.1093,  0.9488, -0.3766,  0.3376,  0.2735
-0.3481,  0.5490, -0.3484,  0.7797,  0.5034,  0.4379
-0.5785, -0.9170, -0.3563, -0.9258,  0.3877,  0.4121
 0.3407, -0.1391,  0.5356,  0.0720, -0.9203,  0.3458
-0.3287, -0.8954,  0.2102,  0.0241,  0.2349,  0.3247
-0.1353,  0.6954, -0.0919, -0.9692,  0.7461,  0.3338
 0.9036, -0.8982, -0.5299, -0.8733, -0.1567,  0.1187
 0.7277, -0.8368, -0.0538, -0.7489,  0.5458,  0.0830
 0.9049,  0.8878,  0.2279,  0.9470, -0.3103,  0.2194
 0.7957, -0.1308, -0.5284,  0.8817,  0.3684,  0.2172
 0.4647, -0.4931,  0.2010,  0.6292, -0.8918,  0.3371
-0.7390,  0.6849,  0.2367,  0.0626, -0.5034,  0.7039
-0.1567, -0.8711,  0.7940, -0.5932,  0.6525,  0.1710
 0.7635, -0.0265,  0.1969,  0.0545,  0.2496,  0.1445
 0.7675,  0.1354, -0.7698, -0.5460,  0.1920,  0.1728
-0.5211, -0.7372, -0.6763,  0.6897,  0.2044,  0.5217
 0.1913,  0.1980,  0.2314, -0.8816,  0.5006,  0.1998
 0.8964,  0.0694, -0.6149,  0.5059, -0.9854,  0.1825
 0.1767,  0.7104,  0.2093,  0.6452,  0.7590,  0.2832
-0.3580, -0.7541,  0.4426, -0.1193, -0.7465,  0.5657
-0.5996,  0.5766, -0.9758, -0.3933, -0.9572,  0.6800
 0.9950,  0.1641, -0.4132,  0.8579,  0.0142,  0.2003
-0.4717, -0.3894, -0.2567, -0.5111,  0.1691,  0.4266
 0.3917, -0.8561,  0.9422,  0.5061,  0.6123,  0.1212
-0.0366, -0.1087,  0.3449, -0.1025,  0.4086,  0.2475
 0.3633,  0.3943,  0.2372, -0.6980,  0.5216,  0.1925
-0.5325, -0.6466, -0.2178, -0.3589,  0.6310,  0.3568
 0.2271,  0.5200, -0.1447, -0.8011, -0.7699,  0.3128
 0.6415,  0.1993,  0.3777, -0.0178, -0.8237,  0.2181
-0.5298, -0.0768, -0.6028, -0.9490,  0.4588,  0.4356
 0.6870, -0.1431,  0.7294,  0.3141,  0.1621,  0.1632
-0.5985,  0.0591,  0.7889, -0.3900,  0.7419,  0.2945
 0.3661,  0.7984, -0.8486,  0.7572, -0.6183,  0.3449
 0.6995,  0.3342, -0.3113, -0.6972,  0.2707,  0.1712
 0.2565,  0.9126,  0.1798, -0.6043, -0.1413,  0.2893
-0.3265,  0.9839, -0.2395,  0.9854,  0.0376,  0.4770
 0.2690, -0.1722,  0.9818,  0.8599, -0.7015,  0.3954
-0.2102, -0.0768,  0.1219,  0.5607, -0.0256,  0.3949
 0.8216, -0.9555,  0.6422, -0.6231,  0.3715,  0.0801
-0.2896,  0.9484, -0.7545, -0.6249,  0.7789,  0.4370
-0.9985, -0.5448, -0.7092, -0.5931,  0.7926,  0.5402

Test data:

# synthetic_test_40.txt
#
 0.7462,  0.4006, -0.0590,  0.6543, -0.0083,  0.1935
 0.8495, -0.2260, -0.0142, -0.4911,  0.7699,  0.1078
-0.2335, -0.4049,  0.4352, -0.6183, -0.7636,  0.5088
 0.1810, -0.5142,  0.2465,  0.2767, -0.3449,  0.3136
-0.8650,  0.7611, -0.0801,  0.5277, -0.4922,  0.7140
-0.2358, -0.7466, -0.5115, -0.8413, -0.3943,  0.4533
 0.4834,  0.2300,  0.3448, -0.9832,  0.3568,  0.1360
-0.6502, -0.6300,  0.6885,  0.9652,  0.8275,  0.3046
-0.3053,  0.5604,  0.0929,  0.6329, -0.0325,  0.4756
-0.7995,  0.0740, -0.2680,  0.2086,  0.9176,  0.4565
-0.2144, -0.2141,  0.5813,  0.2902, -0.2122,  0.4119
-0.7278, -0.0987, -0.3312, -0.5641,  0.8515,  0.4438
 0.3793,  0.1976,  0.4933,  0.0839,  0.4011,  0.1905
-0.8568,  0.9573, -0.5272,  0.3212, -0.8207,  0.7415
-0.5785,  0.0056, -0.7901, -0.2223,  0.0760,  0.5551
 0.0735, -0.2188,  0.3925,  0.3570,  0.3746,  0.2191
 0.1230, -0.2838,  0.2262,  0.8715,  0.1938,  0.2878
 0.4792, -0.9248,  0.5295,  0.0366, -0.9894,  0.3149
-0.4456,  0.0697,  0.5359, -0.8938,  0.0981,  0.3879
 0.8629, -0.8505, -0.4464,  0.8385,  0.5300,  0.1769
 0.1995,  0.6659,  0.7921,  0.9454,  0.9970,  0.2330
-0.0249, -0.3066, -0.2927, -0.4923,  0.8220,  0.2437
 0.4513, -0.9481, -0.0770, -0.4374, -0.9421,  0.2879
-0.3405,  0.5931, -0.3507, -0.3842,  0.8562,  0.3987
 0.9538,  0.0471,  0.9039,  0.7760,  0.0361,  0.1706
-0.0887,  0.2104,  0.9808,  0.5478, -0.3314,  0.4128
-0.8220, -0.6302,  0.0537, -0.1658,  0.6013,  0.4306
-0.4123, -0.2880,  0.9074, -0.0461, -0.4435,  0.5144
 0.0060,  0.2867, -0.7775,  0.5161,  0.7039,  0.3599
-0.7968, -0.5484,  0.9426, -0.4308,  0.8148,  0.2979
 0.7811,  0.8450, -0.6877,  0.7594,  0.2640,  0.2362
-0.6802, -0.1113, -0.8325, -0.6694, -0.6056,  0.6544
 0.3821,  0.1476,  0.7466, -0.5107,  0.2592,  0.1648
 0.7265,  0.9683, -0.9803, -0.4943, -0.5523,  0.2454
-0.9049, -0.9797, -0.0196, -0.9090, -0.4433,  0.6447
-0.4607,  0.1811, -0.2389,  0.4050, -0.0078,  0.5229
 0.2664, -0.2932, -0.4259, -0.7336,  0.8742,  0.1834
-0.4507,  0.1029, -0.6294, -0.1158, -0.6294,  0.6081
 0.8948, -0.0124,  0.9278,  0.2899, -0.0314,  0.1534
-0.1323, -0.8813, -0.0146, -0.0697,  0.6135,  0.2386

Implementing Quadratic Regression with SGD Training from Scratch Using JavaScript

The goal of a machine learning regression problem is to predict a single numeric value. For example, you might want to predict an employee’s salary based on age, height, high school grade point average, and so on. There are approximately a dozen common regression techniques. The most basic technique is called linear regression.

Linear regression is simple, but it doesn’t work well with data that has a non-linear structure, or data that has interaction between predictor variables. Quadratic regression extends basic linear regression to handle complex data.

I put together a demo, from scratch, using node.js JavaScript. The demo data is synthetic and looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
. . .

The first five values on each line are the x predictors. The last value on each line is the target y variable to predict. There are 200 training items and 40 test items. The data was generated by a neural network with small random weights and biases.

The output of the demo program is:

Begin quadratic regression with SGD training
 using node.js JavaScript

Loading train (200) and test (40) from file

First three train X:
 -0.1660   0.4406  -0.9998  -0.3953  -0.7065
  0.0776  -0.1616   0.3704  -0.5911   0.7562
 -0.9452   0.3409  -0.1654   0.1174  -0.7192

First three train y:
   0.4840
   0.1568
   0.8054

Creating quadratic regression model

Setting lrnRate = 0.001
Setting maxEpochs = 1000

Starting SGD training
epoch =      0  MSE = 0.0925  acc = 0.0100
epoch =    200  MSE = 0.0003  acc = 0.9050
epoch =    400  MSE = 0.0003  acc = 0.8850
epoch =    600  MSE = 0.0003  acc = 0.8850
epoch =    800  MSE = 0.0003  acc = 0.8850
Done

Model base weights:
 -0.2630  0.0354 -0.0420  0.0341 -0.1124

Model quadratic weights:
  0.0655  0.0194  0.0051  0.0047  0.0243

Model interaction weights:
  0.0043  0.0249  0.0071  0.1081 -0.0012 -0.0093
  0.0362  0.0085 -0.0568  0.0016

Model bias: 0.3220

Computing model accuracy

Train acc (within 0.10) = 0.8850
Test acc (within 0.10) = 0.9250

Train MSE = 0.0003
Test MSE = 0.0005

Predicting for x =
  -0.1660    0.4406   -0.9998   -0.3953   -0.7065
Predicted y = 0.4843

End demo

The quadratic regression model accuracy is very good. A prediction is scored as accurate if it’s within 10% of the true target y value.

Quadratic regression is an extension of linear regression. Suppose, as in the demo data, each data item has five predictors, (x0, x1, x2, x3, x4). The prediction equation for basic linear regression is:

y' = (w0 * x0) + (w1 * x1) + (w2 * x2) +
     (w3 * x3) + (w4 * x4) + b

The wi are model weights (aka coefficients), and b is the model bias (aka intercept). The values of the weights and the bias must be determined by training, so that predicted y’ values are close to the known, correct y values in a set of training data.

The prediction equation for quadratic regression is:

y' = (w0 * x0) + (w1 * x1) + (w2 * x2) + 
     (w3 * x3) + (w4 * x4) + 

     (w5 * x0*x0) + (w6 * x1*x1) + (w7 * x2*x2) +
     (w8 * x3*x3) + (w9 * x4*x4) + 

     (w10 * x0*x1) + (w11 * x0*x2) + (w12 * x0*x3) +
     (w13 * x0*x4) + (w14 * x1*x2) + (w15 * x1*x3) +
     (w16 * x1*x4) + (w17 * x2*x3) + (w18 * x2*x4) + 
     (w19 * x3*x4) 

     + b

The squared (aka “quadratic”) xi^2 terms handle non-linear structure. If there are n predictors, there are also n squared terms. The xi * xj terms between all possible pairs of original predictors handle interactions between predictors. If there are n predictors, there are (n * (n-1)) / 2 interaction terms.
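
For example, with the demo data’s n = 5 predictors there are 5 base terms, 5 squared terms, and (5 * 4) / 2 = 10 interaction terms, for a total of 20 weights plus the bias. This matches the 5 + 5 + 10 model weights displayed in the demo output above.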

Behind the scenes, the derived xi^2 squared terms and the derived xi * xj interaction terms are computed programmatically on the fly, as opposed to explicitly creating an augmented static dataset. Even though the derived terms are products of the base predictors, the model is still linear in its weights.

Quadratic regression is a subset of polynomial regression. For example, cubic regression includes xi^3 terms, quartic regression includes xi^4 terms, and so on.

Quadratic regression is a superset of linear regression with two-way interactions. Linear regression with two-way interactions includes only the derived xi * xj terms, and omits the xi^2 terms.

Good fun.



Quadratic regression was an interesting step forward in the history of machine learning. I’ve always been interested in all kinds of history.

The “Vagabond” (1962) pinball machine by Williams was an important step forward in the evolution of pinball machines. “Vagabond” was the first machine to feature drop targets, which are like small signs sticking up from the playing surface that, when hit, score points and then drop below the playing surface. Almost all modern pinball machines have drop targets.

I’ve circled in red the location of the drop target, which is retracted below the playing surface. The inset at bottom shows the raised target. This machine introduced a nice innovation, but I can’t say I care much for the machine artwork.


Demo program. Replace “lt” (less than), “gt”, “lte”, “gte”, with Boolean operator symbols (my blog editor chokes on symbols).

// quadratic_regression_sgd.js
// quadratic regression using SGD training
// node.js

let FS = require("fs")  // for loadTxt()

// ----------------------------------------------------------

class QuadraticRegressor
{
  constructor(seed)
  {
    this.weights;     // allocated in train()
    this.bias;    // supplied in train()
    this.seed = seed + 0.5;  // avoid 0
  }

  // --------------------------------------------------------

  predict(x)
  {
    let dim = x.length;
    let result = 0.0;

    let p = 0; // ptr into wts
    for (let i = 0; i "lt" dim; ++i)   // regular
      result += x[i] * this.weights[p++];

    for (let i = 0; i "lt" dim; ++i)  // quadratic
      result += x[i] * x[i] * this.weights[p++];

    for (let i = 0; i "lt" dim-1; ++i)  // interactions
      for (let j = i+1; j "lt" dim; ++j)
        result += x[i] * x[j] * this.weights[p++]; 
 
    result += this.bias;
    return result;
  }

  // --------------------------------------------------------

  train(trainX, trainY, lrnRate, maxEpochs)
  {
    let nRows = trainX.length;
    let dim = trainX[0].length;    // number predictors
    let nInteractions = (dim * (dim - 1)) / 2;
    this.weights = vecMake(dim + dim + nInteractions, 0.0);


    let low = -0.01; let hi = 0.01;
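    // note: only the first dim (base) weights get small random initial
    // values here; the quadratic and interaction weights start at 0.0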
    for (let i = 0; i "lt" dim; ++i)
      this.weights[i] = (hi - low) *
        this.next() + low;

    this.bias = (hi - low) *
      this.next() + low;

    let indices = vecMake(nRows, 0.0); // for shuffling
    for (let i = 0; i "lt" nRows; ++i)
      indices[i] = i;

    for (let epoch = 0; epoch "lt" maxEpochs; ++epoch) {
      // shuffle order of train data, Fisher-Yates
      let n = indices.length;
      for (let i = 0; i "lt" n; ++i) {
        let ri = this.nextInt(i, n);
        let tmp = indices[i];
        indices[i] = indices[ri];
        indices[ri] = tmp;
      }

      for (let i = 0; i "lt" nRows; ++i) {
        let ii = indices[i];
        let x = trainX[ii];
        let predY = this.predict(x);
        let actualY = trainY[ii];

        let p = 0; // points into weights
        // update regular weights
        for (let j = 0; j "lt" dim; ++j)
          this.weights[p++] -= lrnRate *
            (predY - actualY) * x[j];

        // update quadratic weights
        for (let j = 0; j "lt" dim; ++j)
          this.weights[p++] -= lrnRate *
            (predY - actualY) * x[j] * x[j];

        // update interaction weights
        for (let j = 0; j "lt" dim - 1; ++j)
          for (let k = j + 1; k "lt" dim; ++k)
             this.weights[p++] -= lrnRate *
              (predY - actualY) * x[j] * x[k];
 
        // update the bias
        this.bias -= lrnRate * (predY - actualY) * 1.0;
      }
      if (epoch % (maxEpochs / 5) == 0) // show progress
      {
        let mse = this.mse(trainX, trainY);
        let acc = this.accuracy(trainX, trainY, 0.10);
        let s1 = "epoch = " +
          epoch.toString().padStart(6, ' ');
        let s2 = "  MSE = " +
          mse.toFixed(4).toString();
        let s3 = "  acc = " + acc.toFixed(4).toString();
        console.log(s1 + s2 + s3);
      }
    } // epoch

    // apply wt decay ~ L2 regularization
    // for (let j = 0; j "lt" dim; ++j)
    //   this.wts[j] *= (1.0 - this.alpha);

  } // train()

  // --------------------------------------------------------

  accuracy(dataX, dataY, pctClose)
  {
    let nCorrect = 0; let nWrong = 0;
    let N = dataX.length;
    
    for (let i = 0; i "lt" N; ++i) {
      let x = dataX[i];
      let actualY = dataY[i];
      let predY = this.predict(x);
      if (Math.abs(predY - actualY) "lt" 
        Math.abs(pctClose * actualY)) {
        ++nCorrect;
      }
      else {
        ++nWrong;
      }
    }
    return (nCorrect * 1.0) / (nCorrect + nWrong);
  }

  // --------------------------------------------------------

  mse(dataX, dataY)
  {
    let n = dataX.length;
    let sum = 0.0;
    for (let i = 0; i "lt" n; ++i) {
      let x = dataX[i];
      let actualY = dataY[i];
      let predY = this.predict(x);
      sum += (actualY - predY) * (actualY - predY);
    }
    return sum / n;
  }

  // --------------------------------------------------------

  next()
  {
    // return sort-of-random in [0.0, 1.0)
    let x = Math.sin(this.seed) * 1000;
    let result = x - Math.floor(x);  // [0.0,1.0)
    this.seed = result;  // for next call
    return result;
  }

  // --------------------------------------------------------

  nextInt(lo, hi)
  {
    let x = this.next();
    return Math.trunc((hi - lo) * x + lo);
  }

  // --------------------------------------------------------

  shuffle(indices)
  {
    // Fisher-Yates
    for (let i = 0; i "lt" indices.length; ++i) {
      let ri = this.nextInt(i, indices.length);
      let tmp = indices[ri];
      indices[ri] = indices[i];
      indices[i] = tmp;
      //indices[i] = i; // for testing
    }
  }

} // end class QuadraticRegressor

// ----------------------------------------------------------
// vector and matrix functions
// ----------------------------------------------------------

function vecMake(n, val)
{
  let result = [];
  for (let i = 0; i "lt" n; ++i) {
    result[i] = val;
  }
  return result;
}

// ----------------------------------------------------------

function matMake(nRows, nCols, val)
{
  let result = [];
  for (let i = 0; i "lt" nRows; ++i) {
    result[i] = [];
    for (let j = 0; j "lt" nCols; ++j) {
      result[i][j] = val;
    }
  }
  return result;
}

// ----------------------------------------------------------

function vecShow(vec, dec, wid, nl)
{
  let small = 1.0 / Math.pow(10, dec);
  for (let i = 0; i "lt" vec.length; ++i) {
    let x = vec[i];
    if (Math.abs(x) "lt" small) x = 0.0;  // avoid -0.00
    let xx = x.toFixed(dec);
    let s = xx.toString().padStart(wid, ' ');
    process.stdout.write(s);
    process.stdout.write(" ");
  }

  if (nl == true)
    process.stdout.write("\n");
}

// ----------------------------------------------------------

function matShow(A, dec, wid)
{
  let small = 1.0 / Math.pow(10, dec);
  let nr = A.length;
  let nc = A[0].length;
  for (let i = 0; i "lt" nr; ++i) {
    for (let j = 0; j "lt" nc; ++j) {
      let x = A[i][j];
      if (Math.abs(x) "lt" small) x = 0.0;
      let xx = x.toFixed(dec);
      let s = xx.toString().padStart(wid, ' ');
      process.stdout.write(s);
      process.stdout.write(" ");
    }
    process.stdout.write("\n");
  }
}

// ----------------------------------------------------------

function matToVec(M)
{
  let nr = M.length;
  let nc = M[0].length;
  let result = vecMake(nr*nc, 0.0);
  let k = 0;
  for (let i = 0; i "lt" nr; ++i) {
    for (let j = 0; j "lt" nc; ++j) {
      result[k++] = M[i][j];
    }
  }
  return result;
}

// ----------------------------------------------------------

function loadTxt(fn, delimit, usecols, comment)
{
  // efficient but mildly complicated
  let all = FS.readFileSync(fn, "utf8");  // giant string
  all = all.trim();  // strip final crlf in file
  let lines = all.split("\n");  // array of lines

  // count number non-comment lines
  let nRows = 0;
  for (let i = 0; i "lt" lines.length; ++i) {
    if (!lines[i].startsWith(comment))
      ++nRows;
  }
  let nCols = usecols.length;
  let result = matMake(nRows, nCols, 0.0); 
 
  let r = 0;  // into lines
  let i = 0;  // into result[][]
  while (r "lt" lines.length) {
    if (lines[r].startsWith(comment)) {
      ++r;  // next row
    }
    else {
      let tokens = lines[r].split(delimit);
      for (let j = 0; j "lt" nCols; ++j) {
        result[i][j] = parseFloat(tokens[usecols[j]]);
      }
      ++r;
      ++i;
    }
  }

  return result;
}

// ----------------------------------------------------------

function main()
{
  console.log("\nBegin quadratic regression " +
    "with SGD training using node.js JavaScript ");

  // 1. load data
  console.log("\nLoading train (200) and" +
    " test (40) from file ");

  let trainFile = ".\\Data\\synthetic_train_200.txt";
  let trainX = loadTxt(trainFile, ",", [0,1,2,3,4], "#");
  let trainY = loadTxt(trainFile, ",", [5], "#");
  trainY = matToVec(trainY);
  
  let testFile = ".\\Data\\synthetic_test_40.txt";
  let testX = loadTxt(testFile, ",", [0,1,2,3,4], "#");
  let testY = loadTxt(testFile, ",", [5], "#");
  testY = matToVec(testY);

  console.log("\nFirst three train X: ");
  for (let i = 0; i "lt" 3; ++i)
    vecShow(trainX[i], 4, 8, true);

  console.log("\nFirst three train y: ");
  for (let i = 0; i "lt" 3; ++i)
    console.log(trainY[i].toFixed(4).toString().
    padStart(9, ' '));

  // 2. create and train quadratic regression model
  let seed = 0;
  console.log("\nCreating quadratic regression model ");
  let model = new QuadraticRegressor(seed);

  let lrnRate = 0.001;
  let maxEpochs = 1000;
  console.log("\nSetting lrnRate = " +
    lrnRate.toFixed(3).toString());
  console.log("Setting maxEpochs = " +
    maxEpochs.toString());
  console.log("\nStarting SGD training ");
  model.train(trainX, trainY, lrnRate, maxEpochs);
  console.log("Done ");

  // 3. show model weights
  console.log("\nModel base weights: ");
  let dim = trainX[0].length;
  for (let i = 0; i "lt" dim; ++i)
    process.stdout.write(model.weights[i].toFixed(4).
    toString().padStart(8, ' '));
  console.log("");

  console.log("\nModel quadratic weights: ");
  for (let i = dim; i "lt" dim + dim; ++i)
    process.stdout.write(model.weights[i].toFixed(4).
    toString().padStart(8, ' '));
  console.log("");

  console.log("\nModel interaction weights: ");
  for (let i = dim + dim; i "lt" model.weights.length; ++i) {
    process.stdout.write(model.weights[i].toFixed(4).
    toString().padStart(8, ' '));
    if (i "gt" dim+dim && i % dim == 0)
      console.log("");
  }
  console.log("");

  console.log("\nModel bias: " + 
    model.bias.toFixed(4).toString());

  // 4. evaluate
  console.log("\nComputing model accuracy ");
  let trainAcc = model.accuracy(trainX, trainY, 0.10);
  let testAcc = model.accuracy(testX, testY, 0.10);

  console.log("\nTrain acc (within 0.10) = " +
    trainAcc.toFixed(4).toString());
  console.log("Test acc (within 0.10) = " +
    testAcc.toFixed(4).toString());

  let trainMSE = model.mse(trainX, trainY);
  let testMSE = model.mse(testX, testY);

  console.log("\nTrain MSE = " +
    trainMSE.toFixed(4).toString());
  console.log("Test MSE = " +
    testMSE.toFixed(4).toString());

  // 5. use model
  let x = trainX[0];
  console.log("\nPredicting for x = ");
  vecShow(x, 4, 9, true);  // add newline

  let predY = model.predict(x);
  console.log("Predicted y = " + 
    predY.toFixed(4).toString());

  console.log("\nEnd demo");
}

main();

Training data:

# synthetic_train_200.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399
-0.4858, -0.7782, -0.6141, -0.0008,  0.4572,  0.4197
 0.7033, -0.1683,  0.2334, -0.5327, -0.7961,  0.1776
 0.0317, -0.0457, -0.6947,  0.2436,  0.0880,  0.3345
 0.5031, -0.5559,  0.0387,  0.5706, -0.9553,  0.3107
-0.3513,  0.7458,  0.6894,  0.0769,  0.7332,  0.3170
 0.2205,  0.5992, -0.9309,  0.5405,  0.4635,  0.3532
-0.4806, -0.4859,  0.2646, -0.3094,  0.5932,  0.3202
 0.9809, -0.3995, -0.7140,  0.8026,  0.0831,  0.1600
 0.9495,  0.2732,  0.9878,  0.0921,  0.0529,  0.1289
-0.9476, -0.6792,  0.4913, -0.9392, -0.2669,  0.5966
 0.7247,  0.3854,  0.3819, -0.6227, -0.1162,  0.1550
-0.5922, -0.5045, -0.4757,  0.5003, -0.0860,  0.5863
-0.8861,  0.0170, -0.5761,  0.5972, -0.4053,  0.7301
 0.6877, -0.2380,  0.4997,  0.0223,  0.0819,  0.1404
 0.9189,  0.6079, -0.9354,  0.4188, -0.0700,  0.1907
-0.1428, -0.7820,  0.2676,  0.6059,  0.3936,  0.2790
 0.5324, -0.3151,  0.6917, -0.1425,  0.6480,  0.1071
-0.8432, -0.9633, -0.8666, -0.0828, -0.7733,  0.7784
-0.9444,  0.5097, -0.2103,  0.4939, -0.0952,  0.6787
-0.0520,  0.6063, -0.1952,  0.8094, -0.9259,  0.4836
 0.5477, -0.7487,  0.2370, -0.9793,  0.0773,  0.1241
 0.2450,  0.8116,  0.9799,  0.4222,  0.4636,  0.2355
 0.8186, -0.1983, -0.5003, -0.6531, -0.7611,  0.1511
-0.4714,  0.6382, -0.3788,  0.9648, -0.4667,  0.5950
 0.0673, -0.3711,  0.8215, -0.2669, -0.1328,  0.2677
-0.9381,  0.4338,  0.7820, -0.9454,  0.0441,  0.5518
-0.3480,  0.7190,  0.1170,  0.3805, -0.0943,  0.4724
-0.9813,  0.1535, -0.3771,  0.0345,  0.8328,  0.5438
-0.1471, -0.5052, -0.2574,  0.8637,  0.8737,  0.3042
-0.5454, -0.3712, -0.6505,  0.2142, -0.1728,  0.5783
 0.6327, -0.6297,  0.4038, -0.5193,  0.1484,  0.1153
-0.5424,  0.3282, -0.0055,  0.0380, -0.6506,  0.6613
 0.1414,  0.9935,  0.6337,  0.1887,  0.9520,  0.2540
-0.9351, -0.8128, -0.8693, -0.0965, -0.2491,  0.7353
 0.9507, -0.6640,  0.9456,  0.5349,  0.6485,  0.1059
-0.0462, -0.9737, -0.2940, -0.0159,  0.4602,  0.2606
-0.0627, -0.0852, -0.7247, -0.9782,  0.5166,  0.2977
 0.0478,  0.5098, -0.0723, -0.7504, -0.3750,  0.3335
 0.0090,  0.3477,  0.5403, -0.7393, -0.9542,  0.4415
-0.9748,  0.3449,  0.3736, -0.1015,  0.8296,  0.4358
 0.2887, -0.9895, -0.0311,  0.7186,  0.6608,  0.2057
 0.1570, -0.4518,  0.1211,  0.3435, -0.2951,  0.3244
 0.7117, -0.6099,  0.4946, -0.4208,  0.5476,  0.1096
-0.2929, -0.5726,  0.5346, -0.3827,  0.4665,  0.2465
 0.4889, -0.5572, -0.5718, -0.6021, -0.7150,  0.2163
-0.7782,  0.3491,  0.5996, -0.8389, -0.5366,  0.6516
-0.5847,  0.8347,  0.4226,  0.1078, -0.3910,  0.6134
 0.8469,  0.4121, -0.0439, -0.7476,  0.9521,  0.1571
-0.6803, -0.5948, -0.1376, -0.1916, -0.7065,  0.7156
 0.2878,  0.5086, -0.5785,  0.2019,  0.4979,  0.2980
 0.2764,  0.1943, -0.4090,  0.4632,  0.8906,  0.2960
-0.8877,  0.6705, -0.6155, -0.2098, -0.3998,  0.7107
-0.8398,  0.8093, -0.2597,  0.0614, -0.0118,  0.6502
-0.8476,  0.0158, -0.4769, -0.2859, -0.7839,  0.7715
 0.5751, -0.7868,  0.9714, -0.6457,  0.1448,  0.1175
 0.4802, -0.7001,  0.1022, -0.5668,  0.5184,  0.1090
 0.4458, -0.6469,  0.7239, -0.9604,  0.7205,  0.0779
 0.5175,  0.4339,  0.9747, -0.4438, -0.9924,  0.2879
 0.8678,  0.7158,  0.4577,  0.0334,  0.4139,  0.1678
 0.5406,  0.5012,  0.2264, -0.1963,  0.3946,  0.2088
-0.9938,  0.5498,  0.7928, -0.5214, -0.7585,  0.7687
 0.7661,  0.0863, -0.4266, -0.7233, -0.4197,  0.1466
 0.2277, -0.3517, -0.0853, -0.1118,  0.6563,  0.1767
 0.3499, -0.5570, -0.0655, -0.3705,  0.2537,  0.1632
 0.7547, -0.1046,  0.5689, -0.0861,  0.3125,  0.1257
 0.8186,  0.2110,  0.5335,  0.0094, -0.0039,  0.1391
 0.6858, -0.8644,  0.1465,  0.8855,  0.0357,  0.1845
-0.4967,  0.4015,  0.0805,  0.8977,  0.2487,  0.4663
 0.6760, -0.9841,  0.9787, -0.8446, -0.3557,  0.1509
-0.1203, -0.4885,  0.6054, -0.0443, -0.7313,  0.4854
 0.8557,  0.7919, -0.0169,  0.7134, -0.1628,  0.2002
 0.0115, -0.6209,  0.9300, -0.4116, -0.7931,  0.4052
-0.7114, -0.9718,  0.4319,  0.1290,  0.5892,  0.3661
 0.3915,  0.5557, -0.1870,  0.2955, -0.6404,  0.2954
-0.3564, -0.6548, -0.1827, -0.5172, -0.1862,  0.4622
 0.2392, -0.4959,  0.5857, -0.1341, -0.2850,  0.2470
-0.3394,  0.3947, -0.4627,  0.6166, -0.4094,  0.5325
 0.7107,  0.7768, -0.6312,  0.1707,  0.7964,  0.2757
-0.1078,  0.8437, -0.4420,  0.2177,  0.3649,  0.4028
-0.3139,  0.5595, -0.6505, -0.3161, -0.7108,  0.5546
 0.4335,  0.3986,  0.3770, -0.4932,  0.3847,  0.1810
-0.2562, -0.2894, -0.8847,  0.2633,  0.4146,  0.4036
 0.2272,  0.2966, -0.6601, -0.7011,  0.0284,  0.2778
-0.0743, -0.1421, -0.0054, -0.6770, -0.3151,  0.3597
-0.4762,  0.6891,  0.6007, -0.1467,  0.2140,  0.4266
-0.4061,  0.7193,  0.3432,  0.2669, -0.7505,  0.6147
-0.0588,  0.9731,  0.8966,  0.2902, -0.6966,  0.4955
-0.0627, -0.1439,  0.1985,  0.6999,  0.5022,  0.3077
 0.1587,  0.8494, -0.8705,  0.9827, -0.8940,  0.4263
-0.7850,  0.2473, -0.9040, -0.4308, -0.8779,  0.7199
 0.4070,  0.3369, -0.2428, -0.6236,  0.4940,  0.2215
-0.0242,  0.0513, -0.9430,  0.2885, -0.2987,  0.3947
-0.5416, -0.1322, -0.2351, -0.0604,  0.9590,  0.3683
 0.1055,  0.7783, -0.2901, -0.5090,  0.8220,  0.2984
-0.9129,  0.9015,  0.1128, -0.2473,  0.9901,  0.4776
-0.9378,  0.1424, -0.6391,  0.2619,  0.9618,  0.5368
 0.7498, -0.0963,  0.4169,  0.5549, -0.0103,  0.1614
-0.2612, -0.7156,  0.4538, -0.0460, -0.1022,  0.3717
 0.7720,  0.0552, -0.1818, -0.4622, -0.8560,  0.1685
-0.4177,  0.0070,  0.9319, -0.7812,  0.3461,  0.3052
-0.0001,  0.5542, -0.7128, -0.8336, -0.2016,  0.3803
 0.5356, -0.4194, -0.5662, -0.9666, -0.2027,  0.1776
-0.2378,  0.3187, -0.8582, -0.6948, -0.9668,  0.5474
-0.1947, -0.3579,  0.1158,  0.9869,  0.6690,  0.2992
 0.3992,  0.8365, -0.9205, -0.8593, -0.0520,  0.3154
-0.0209,  0.0793,  0.7905, -0.1067,  0.7541,  0.1864
-0.4928, -0.4524, -0.3433,  0.0951, -0.5597,  0.6261
-0.8118,  0.7404, -0.5263, -0.2280,  0.1431,  0.6349
 0.0516, -0.8480,  0.7483,  0.9023,  0.6250,  0.1959
-0.3212,  0.1093,  0.9488, -0.3766,  0.3376,  0.2735
-0.3481,  0.5490, -0.3484,  0.7797,  0.5034,  0.4379
-0.5785, -0.9170, -0.3563, -0.9258,  0.3877,  0.4121
 0.3407, -0.1391,  0.5356,  0.0720, -0.9203,  0.3458
-0.3287, -0.8954,  0.2102,  0.0241,  0.2349,  0.3247
-0.1353,  0.6954, -0.0919, -0.9692,  0.7461,  0.3338
 0.9036, -0.8982, -0.5299, -0.8733, -0.1567,  0.1187
 0.7277, -0.8368, -0.0538, -0.7489,  0.5458,  0.0830
 0.9049,  0.8878,  0.2279,  0.9470, -0.3103,  0.2194
 0.7957, -0.1308, -0.5284,  0.8817,  0.3684,  0.2172
 0.4647, -0.4931,  0.2010,  0.6292, -0.8918,  0.3371
-0.7390,  0.6849,  0.2367,  0.0626, -0.5034,  0.7039
-0.1567, -0.8711,  0.7940, -0.5932,  0.6525,  0.1710
 0.7635, -0.0265,  0.1969,  0.0545,  0.2496,  0.1445
 0.7675,  0.1354, -0.7698, -0.5460,  0.1920,  0.1728
-0.5211, -0.7372, -0.6763,  0.6897,  0.2044,  0.5217
 0.1913,  0.1980,  0.2314, -0.8816,  0.5006,  0.1998
 0.8964,  0.0694, -0.6149,  0.5059, -0.9854,  0.1825
 0.1767,  0.7104,  0.2093,  0.6452,  0.7590,  0.2832
-0.3580, -0.7541,  0.4426, -0.1193, -0.7465,  0.5657
-0.5996,  0.5766, -0.9758, -0.3933, -0.9572,  0.6800
 0.9950,  0.1641, -0.4132,  0.8579,  0.0142,  0.2003
-0.4717, -0.3894, -0.2567, -0.5111,  0.1691,  0.4266
 0.3917, -0.8561,  0.9422,  0.5061,  0.6123,  0.1212
-0.0366, -0.1087,  0.3449, -0.1025,  0.4086,  0.2475
 0.3633,  0.3943,  0.2372, -0.6980,  0.5216,  0.1925
-0.5325, -0.6466, -0.2178, -0.3589,  0.6310,  0.3568
 0.2271,  0.5200, -0.1447, -0.8011, -0.7699,  0.3128
 0.6415,  0.1993,  0.3777, -0.0178, -0.8237,  0.2181
-0.5298, -0.0768, -0.6028, -0.9490,  0.4588,  0.4356
 0.6870, -0.1431,  0.7294,  0.3141,  0.1621,  0.1632
-0.5985,  0.0591,  0.7889, -0.3900,  0.7419,  0.2945
 0.3661,  0.7984, -0.8486,  0.7572, -0.6183,  0.3449
 0.6995,  0.3342, -0.3113, -0.6972,  0.2707,  0.1712
 0.2565,  0.9126,  0.1798, -0.6043, -0.1413,  0.2893
-0.3265,  0.9839, -0.2395,  0.9854,  0.0376,  0.4770
 0.2690, -0.1722,  0.9818,  0.8599, -0.7015,  0.3954
-0.2102, -0.0768,  0.1219,  0.5607, -0.0256,  0.3949
 0.8216, -0.9555,  0.6422, -0.6231,  0.3715,  0.0801
-0.2896,  0.9484, -0.7545, -0.6249,  0.7789,  0.4370
-0.9985, -0.5448, -0.7092, -0.5931,  0.7926,  0.5402

Test data:

# synthetic_test_40.txt
#
 0.7462,  0.4006, -0.0590,  0.6543, -0.0083,  0.1935
 0.8495, -0.2260, -0.0142, -0.4911,  0.7699,  0.1078
-0.2335, -0.4049,  0.4352, -0.6183, -0.7636,  0.5088
 0.1810, -0.5142,  0.2465,  0.2767, -0.3449,  0.3136
-0.8650,  0.7611, -0.0801,  0.5277, -0.4922,  0.7140
-0.2358, -0.7466, -0.5115, -0.8413, -0.3943,  0.4533
 0.4834,  0.2300,  0.3448, -0.9832,  0.3568,  0.1360
-0.6502, -0.6300,  0.6885,  0.9652,  0.8275,  0.3046
-0.3053,  0.5604,  0.0929,  0.6329, -0.0325,  0.4756
-0.7995,  0.0740, -0.2680,  0.2086,  0.9176,  0.4565
-0.2144, -0.2141,  0.5813,  0.2902, -0.2122,  0.4119
-0.7278, -0.0987, -0.3312, -0.5641,  0.8515,  0.4438
 0.3793,  0.1976,  0.4933,  0.0839,  0.4011,  0.1905
-0.8568,  0.9573, -0.5272,  0.3212, -0.8207,  0.7415
-0.5785,  0.0056, -0.7901, -0.2223,  0.0760,  0.5551
 0.0735, -0.2188,  0.3925,  0.3570,  0.3746,  0.2191
 0.1230, -0.2838,  0.2262,  0.8715,  0.1938,  0.2878
 0.4792, -0.9248,  0.5295,  0.0366, -0.9894,  0.3149
-0.4456,  0.0697,  0.5359, -0.8938,  0.0981,  0.3879
 0.8629, -0.8505, -0.4464,  0.8385,  0.5300,  0.1769
 0.1995,  0.6659,  0.7921,  0.9454,  0.9970,  0.2330
-0.0249, -0.3066, -0.2927, -0.4923,  0.8220,  0.2437
 0.4513, -0.9481, -0.0770, -0.4374, -0.9421,  0.2879
-0.3405,  0.5931, -0.3507, -0.3842,  0.8562,  0.3987
 0.9538,  0.0471,  0.9039,  0.7760,  0.0361,  0.1706
-0.0887,  0.2104,  0.9808,  0.5478, -0.3314,  0.4128
-0.8220, -0.6302,  0.0537, -0.1658,  0.6013,  0.4306
-0.4123, -0.2880,  0.9074, -0.0461, -0.4435,  0.5144
 0.0060,  0.2867, -0.7775,  0.5161,  0.7039,  0.3599
-0.7968, -0.5484,  0.9426, -0.4308,  0.8148,  0.2979
 0.7811,  0.8450, -0.6877,  0.7594,  0.2640,  0.2362
-0.6802, -0.1113, -0.8325, -0.6694, -0.6056,  0.6544
 0.3821,  0.1476,  0.7466, -0.5107,  0.2592,  0.1648
 0.7265,  0.9683, -0.9803, -0.4943, -0.5523,  0.2454
-0.9049, -0.9797, -0.0196, -0.9090, -0.4433,  0.6447
-0.4607,  0.1811, -0.2389,  0.4050, -0.0078,  0.5229
 0.2664, -0.2932, -0.4259, -0.7336,  0.8742,  0.1834
-0.4507,  0.1029, -0.6294, -0.1158, -0.6294,  0.6081
 0.8948, -0.0124,  0.9278,  0.2899, -0.0314,  0.1534
-0.1323, -0.8813, -0.0146, -0.0697,  0.6135,  0.2386
Posted in JavaScript, Machine Learning | Leave a comment

Singular Value Decomposition From Scratch Using Python – AI Stuns Me

Briefly, AI was able to generate from-scratch Python code for a function that computes the singular value decomposition (SVD) of a matrix — one of the most complicated functions I’ve come across in 40+ years of software engineering. The AI-generated code isn’t perfect, but the process is still impressive.

Where do I begin? I’ve looked at SVD for literally decades. I learned early on that SVD is insanely complicated. It took dozens of mathematicians several decades of research to come up with a stable algorithm. The standard implementation (LAPACK) is over 10,000 lines of Fortran code.

I sat down one morning and fired up ChatGPT, and just for hoots, I asked it to generate from-scratch Python code for SVD. For the next several hours, ChatGPT gave me version after version that almost worked but failed for various reasons. I told the AI why the current version failed and it gave me a corrected version. Each new version got better and better.

The real point here is that each iteration from the AI took only a few seconds to generate, compared to traditional development where each version would take hours or days.

In just a few hours, using AI, I was able to get a working version of from-scratch SVD. Without AI, I estimate it would have taken me no less than 1,000 hours of full-time effort. The code still has bugs, but it’s close to correct.

Wow. Just Amazing.

The output of my demo:

Begin SVD demo

TALL MATRIX

A =
[[ 3  1  1]
 [-1  3  1]
 [ 0  2  2]
 [ 4  4 -2]]

U =
[[-0.39318649  0.10749453 -0.80336030]
 [-0.21213925 -0.74140087  0.30624047]
 [-0.18505920 -0.60915594 -0.35957963]
 [-0.87530247  0.26018976  0.36267270]]

s =
[6.67747474 3.91438086 2.46758051]

Vt =
[[-0.66920959 -0.73394999  0.11608592]
 [ 0.53766957 -0.58611081 -0.60612338]
 [-0.51290346  0.34320770 -0.78685355]]

reconstructed A
[[ 3.00000000  1.00000000  1.00000000]
 [-1.00000000  3.00000000  1.00000000]
 [ 0.00000000  2.00000000  2.00000000]
 [ 4.00000000  4.00000000 -2.00000000]]

====================

np.linalg.svd results:

U =
[[-0.39318649  0.10749453 -0.80336030]
 [-0.21213925 -0.74140087  0.30624047]
 [-0.18505920 -0.60915594 -0.35957963]
 [-0.87530247  0.26018976  0.36267270]]

s =
[6.67747474 3.91438086 2.46758051]

Vt =
[[-0.66920959 -0.73394999  0.11608592]
 [ 0.53766957 -0.58611081 -0.60612338]
 [-0.51290346  0.34320770 -0.78685355]]

reconstructed A
[[ 3.00000000  1.00000000  1.00000000]
 [-1.00000000  3.00000000  1.00000000]
 [ 0.00000000  2.00000000  2.00000000]
 [ 4.00000000  4.00000000 -2.00000000]]

WIDE MATRIX

A =
[[ 3  1  1  0  5]
 [-1  3  1 -2  4]
 [ 0  2  2  1 -3]]

U =
[[-0.72714903  0.19514593 -0.65815830]
 [-0.62191202 -0.59319511  0.51121913]
 [ 0.29065395 -0.78104906 -0.55270485]]

s =
[7.63921810 4.02386022 3.23278451]

Vt =
[[-0.20414852 -0.26332239 -0.10050154  0.20086846 -0.91571611]
 [ 0.29291099 -0.78196989 -0.48713106  0.10073440  0.23512158]
 [-0.76890186 -0.07111844 -0.38739015 -0.48724037  0.12750604]]

reconstructed A
[[ 3.00000000  1.00000000  1.00000000  0.00000000  5.00000000]
 [-1.00000000  3.00000000  1.00000000 -2.00000000  4.00000000]
 [ 0.00000000  2.00000000  2.00000000  1.00000000 -3.00000000]]

====================

np.linalg.svd results:

U =
[[-0.72714903 -0.19514593 -0.65815830]
 [-0.62191202  0.59319511  0.51121913]
 [ 0.29065395  0.78104906 -0.55270485]]

s =
[7.63921810 4.02386022 3.23278451]

Vt =
[[-0.20414852 -0.26332239 -0.10050154  0.20086846 -0.91571611]
 [-0.29291099  0.78196989  0.48713106 -0.10073440 -0.23512158]
 [-0.76890186 -0.07111844 -0.38739015 -0.48724037  0.12750604]]

reconstructed A
[[ 3.00000000  1.00000000  1.00000000 -0.00000000  5.00000000]
 [-1.00000000  3.00000000  1.00000000 -2.00000000  4.00000000]
 [-0.00000000  2.00000000  2.00000000  1.00000000 -3.00000000]]

End demo

The code is from-scratch Python, but it uses a lot of NumPy calls such as np.diag(), and lots of matrix and array slicing. Maybe I’ll refactor the code to eliminate those calls by writing helper functions such as my_diag(), but the code is very complex, and such a refactoring would take a big effort.
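
Just to illustrate what such a refactoring might look like, here is a minimal sketch of a hypothetical my_diag() helper. The name comes from my comment above, but the exact behavior shown (vector-to-diagonal-matrix and matrix-diagonal-extraction, the two ways np.diag() is typically used) is an assumption for illustration, not code from the demo.

# my_diag_sketch.py
# hypothetical helper to replace np.diag() calls

def my_diag(x):
  # if x is a 1-D list of values, build a square matrix
  # with x on the diagonal and 0.0 everywhere else
  if not isinstance(x[0], list):
    n = len(x)
    result = [[0.0 for _ in range(n)] for _ in range(n)]
    for i in range(n):
      result[i][i] = x[i]
    return result
  # otherwise x is a list-of-lists matrix; extract its diagonal
  n = min(len(x), len(x[0]))
  return [x[i][i] for i in range(n)]

print(my_diag([6.67, 3.91, 2.46]))  # 3x3 diagonal matrix
print(my_diag([[1, 2], [3, 4]]))    # [1, 4]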

This was one of the most amazing technical explorations I have experienced in my lifetime.



Life is funny. At my age (mid-70s), the road ahead is shorter than the trail behind me. But I still wake up each day with a lot of joy in my heart, due mostly to family and friends but also unexpected things like the discovery in this blog post.

My last name, McCaffrey, is relatively rare so anyone with the name must be closely related. I came across an amusing headstone for one of my relatives. If you read the first letter of each sentence on the backside, you’ll see the real message. Apparently, John was quite the philanderer, and his wife met his mistress after John’s departure from this “mortal coil”. Ouch. Vulgar messages on headstones are not allowed in most cemeteries, so the wife and the mistress composed a message to John. Ouch.


Demo code. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols (my blog editor often chokes on symbols).

# svd_from_scratch.py

import numpy as np

# -----------------------------------------------------------

def householder_qr(A):
  """
  Reduced QR decomposition using Householder reflections
  A: (m x n), m "gte" n
  Returns Q: (m x n), R: (n x n)
  """
  A = A.astype(float).copy()
  m, n = A.shape

  Q = np.eye(m)
  R = A.copy()

  for k in range(n):
    x = R[k:, k]
    normx = np.linalg.norm(x)
    if normx == 0:
      continue

    v = x.copy()
    sign = 1.0 if x[0] "gte" 0 else -1.0
    v[0] += sign * normx
    v /= np.linalg.norm(v)

    # Apply reflector to R
    R[k:, k:] -= 2 * np.outer(v, v @ R[k:, k:])

    # Accumulate Q
    Q[:, k:] -= 2 * np.outer(Q[:, k:] @ v, v)

  return Q[:, :n], R[:n, :]

# -----------------------------------------------------------

def symmetric_qr_eig(A, tol=1e-12, max_iter=2000):
  """
  Eigen-decomp of symmetric matrix using QR iteration
  Returns eigenvalues and eigenvectors.
  """
  A = A.astype(float).copy()
  n = A.shape[0]

  V = np.eye(n)

  for _ in range(max_iter):
    # QR decomposition (using our own QR)
    Q, R = householder_qr(A)
    A_new = R @ Q
    V = V @ Q

    # Check convergence (off-diagonal norm)
    off = A_new - np.diag(np.diag(A_new))
    if np.linalg.norm(off) "lt" tol:
      A = A_new
      break

    A = A_new

  eigvals = np.diag(A)
  return eigvals, V

# -----------------------------------------------------------

def svd_qr_eig_numpy_only(A, tol=1e-12):
  # SVD using only NumPy operations:

  A = np.asarray(A, float)
  m, n = A.shape

  if m "gte" n:
    # QR reduction
    Q, R = householder_qr(A)

    # Symmetric eigenproblem
    RtR = R.T @ R
    eigvals, V = symmetric_qr_eig(RtR, tol=tol)

    # Sort descending
    idx = np.argsort(eigvals)[::-1]
    eigvals = eigvals[idx]
    V = V[:, idx]

    S = np.sqrt(np.clip(eigvals, 0, None))

    nonzero = S "gt" tol
    U = Q @ (R @ V[:, nonzero] / S[nonzero])

    return U, S[nonzero], V[:, nonzero].T

  else:
    # Wide matrix: transpose trick
    U, S, Vt = svd_qr_eig_numpy_only(A.T, tol)
    return Vt.T, S, U.T

# ===========================================================

print("\nBegin SVD demo ")

print("\nTALL MATRIX ")

np.set_printoptions(precision = 8,
  suppress=True, floatmode='fixed')

A = np.array([[3, 1, 1],
              [-1, 3, 1],
              [0, 2, 2],
              [4, 4, -2]])

print("\nA = ");print(A)

# U, s, Vt = svd_qr_bidiagonal(A)
U, s, Vt = svd_qr_eig_numpy_only(A)
print("\nU = "); print(U)
print("\ns = "); print(s)
print("\nVt = "); print(Vt)

A_reconstructed = U @ np.diag(s) @ Vt
print("\nreconstructed A"); print(A_reconstructed)

print("\n====================")

print("\nnp.linalg.svd results: ")
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print("\nU = "); print(U)
print("\ns = "); print(s)
print("\nVt = "); print(Vt)

A_reconstructed = U @ np.diag(s) @ Vt
print("\nreconstructed A"); print(A_reconstructed)

# ===========================================================

print("\nWIDE MATRIX ")

A = np.array([[3, 1, 1, 0, 5],
              [-1, 3, 1, -2, 4],
              [0, 2, 2, 1, -3]])

print("\nA = ");print(A)

# U, s, Vt = svd_qr_bidiagonal(A)
U, s, Vt = svd_qr_eig_numpy_only(A)
print("\nU = "); print(U)
print("\ns = "); print(s)
print("\nVt = "); print(Vt)

A_reconstructed = U @ np.diag(s) @ Vt
print("\nreconstructed A"); print(A_reconstructed)

print("\n====================")

print("\nnp.linalg.svd results: ")
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print("\nU = "); print(U)
print("\ns = "); print(s)
print("\nVt = "); print(Vt)

A_reconstructed = U @ np.diag(s) @ Vt
print("\nreconstructed A"); print(A_reconstructed)

print("\nEnd demo ")
Posted in Machine Learning | Leave a comment

“Linear Regression with Pseudo-Inverse Training Using JavaScript” in Visual Studio Magazine

I wrote an article titled “Linear Regression with Pseudo-Inverse Training Using JavaScript” in the February 2026 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2026/02/02/linear-regression-with-pseudo-inverse-training-using-javascript.aspx.

The goal of a machine learning regression problem is to predict a single numeric value. For example, you might want to predict the bank account balance of an employee based on annual income, age, and years of work experience. There are roughly a dozen main regression techniques, such as nearest neighbors regression, kernel ridge regression, neural network regression, and gradient boosting regression. Linear regression is the most fundamental technique.

The form of a linear regression prediction equation is y’ = (w0 * x0) + (w1 * x1) + . . . + (wn * xn) + b, where y’ is the predicted value, the xi are predictor values, the wi are constants called model weights, and b is a constant called the bias. For example, y’ = predicted balance = (-0.54 * income) + (-0.38 * age) + (0.17 * experience) + 0.62. Training the model is the process of finding the values of the weights and bias so that predicted y values are close to known correct target y values in a set of training data.

There are many different techniques to train a linear regression model. Three of the most common are 1.) stochastic gradient descent (SGD), 2.) relaxed Moore-Penrose pseudo-inverse, and 3.) pseudo-inverse via normal equations closed form training. My article explains how to implement the relaxed MP pseudo-inverse training technique.

The output of the article demo program is:

Linear regression with pseudo-inverse (QR-Householder)
 training using JavaScript.

Loading synthetic train (200) and test (40) data from file

First three train X:
 -0.1660   0.4406  -0.9998  -0.3953  -0.7065
  0.0776  -0.1616   0.3704  -0.5911   0.7562
 -0.9452   0.3409  -0.1654   0.1174  -0.7192

First three train y:
   0.4840
   0.1568
   0.8054

Creating and training model
Done

Model weights/coefficients:
 -0.2656   0.0333  -0.0454   0.0358  -0.1146
Model bias/intercept: 0.3619

Evaluating model

Train acc (within 0.10) = 0.4600
Test acc (within 0.10) = 0.6500

Train MSE = 0.0026
Test MSE = 0.0020

Predicting for x =
  -0.1660    0.4406   -0.9998   -0.3953   -0.7065
Predicted y = 0.5329

End demo

In theory, the weight values of a linear regression model can be found using the equation w = inv(X) * y, where w is a vector of the weights and the bias that you want to find, X is a “design matrix,” which is a matrix of training predictor data that has a leading column of 1.0 values prepended, inv() is the matrix inverse operation, * means matrix-to-vector multiplication, and y is a vector of the target y values. However, this equation won’t work because matrix inverse applies only to square matrices (same number of rows and columns), which is almost never the case in machine learning scenarios.

One work-around is to use what’s called the relaxed Moore-Penrose pseudo-inverse. The math equation is: w = pinv(X) * y. This technique works because the pseudo-inverse applies to a matrix of any shape. The technique is called relaxed because in a machine learning scenario, the pseudo-inverse implementation doesn’t have to fulfill all the mathematical requirements of a true MP pseudo-inverse.
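
To make the idea concrete, here is a minimal NumPy sketch of the w = pinv(X) * y math. Note that np.linalg.pinv() computes the pseudo-inverse via SVD, whereas the article’s demo uses QR-Householder, and the tiny data values here are made up purely for illustration.

# pinv_training_sketch.py
# minimal sketch of pseudo-inverse training: w = pinv(X) * y
import numpy as np

# made-up training data: 4 items, 2 predictors each
X = np.array([[0.2, 0.5],
              [0.1, 0.8],
              [0.9, 0.3],
              [0.4, 0.7]])
y = np.array([0.30, 0.45, 0.20, 0.40])

# design matrix: prepend a leading column of 1.0 values
design = np.hstack((np.ones((X.shape[0], 1)), X))

# solve for [bias, w0, w1] in one shot
wts = np.linalg.pinv(design) @ y
bias = wts[0]
weights = wts[1:]
print("bias =", bias)
print("weights =", weights)

# predict for the first training item
print("predicted y =", np.dot(weights, X[0]) + bias)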

Implementing linear regression using relaxed MP pseudo-inverse training from scratch requires a bit of effort, but compared to using a library function such as those in the Python language scikit-learn library, it allows you to easily integrate a prediction model with other Web-oriented systems. And a from-scratch implementation also allows you to easily modify the system. For example, you can add error checking that is relevant to your particular problem scenario, or you can add diagnostic console.log() statements to monitor training.



Linear regression is a classical machine learning technique. People in tubes is a classical science fiction movie technique.

Left: In “Lifeforce” (1985), a space crew discovers a huge derelict spacecraft that contains bodies in angular tubes. The crew brings the bodies back to Earth. Bad idea. Space vampires. Pretty good movie. I give it my personal B grade.

Center: In “Osiris” (2025), a team of U.S. Special Forces members is mysteriously abducted. They awaken in some pods on an alien spaceship. The aliens are bad. I give this movie a C+ grade.

Right: In “The Creation of the Humanoids” (1962), in the 23rd century, after a nuclear war, society has become dependent on robots. There are extremist human groups and extremist robot groups. The movie is slow and chatty. Some days I like the movie (grade B) and some days not (grade C-).


Posted in JavaScript, Machine Learning | Leave a comment

NFL 2025 Season Week 24 – Zoltar Predictions Recap

Zoltar is my NFL football prediction computer program. It uses a neural network and a type of reinforcement learning. The NFL 2025 season ended on Sunday, February 8, 2026 when the Seattle Seahawks beat the New England Patriots by a score of 29-13 in Super Bowl LX (60).

Zoltar had a good, but not great, season. For the 272 NFL games played during the regular season (i.e., not including the 12 post-season games and the Super Bowl), Zoltar predictions were 46-30 (~60% accuracy) against the Las Vegas point spread. Including all 13 post-season games, Zoltar was 50-30 (~62% accuracy) against the spread.



This is what Zoltar looks like (2025 week #18 predictions)


Zoltar theoretically suggests betting when the Vegas line is “significantly” different from Zoltar’s prediction. For the beginning and end of the season, I use a relatively conservative threshold of 4.0 points difference. For the middle of the season, I use a relatively aggressive threshold of 3.0 points difference.

Theoretically, if you must bet $110 to win $100 (typical in Vegas), then you’ll make money if you predict at 53% accuracy or better. But realistically, you need to predict at 60% accuracy or better to take into account logistics and things like manual data entry errors.
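
The 53% figure comes from a simple break-even calculation on the standard bet-$110-to-win-$100 line. A quick sketch of the arithmetic:

# break-even accuracy for a bet-110-to-win-100 wager
# win: gain $100, lose: lose $110
# break-even when p * 100 = (1 - p) * 110, so p = 110 / 210
p = 110.0 / (100.0 + 110.0)
print(p)  # 0.5238..., about 52.4%, which rounds up to the 53% rule of thumb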

Before the season started, I asked Zoltar to predict the number of wins each team would reach (out of the 17 regular season games). I figured I’d check those predictions against the actual number of wins by each of the 32 NFL teams:

              Predicted  Actual
      Team       Wins     Wins   Diff
---------------------------------------------------
      bears       5       11      +6
    bengals       8        6      -2
      bills      12       12       0
    broncos       9       14      +5

     browns       3        5      +2
 buccaneers      10        8      -2
  cardinals       7        3      -4
   chargers      11       11       0

     chiefs      13        6      -7
      colts       7        8      +1
    cowboys       7        7       0
   dolphins       7        7       0

     eagles      12       11      -1
    falcons       7        8      +1
fortyniners       7       12      +5
     giants       3        4      +1

    jaguars       5       13      +8
       jets       6        3      -3
      lions      13        9      -4
    packers      10        9      -1

   panthers       5        8      +3
   patriots       5       14      +9
    raiders       5        3      -2
       rams      10       12      +2

     ravens      11        9      -2
 commanders      11        5      -6
     saints       5        6      +1
   seahawks      10       14      +4

   steelers       9       10      +1
     texans       9       12      +3
     titans       3        3       0
    vikings      12        9      -3

Not too bad. Zoltar was correct within plus-or-minus 2 games for 18 of the 32 teams.

The two biggest positive surprises to Zoltar were the Jaguars (+8 wins) and Patriots (+9 wins).

The biggest negative surprises were the Chiefs (-7 wins) and Commanders (-6 wins).

Well, Zoltar is going into hibernation for the next seven months, until the 2026 season starts in September.



Left: My prediction system is named after the Zoltar fortune teller machine you can find in arcades. Zoltar machines often make appearances at conferences too.

Center and Right: I got a very cool mini-Zoltar machine for Christmas this year.


Posted in Zoltar | Leave a comment

Matrix Pseudo-Inverse Using Transpose and Cholesky Inverse with C#

In ordinary arithmetic, the inverse of 5.0 is 0.20 because 5.0 * 0.20 = 1.0. In matrix algebra, if you have a matrix A, it has an inverse Ainv if A * Ainv = I, where * is matrix multiplication and I is the identity matrix (1.0s on the diagonal, 0.0s off the diagonal).

A matrix inverse exists only if the source matrix A is square (same number of rows and columns). But in machine learning, there are scenarios where you want to compute the inverse of a non-square matrix. The most common technique is called the Moore-Penrose pseudo-inverse.

The standard way to compute the pseudo-inverse uses singular value decomposition (SVD), an extremely complicated algorithm. I decided to explore an alternative technique that computes the pseudo-inverse using a matrix inverse based on Cholesky decomposition, which is still quite complicated but significantly easier than SVD. The technique is usually called pseudo-inverse via normal equations, or sometimes the left pseudo-inverse. Important: This technique does not compute a full rigorous pseudo-inverse, which must satisfy several mathematical conditions. The technique is used in machine learning scenarios where the number of rows of the source matrix is greater than or equal to the number of columns.

If the source matrix is A, the pseudo-inverse can be computed as inv(At * A) * At, where At is the transpose of A, inv() is standard matrix inverse, and * is matrix multiplication. This technique works because At * A is a square matrix and so standard matrix inverse can be applied.

The inv() part can be any one of over a dozen matrix inverse algorithms, each of which has many variations. But At * A has a special form called symmetric positive-definite (if a small constant is added to the diagonal elements of At * A). For such a matrix, an algorithm called Cholesky decomposition can be used to compute the inverse. There are different ways to compute a Cholesky decomposition; I prefer the Banachiewicz algorithm.

To recap, one way to compute the pseudo-inverse of a matrix is the Moore-Penrose pseudo-inverse, which is usually computed using the blisteringly complex SVD algorithm. Another way to compute the pseudo-inverse of a matrix A is pseudo-inverse via normal equations, which can be computed as inv(At * A) * At, where the inv() can be a Cholesky inverse.

The disadvantage of the simpler normal equations approach is that At * A might be a huge matrix, in which case computing inv(At * A) might be too difficult.

I put together a demo using the C# language. The output of my demo is:

Begin pseudo-inverse using Cholesky

Source matrix:
  1.0  4.0  2.0
  6.0  0.0  3.0
  7.0  2.0  1.0
  5.0  9.0  8.0

Moore-Penrose pseudo-inverse:
   0.0047   0.0370   0.1331  -0.0317
   0.1306  -0.1946   0.1310   0.0239
  -0.1158   0.2113  -0.2393   0.1046

I verified the result using the Python language NumPy library linalg.pinv() function.
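
For reference, the NumPy verification is essentially a one-liner. Here is a minimal sketch of the check (the idea, not necessarily the exact script):

# verify the pseudo-inverse result with np.linalg.pinv()
import numpy as np

A = np.array([[1.0, 4.0, 2.0],
              [6.0, 0.0, 3.0],
              [7.0, 2.0, 1.0],
              [5.0, 9.0, 8.0]])

np.set_printoptions(precision=4, suppress=True)
print(np.linalg.pinv(A))  # should closely match the C# demo values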

The pseudo-inverse function code is deceptively simple-looking because there are lots of helper functions, and the MatInverseCholesky() function is especially complicated.

static double[][] MatPseudoInvCholesky(double[][] A)
{
  // inv(At * A) * At
  double[][] At = MatTranspose(A);
  double[][] AtA = MatProduct(At, A);  // is nxn
  for (int i = 0; i "lt" AtA.Length; ++i) // less-than
    AtA[i][i] += 1.0e-8; // condition the diagonal
  double[][] AtAinv = MatInverseCholesky(AtA);
  double[][] result = MatProduct(AtAinv, At);
  return result;

  // six nested helper functions go here
}

Good fun.



I wrote this blog post in December 2025, and then stashed it away until posting it. December 2025 was the 100th anniversary of one of the most famous chess tournaments in history — Moscow 1925.

The tournament had 21 players including many of the most famous of all time who essentially created modern chess: current world champion JR Capablanca, former champion Emanuel Lasker, US champion Frank J Marshall, Richard Reti, Ernst Gruenfeld, Rudolf Spielmann, Akiba Rubinstein, Friedrich Samish. The only top player missing was future world champion Alexander Alekhine, who fled the Soviet Union in 1921 and was a persona non grata there.

The tournament was won by Soviet master Efim Bogoljubov who was at the peak of his powers. Bogoljubov played two matches for the World Chess Championship, losing to Alexander Alekhine in 1929 and 1934.

A lot of the men are holding canes — in the 1920s, canes were both a fashion statement and an effective means of self-defense.


Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols (my blog editor often chokes on symbols).

using System;
using System.IO;
using System.Collections.Generic;

// Moore-Penrose pseudo-inverse using inv(At * A) * At
// where inv() is matrix inverse using Cholesky decomp

namespace MatrixPseudoInverseClosedCholesky
{
  internal class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin pseudo-inverse using " +
        "Cholesky ");

      double[][] A = new double[4][];
      A[0] = new double[] { 1.0, 4.0, 2.0 };
      A[1] = new double[] { 6.0, 0.0, 3.0 };
      A[2] = new double[] { 7.0, 2.0, 1.0 };
      A[3] = new double[] { 5.0, 9.0, 8.0 };

      Console.WriteLine("\nSource matrix: ");
      MatShow(A, 1, 5);

      double[][] Apinv = MatPseudoInvCholesky(A);
      Console.WriteLine("\nMoore-Penrose pseudo-inverse: ");
      MatShow(Apinv, 4, 9);

      Console.WriteLine("\nEnd demo ");
      Console.ReadLine();
    } // Main()

    static void MatShow(double[][] m, int dec, int wid)
    {
      for (int i = 0; i "lt" m.Length; ++i)
      {
        for (int j = 0; j "lt" m[0].Length; ++j)
        {
          double v = m[i][j];
          Console.Write(v.ToString("F" + dec).PadLeft(wid));
        }
        Console.WriteLine("");
      }
    }

    static double[][] MatPseudoInvCholesky(double[][] A)
    {
      // inv(At * A) * At
      double[][] At = MatTranspose(A);
      double[][] AtA = MatProduct(At, A);  // is nxn
      for (int i = 0; i "lt" AtA.Length; ++i)
        AtA[i][i] += 1.0e-8; // condition the diagonal
      double[][] AtAinv = MatInverseCholesky(AtA);
      double[][] result = MatProduct(AtAinv, At);
      return result;

      // nested helpers: MatMake(), MatTranspose(),
      // MatProduct(), MatDecompCholesky(),
      // MatInverseCholesky(), MatIdentity()

      static double[][] MatMake(int nr, int nc)
      {
        double[][] result = new double[nr][];
        for (int i = 0; i "lt" nr; ++i)
          result[i] = new double[nc];
        return result;
      }

      static double[][] MatTranspose(double[][] M)
      {
        int nr = M.Length; int nc = M[0].Length;
        double[][] result = MatMake(nc, nr);  // reversed
        for (int i = 0; i "lt" nr; ++i)
          for (int j = 0; j "lt" nc; ++j)
            result[j][i] = M[i][j];
        return result;
      }

      static double[][] MatProduct(double[][] A,
        double[][] B)
      {
        int aRows = A.Length;
        int aCols = A[0].Length;
        int bRows = B.Length;
        int bCols = B[0].Length;
        if (aCols != bRows)
          throw new Exception("Non-conformable matrices");

        double[][] result = MatMake(aRows, bCols);

        for (int i = 0; i "lt" aRows; ++i) // each row of A
          for (int j = 0; j "lt" bCols; ++j) // each col of B
            for (int k = 0; k "lt" aCols; ++k)
              result[i][j] += A[i][k] * B[k][j];

        return result;
      }

      static double[][] MatDecompCholesky(double[][] M)
      {
        // Cholesky decomposition (Banachiewicz algorithm)
        // M is nxn square, symmetric, positive definite
        int n = M.Length;
        double[][] result = MatMake(n, n);  // all 0.0
        for (int i = 0; i "lt" n; ++i)
        {
          for (int j = 0; j "lte" i; ++j)
          {
            double sum = 0.0;
            for (int k = 0; k "lt" j; ++k)
              sum += result[i][k] * result[j][k];
            if (i == j)
            {
              double tmp = M[i][i] - sum;
              if (tmp "lt" 0.0)
                throw new
                  Exception("MatDecompCholesky fatal");
              result[i][j] = Math.Sqrt(tmp);
            }
            else
            {
              if (result[j][j] == 0.0)
                throw new
                  Exception("MatDecompCholesky fatal ");
              result[i][j] =
                (1.0 / result[j][j] * (M[i][j] - sum));
            }
          } // j
        } // i
        return result;
      } // MatDecompCholesky

      static double[][] MatInverseCholesky(double[][] M)
      {
        double[][] L = MatDecompCholesky(M);  // lower tri
        int n = L.Length;
        double[][] result = MatIdentity(n);

        for (int k = 0; k "lt" n; ++k)
        {
          for (int j = 0; j "lt" n; j++)
          {
            for (int i = 0; i "lt" k; i++)
            {
              result[k][j] -= result[i][j] * L[k][i];
            }
            result[k][j] /= L[k][k];
          }
        }

        for (int k = n - 1; k "gte" 0; --k)
        {
          for (int j = 0; j "lt" n; j++)
          {
            for (int i = k + 1; i "lt" n; i++)
            {
              result[k][j] -= result[i][j] * L[i][k];
            }
            result[k][j] /= L[k][k];
          }
        }
        return result;
      }

      static double[][] MatIdentity(int n)
      {
        double[][] result = MatMake(n, n);
        for (int i = 0; i "lt" n; ++i)
          result[i][i] = 1.0;
        return result;
      }

    } // MatPseudoInvCholesky

  } // class Program

} // ns
Posted in Machine Learning | Leave a comment

A Relatively Simple Approximation to Singular Value Decomposition Using QR Decomposition With C#

Before I start: The technique presented in this post is only an exploration of how to compute SVD. The algorithm presented here is (possibly) not numerically stable, and is (probably) not suitable when high accuracy (beyond 8 decimals) is needed. And the algorithm is very slow compared to standard SVD algorithms. That said, for small matrices, the technique presented here is the simplest SVD approach I’ve seen (but by no means simple) and appears to work well.

The problem is singular value decomposition (SVD). If you have a matrix A and decompose it using SVD, you get a matrix U, a vector s, and a matrix Vh, where A = U * S * Vh. The S is a square matrix with the s values on the diagonal. SVD has many important uses in machine learning, for example, computing a pseudo-inverse for training a linear model.
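
As an aside, the connection to the pseudo-inverse is direct: once you have U, s, and Vh, the pseudo-inverse is Vh.T * diag(1/s) * U.T, provided all the singular values are non-zero. A minimal NumPy sketch, using the same 5x4 matrix as the demo below:

# pseudo-inverse from SVD: pinv(A) = Vh.T * diag(1/s) * U.T
import numpy as np

A = np.array([[ 1.0,  6.0,  8.0,  5.0],
              [ 0.0,  2.0,  4.0,  6.0],
              [ 9.0,  3.0,  7.0,  7.0],
              [-1.0, -3.0, -5.0, -7.0],
              [ 7.0,  5.0,  3.0,  1.0]])

U, s, Vh = np.linalg.svd(A, full_matrices=False)
Apinv = Vh.T @ np.diag(1.0 / s) @ U.T  # valid because all s are non-zero here
print(np.allclose(Apinv, np.linalg.pinv(A)))  # True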

Unfortunately, accurately computing SVD is incredibly difficult. Many mathematicians worked for many years on SVD and there are dozens of solutions, each with pros and cons. Two common SVD algorithms are Jacobi and Golub-Kahan-Reinsch. Both algorithms are very complex. But it is possible to compute an approximation to SVD. Briefly, to approximate the SVD of a source matrix A, first compute the QR decomposition of A to get Q and R, and then compute the SVD of the R matrix.

This seems silly at first (using SVD to compute SVD), but the R result matrix of a QR decomposition is a square matrix, and it’s possible to find its SVD using QR in a relatively simple way. In high-level pseudo-code:

def my_svd(A):
  Q, R = my_qr(A, mode='reduced') # apply QR decomp
  U, s, Vh = my_svd_square(R) # apply QR decomp again
  U = np.matmul(Q, U)
  return U, s, Vh

The SVD function described above isn’t trivial, but it’s vastly less complicated than computing SVD using Jacobi or any other algorithm. The my_svd_square() helper function for a square matrix is also much simpler than computing SVD for a non-square matrix. In essence, the idea for approximating SVD is to break it down into two much simpler (but by no means easy) sub-problems that can be solved using QR decomposition.

A few days ago, I implemented the idea using Python. Just for fun, one rainy morning before work, I decided to refactor the demo code to the C# language. The output of a demo run:

Begin SVD using QR demo with C#

Source matrix A:
   1.0000   6.0000   8.0000   5.0000
   0.0000   2.0000   4.0000   6.0000
   9.0000   3.0000   7.0000   7.0000
  -1.0000  -3.0000  -5.0000  -7.0000
   7.0000   5.0000   3.0000   1.0000

Computing SVD using QR
Exited at iteration 20

U =
   0.490569   0.357946  -0.658897  -0.443928
   0.309393   0.418534   0.228973   0.339519
   0.618658  -0.402788   0.526177  -0.422092
  -0.402797  -0.389987  -0.198233  -0.465342
   0.344434  -0.618365  -0.444148   0.541249

s =
  21.095067   8.254694   4.718203   1.611445

Vh =
   0.420588   0.395767   0.594453   0.559554
  -0.872923  -0.017625   0.219636   0.435263
   0.247105  -0.750910  -0.214769   0.573539
   0.007029   0.528386  -0.743142   0.410485

Checking U * S * Vh:
   1.0000   6.0000   8.0000   5.0000
   0.0000   2.0000   4.0000   6.0000
   9.0000   3.0000   7.0000   7.0000
  -1.0000  -3.0000  -5.0000  -7.0000
   7.0000   5.0000   3.0000   1.0000

End demo

I verified my results using the NumPy library np.linalg.svd() function and got essentially the same results. One of the hugely annoying details with SVD is that the signs of the values in the columns of U and in the rows of Vh are arbitrary. Singular values, by convention, are always positive or zero.
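
One way to compare two sets of SVD results in spite of the sign ambiguity is to compare corresponding columns of U (and rows of Vh) while allowing a factor of -1. Here is a minimal NumPy sketch of that kind of check (an ad hoc comparison idea, not part of the demo program):

# compare two U matrices column-by-column, allowing each
# column to be negated (the SVD sign ambiguity)
import numpy as np

def same_up_to_sign(U1, U2, tol=1.0e-6):
  for j in range(U1.shape[1]):
    c1 = U1[:, j]; c2 = U2[:, j]
    if not (np.allclose(c1, c2, atol=tol) or
            np.allclose(c1, -c2, atol=tol)):
      return False
  return True

# usage example (hypothetical variable names): same_up_to_sign(myU, numpyU)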

Well, all in all, this was a very challenging and very interesting exploration.



The history of machine learning is filled with incremental innovations. The history of pinball machines is filled with incremental innovations. A major advance occurred in 1953 when “Army Navy” by Williams introduced the first automatic reel scoring. Before this, fixed scores (1000 2000 etc.) were printed on the back glass and were illuminated as a player reached a point threshold. Somewhat surprisingly, the reel scoring feature did not catch on until a few years later when a high score reel was added.


Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols — my blog editor chokes on symbols.

using System;
using System.IO;
using System.Collections.Generic;

namespace Matrix_SVD_Using_QR
{
  internal class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin SVD using QR" +
        " demo with C# ");

      double[][] A = new double[5][];
      A[0] = new double[] { 1, 6, 8, 5 };
      A[1] = new double[] { 0, 2, 4, 6 };
      A[2] = new double[] { 9, 3, 7, 7 };
      A[3] = new double[] { -1, -3, -5, -7 };
      A[4] = new double[] { 7, 5, 3, 1 };

      Console.WriteLine("\nSource matrix A: ");
      MatShow(A, 4, 9);

      Console.WriteLine("\nComputing SVD using QR ");
      double[][] U; double[] s; double[][] Vh;
      MatSVD(A, out U, out s, out Vh);
      
      Console.WriteLine("\nU = ");
      MatShow(U, 6, 11);

      Console.WriteLine("\ns = ");
      VecShow(s, 6, 11);

      Console.WriteLine("\nVh = ");
      MatShow(Vh, 6, 11);

      Console.WriteLine("\nChecking U * S * Vh: ");
      double[][] S = MatMake(s.Length, s.Length);
      for (int i = 0; i "lt" s.Length; ++i)
        S[i][i] = s[i];
      double[][] C = MatProduct(U, MatProduct(S, Vh));
      MatShow(C, 4, 9);

      Console.WriteLine("\nEnd demo ");
      Console.ReadLine();
    }

    // ------------------------------------------------------

    private static void MatSVD(double[][] A,
      out double[][] U, out double[] s, out double[][] Vh)
    {
      // an approximation for SVD using QR
      // not practical for anything other than small matrices
      double[][] Q; double[][] R;
      MatDecomposeQR(A, out Q, out R, reduced:true);
      double[][] UU; double[] ss; double[][] VVh; // working
      MatSVD_Square(R, out UU, out ss, out VVh);
      UU = MatProduct(Q, UU);
      U = UU; s = ss; Vh = VVh;
    }

    // ------------------------------------------------------

    private static void MatSVD_Square(double[][] A,
      out double[][] U, out double[] s, out double[][] Vh)
    {
      // SVD for a square matrix using iterative QR
      int m = A.Length; int n = A[0].Length;
      if (m != n)
        throw new Exception("MatSVD_Square accepts" +
          " only square matrix ");

      int maxIter = 1000;
      double[][] R1 = MatCopy(A);
      double[][] UU = MatIdentity(m);
      double[][] VV = MatIdentity(m);

      double[][] Q1; double[][] Q2; double[][] R2;
      for (int i = 0; i "lt" maxIter; ++i)
      {
        MatDecomposeQR(R1, out Q1, out R1, reduced:true);
        MatDecomposeQR(MatTranspose(R1),
          out Q2, out R2, reduced:true);
        R1 = MatTranspose(R2);
        UU = MatProduct(UU, Q1);
        VV = MatProduct(VV, Q2);

        // exit when off-diag cells of R1 are all close to 0
        if (i % 10 == 0)
        {
          if (MatOffDiagAllZero(R1, 1.0e-8) == true)
          {
            Console.WriteLine("Exited at iteration " + i);
            break;
          }
        }
      } // i
      
      // deal with negative singular values (in R1)
      double[] ss = new double[m];
      for (int j = 0; j "lt" m; ++j)
        ss[j] = R1[j][j];
      double[][] VVh = MatTranspose(VV);
      for (int i = 0; i "lt" m; ++i)
      {
        if (ss[i] "lt" 0.0)
        {
          ss[i] = -ss[i]; // sing vals must be non-neg
          // adjust row of VVh
          for (int j = 0; j "lt" m; ++j)
            VVh[i][j] = -VVh[i][j];
        }
      }

      U = UU; s = ss; Vh = VVh;
    } // MatSVD_Square()

    // ------------------------------------------------------

    private static bool MatOffDiagAllZero(double[][] A,
      double tol)
    {
      int nr = A.Length; int nc = A[0].Length;
      for (int i = 0; i "lt" nr; ++i)
      {
        for (int j = 0; j "lt" nc; ++j)
        {
          if (i == j) continue; // no check diag
          if (Math.Abs(A[i][j]) "gt" tol)
            return false;
        }
      }
      return true;
    }

    // ------------------------------------------------------

    private static void MatDecomposeQR(double[][] M,
      out double[][] Q, out double[][] R, bool reduced)
    {
      // QR decomposition, Householder algorithm.
      // see rosettacode.org/wiki/QR_decomposition
      int m = M.Length;
      int n = M[0].Length;

      if (m "lt" n)
        throw new Exception("No rows less than cols");

      double[][] QQ = MatIdentity(m); // working Q
      double[][] RR = MatCopy(M); // working R

      int end;
      if (m == n) end = n - 1;
      else end = n;

      for (int i = 0; i "lt" end; ++i)
      {
        double[][] H = MatIdentity(m);
        double[] a = new double[m - i]; // corr
        int k = 0;
        for (int ii = i; ii "lt" m; ++ii) // corr
          a[k++] = RR[ii][i];

        double normA = VecNorm(a);
        if (a[0] "lt" 0.0 && normA "gt" 0.0) // corr
          normA = -normA;
        else if (a[0] "gt" 0.0 && normA "lt" 0.0)
          normA = -normA;

        double[] v = new double[a.Length];
        for (int j = 0; j "lt" v.Length; ++j)
          v[j] = a[j] / (a[0] + normA);
        v[0] = 1.0;

        // Householder algorithm
        double[][] h = MatIdentity(a.Length);
        double vvDot = VecDot(v, v);
        double[][] A = VecToMat(v, v.Length, 1);
        double[][] B = VecToMat(v, 1, v.Length);
        double[][] AB = MatProduct(A, B);

        for (int ii = 0; ii "lt" h.Length; ++ii)
          for (int jj = 0; jj "lt" h[0].Length; ++jj)
            h[ii][jj] -= (2.0 / vvDot) * AB[ii][jj];

        // copy h[][] into lower right corner of H[][]
        int d = m - h.Length; // corr
        for (int ii = 0; ii "lt" h.Length; ++ii)
          for (int jj = 0; jj "lt" h[0].Length; ++jj)
            H[ii + d][jj + d] = h[ii][jj];

        QQ = MatProduct(QQ, H);
        RR = MatProduct(H, RR);
      } // i

      if (reduced == false)
      {
        Q = QQ; // working results into the out params
        R = RR;
        return;
      }
      //else if (reduced == true)
      {
        int qRows = QQ.Length; int qCols = QQ[0].Length;
        int rRows = RR.Length; int rCols = RR[0].Length;
        // assumes m "gte" n !!

        // square-up R
        int dim = Math.Min(rRows, rCols);
        double[][] Rsquared = MatMake(dim, dim);
        for (int i = 0; i "lt" dim; ++i)
          for (int j = 0; j "lt" dim; ++j)
            Rsquared[i][j] = RR[i][j];

        // Q needs same number columns as R
        // so that inv(R) * trans(Q) works
        double[][] Qtrimmed = MatMake(qRows, dim);
        for (int i = 0; i "lt" qRows; ++i)
          for (int j = 0; j "lt" dim; ++j)
            Qtrimmed[i][j] = QQ[i][j];

        Q = Qtrimmed;
        R = Rsquared;
        return;
      }
    } // MatDecomposeQR()

     // ------------------------------------------------------

    private static double[][] MatCopy(double[][] M)
    {
      int nr = M.Length; int nc = M[0].Length;
      double[][] result = MatMake(nr, nc);
      for (int i = 0; i < nr; ++i)
        for (int j = 0; j < nc; ++j)
          result[i][j] = M[i][j];
      return result;
    }

    // ------------------------------------------------------

    private static double[][] MatTranspose(double[][] m)
    {
      int nr = m.Length;
      int nc = m[0].Length;
      double[][] result = MatMake(nc, nr);  // note
      for (int i = 0; i < nr; ++i)
        for (int j = 0; j < nc; ++j)
          result[j][i] = m[i][j];
      return result;
    }

    // ------------------------------------------------------

    private static double[][] MatMake(int nRows, int nCols)
    {
      double[][] result = new double[nRows][];
      for (int i = 0; i < nRows; ++i)
        result[i] = new double[nCols];
      return result;
    }

    // ------------------------------------------------------

    private static double[][] MatIdentity(int n)
    {
      double[][] result = MatMake(n, n);
      for (int i = 0; i < n; ++i)
        result[i][i] = 1.0;
      return result;
    }

    // ------------------------------------------------------

    private static double[][] MatProduct(double[][] matA,
      double[][] matB)
    {
      int aRows = matA.Length;
      int aCols = matA[0].Length;
      int bRows = matB.Length;
      int bCols = matB[0].Length;
      if (aCols != bRows)
        throw new Exception("Non-conformable matrices");

      double[][] result = MatMake(aRows, bCols);

      for (int i = 0; i < aRows; ++i) // each row of A
        for (int j = 0; j < bCols; ++j) // each col of B
          for (int k = 0; k < aCols; ++k)
            result[i][j] += matA[i][k] * matB[k][j];

      return result;
    }

    // ------------------------------------------------------

    private static double VecNorm(double[] vec)
    {
      int n = vec.Length;
      double sum = 0.0;
      for (int i = 0; i < n; ++i)
        sum += vec[i] * vec[i];
      return Math.Sqrt(sum);
    }

    // ------------------------------------------------------

    private static double[][] VecToMat(double[] vec,
        int nRows, int nCols)
    {
      double[][] result = MatMake(nRows, nCols);
      int k = 0;
      for (int i = 0; i < nRows; ++i)
        for (int j = 0; j < nCols; ++j)
          result[i][j] = vec[k++];
      return result;
    }

    // ------------------------------------------------------

    private static double VecDot(double[] v1, double[] v2)
    {
      double result = 0.0;
      int n = v1.Length;
      for (int i = 0; i < n; ++i)
        result += v1[i] * v2[i];
      return result;
    }

    // ------------------------------------------------------

    private static void MatShow(double[][] M, int dec,
      int wid)
    {
      for (int i = 0; i < M.Length; ++i)
      {
        for (int j = 0; j < M[0].Length; ++j)
        {
          double v = M[i][j];
          Console.Write(v.ToString("F" + dec).
            PadLeft(wid));
        }
        Console.WriteLine("");
      }
    }

    // ------------------------------------------------------

    private static void VecShow(double[] vec,
      int dec, int wid)
    {
      for (int i = 0; i < vec.Length; ++i)
        Console.Write(vec[i].ToString("F" + dec).
          PadLeft(wid));
      Console.WriteLine("");
    }

  } // Program
} // ns
Posted in Machine Learning | Leave a comment

Example of Kernel Ridge Regression Using the scikit-learn Library

I was working on a project where I was implementing kernel ridge regression (KRR) from scratch, using the C# language. KRR is tricky, so I figured I'd run my demo data through the scikit-learn KernelRidge module to verify my C# implementation results.

KRR is difficult to explain. KRR uses a kernel function that compares two vectors and gives a measure of similarity that's between 0.0 (no similarity) and 1.0 (the vectors are the same). The most common kernel function is the radial basis function (RBF). The RBF kernel requires a value for a parameter gamma (and confusingly, there's a different version of RBF that requires a value for a parameter sigma). All KRR models, regardless of the kernel function used, require a small alpha constant value that deters overfitting. The kernel gamma and model alpha must be determined by trial and error.
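
To make the kernel idea concrete, here is a minimal sketch (not part of the demo program) of the RBF kernel that scikit-learn uses when kernel='rbf'. The function name rbf_kernel and the gamma value are just for illustration; the two sample vectors are the first two training items.

# minimal RBF kernel sketch: k(x, z) = exp(-gamma * ||x - z||^2)
# (the sigma version is k(x, z) = exp(-||x - z||^2 / (2 * sigma^2)))
import numpy as np

def rbf_kernel(x, z, gamma):
  d2 = np.sum((x - z) ** 2)  # squared Euclidean distance
  return np.exp(-gamma * d2)

x = np.array([-0.1660,  0.4406, -0.9998, -0.3953, -0.7065])
z = np.array([ 0.0776, -0.1616,  0.3704, -0.5911,  0.7562])
print(rbf_kernel(x, x, gamma=0.3))  # 1.0 because the vectors are identical
print(rbf_kernel(x, z, gamma=0.3))  # between 0.0 and 1.0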

A KRR model has one weight per training item. My demo data has 200 training items so there are 200 weights. To make a prediction for an input vector x, KRR computes a weighted sum of the kernel values between x and every training item.
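
Here is a minimal sketch of that weighted-sum computation. It assumes a fitted KernelRidge model named krr and the train_X matrix from the demo program farther below; manual_predict is a hypothetical helper, not part of scikit-learn.

# minimal sketch: a KRR prediction is a weighted sum of kernel values,
# one weight (krr.dual_coef_[i]) per training item
import numpy as np

def manual_predict(x, train_X, weights, gamma):
  y = 0.0
  for i in range(len(train_X)):
    d2 = np.sum((x - train_X[i]) ** 2)     # squared distance to item i
    y += weights[i] * np.exp(-gamma * d2)  # weight times RBF kernel value
  return y

# y_manual = manual_predict(train_X[0], train_X, krr.dual_coef_, gamma=0.3)
# this should match krr.predict(train_X[0].reshape(1, -1)) -- the 0.4941
# value shown in the demo output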

My demo data is synthetic. It looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
. . .

The first five values on each line are predictors. The last value on each line is the target y value to predict. There are 200 training items and 40 test items.

The output of the scikit KRR demo is:

Begin kernel ridge regression using scikit

Loading train (200) and test (40) data

Creating KRR model RBF gamma = 0.3, alpha = 0.005
Done

Model weights:
[-2.0218 -1.1406  0.0758 -0.6265  0.5722 -0.9905  0.6912
  0.4807  0.6496 -0.7364  1.2163 -0.5387  1.4276  0.0517
  1.0412  0.3933  0.2223  0.0564  0.4282  0.2780  0.9860
 -0.5630 -1.4695  0.1453 -1.2380  0.0059 -1.3257 -0.8750
  0.3530 -0.3127  0.7831 -0.0176  3.1105 -0.3441 -0.1874
 -0.9381 -1.0774 -1.5558 -1.6556  0.6567 -0.8531  1.4415
 -1.8095 -1.7655  0.3510 -0.7398 -1.6442  0.3615 -1.6873
  0.5667 -0.8786 -0.1819 -0.1024 -0.5156  0.5875 -0.1200
  1.7779 -0.7825 -1.5794 -0.0976  0.7271 -0.1227 -0.7527
  0.2934 -0.5738 -1.2267 -1.0484  0.7964 -0.3300 -1.5897
 -0.3434 -0.5869  0.5613  1.8904 -0.1202  1.1940 -1.0744
  0.0246 -0.5136 -0.2046  0.1461  1.7474  0.6251  0.8854
  0.9305  1.7007 -0.0030 -1.8766  1.1218 -1.6448  1.2075
  2.8511  3.5556 -1.0008 -1.3832  0.0897 -0.8997 -0.4007
 -0.5953 -0.5032  0.8405 -1.5877  0.6787 -0.5144  0.0824
 -0.0346 -0.6607  0.0076  0.0406  3.6744 -0.1402 -0.5089
 -1.4806  0.3451 -0.2460 -0.9033 -0.1536  0.7088  0.2355
  0.3538 -0.8437  0.1532  0.8644  1.9891  2.3591 -0.8606
 -0.8545  1.2116  0.5442 -0.3250 -0.4847 -0.6029 -0.0872
  1.5692 -0.2556 -1.4743  1.0171  3.7093  0.4378 -0.8627
  1.8371  0.3855  0.0130 -2.0406  0.9217  0.1018 -2.5532
 -1.0969  1.2241  0.2609 -0.1100 -0.1464  1.6083 -1.2871
  0.8604  0.5754 -0.3649  1.2258  0.4378 -2.9902  0.0453
  0.3650 -0.5867 -1.8827  1.3220  0.6443 -0.6158  0.7165
 -0.1986  1.5692  1.1419  0.6714  0.4306 -0.3582  1.1262
 -0.1465  0.3204 -0.3066 -0.3255  1.9478  2.0118 -1.3593
 -0.7194  2.7442 -0.4100 -1.3653 -2.0545 -0.8568 -0.7523
  1.9512  0.3181 -0.1053 -0.8341  0.8603  0.4775 -0.2014
 -1.6270 -0.5825 -0.0487  1.2897]

Accuracy (within 0.10) train = 0.9950
Accuracy (within 0.10) test = 0.9500

MSE train = 0.0000
MSE test = 0.0002

Predicting for x =
[-0.1660  0.4406 -0.9998 -0.3953 -0.7065]
Predicted y = 0.4941

End demo

I implemented an accuracy() function that scores a prediction as correct if it’s within 10% of the true target value. The accuracy and MSE values for the KRR model are very good — among the best I’ve seen for the synthetic dataset I used.



Machine learning regression techniques are black-and-white in the sense that they’re objective. And I find great beauty in regression models. But I don’t understand subjective things like fashion photography. Here are three images from the Internet. They’re all the same model but I can’t evaluate if the photos are nice or not. By the way, I actually know the model who posed for these photos, and she knows quite a bit about machine learning.


Demo program:

# krr_scikit.py
# kernel ridge regression demo

import numpy as np
from sklearn.kernel_ridge import KernelRidge

np.set_printoptions(precision=4, suppress=True,
  floatmode='fixed', linewidth=60)

# -----------------------------------------------------------

def accuracy(model, data_X, data_y, pct_close):
  n = len(data_X)
  n_correct = 0; n_wrong = 0
  for i in range(n):
    x = data_X[i].reshape(1,-1)
    y = data_y[i]
    y_pred = model.predict(x)

    if np.abs(y - y_pred) < np.abs(y * pct_close):
      n_correct += 1
    else: 
      n_wrong += 1
  # print("Correct = " + str(n_correct))
  # print("Wrong   = " + str(n_wrong))
  return n_correct / (n_correct + n_wrong)

def mse(model, data_X, data_y):
  n = len(data_X)
  sum = 0.0
  for i in range(n):
    actual_y = data_y[i]
    pred_y = model.predict(data_X[i].reshape(1, -1))
    diff = actual_y - pred_y
    sum += diff * diff
  return sum / n

# -----------------------------------------------------------

print("\nBegin kernel ridge regression using scikit ")

print("\nLoading train (200) and test (40) data ")
train_X = np.loadtxt(".\\Data\\synthetic_train_200.txt",
  usecols=[0,1,2,3,4], delimiter=",")
train_y = np.loadtxt(".\\Data\\synthetic_train_200.txt",
  usecols=5, delimiter=",")
test_X = np.loadtxt(".\\Data\\synthetic_test_40.txt",
  usecols=[0,1,2,3,4], delimiter=",")
test_y = np.loadtxt(".\\Data\\synthetic_test_40.txt",
  usecols=5, delimiter=",")

print("\nCreating KRR model RBF gamma = 0.3, alpha = 0.005 ")
krr = KernelRidge(alpha=0.005, kernel='rbf', gamma=0.3)
krr.fit(train_X, train_y)
print("Done ")

print("\nModel weights: ")
print(krr.dual_coef_)

acc_train = accuracy(krr, train_X, train_y, 0.10)
acc_test = accuracy(krr, test_X, test_y, 0.10)
print("\nAccuracy (withhin 0.10) train = %0.4f " % \
  acc_train)
print("Accuracy (withhin 0.10) test = %0.4f " % \
  acc_test)

mse_train = mse(krr, train_X, train_y)
mse_test = mse(krr, test_X, test_y)
print("\nMSE train = %0.4f " % mse_train)
print("MSE test = %0.4f " % mse_test)

x = train_X[0]
print("\nPredicting for x =")
print(x)
pred_y = krr.predict(x.reshape(1,-1))
print("Predicted y = %0.4f " % pred_y)

print("\nEnd demo ")

Training data:

# synthetic_train_200.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399
-0.4858, -0.7782, -0.6141, -0.0008,  0.4572,  0.4197
 0.7033, -0.1683,  0.2334, -0.5327, -0.7961,  0.1776
 0.0317, -0.0457, -0.6947,  0.2436,  0.0880,  0.3345
 0.5031, -0.5559,  0.0387,  0.5706, -0.9553,  0.3107
-0.3513,  0.7458,  0.6894,  0.0769,  0.7332,  0.3170
 0.2205,  0.5992, -0.9309,  0.5405,  0.4635,  0.3532
-0.4806, -0.4859,  0.2646, -0.3094,  0.5932,  0.3202
 0.9809, -0.3995, -0.7140,  0.8026,  0.0831,  0.1600
 0.9495,  0.2732,  0.9878,  0.0921,  0.0529,  0.1289
-0.9476, -0.6792,  0.4913, -0.9392, -0.2669,  0.5966
 0.7247,  0.3854,  0.3819, -0.6227, -0.1162,  0.1550
-0.5922, -0.5045, -0.4757,  0.5003, -0.0860,  0.5863
-0.8861,  0.0170, -0.5761,  0.5972, -0.4053,  0.7301
 0.6877, -0.2380,  0.4997,  0.0223,  0.0819,  0.1404
 0.9189,  0.6079, -0.9354,  0.4188, -0.0700,  0.1907
-0.1428, -0.7820,  0.2676,  0.6059,  0.3936,  0.2790
 0.5324, -0.3151,  0.6917, -0.1425,  0.6480,  0.1071
-0.8432, -0.9633, -0.8666, -0.0828, -0.7733,  0.7784
-0.9444,  0.5097, -0.2103,  0.4939, -0.0952,  0.6787
-0.0520,  0.6063, -0.1952,  0.8094, -0.9259,  0.4836
 0.5477, -0.7487,  0.2370, -0.9793,  0.0773,  0.1241
 0.2450,  0.8116,  0.9799,  0.4222,  0.4636,  0.2355
 0.8186, -0.1983, -0.5003, -0.6531, -0.7611,  0.1511
-0.4714,  0.6382, -0.3788,  0.9648, -0.4667,  0.5950
 0.0673, -0.3711,  0.8215, -0.2669, -0.1328,  0.2677
-0.9381,  0.4338,  0.7820, -0.9454,  0.0441,  0.5518
-0.3480,  0.7190,  0.1170,  0.3805, -0.0943,  0.4724
-0.9813,  0.1535, -0.3771,  0.0345,  0.8328,  0.5438
-0.1471, -0.5052, -0.2574,  0.8637,  0.8737,  0.3042
-0.5454, -0.3712, -0.6505,  0.2142, -0.1728,  0.5783
 0.6327, -0.6297,  0.4038, -0.5193,  0.1484,  0.1153
-0.5424,  0.3282, -0.0055,  0.0380, -0.6506,  0.6613
 0.1414,  0.9935,  0.6337,  0.1887,  0.9520,  0.2540
-0.9351, -0.8128, -0.8693, -0.0965, -0.2491,  0.7353
 0.9507, -0.6640,  0.9456,  0.5349,  0.6485,  0.1059
-0.0462, -0.9737, -0.2940, -0.0159,  0.4602,  0.2606
-0.0627, -0.0852, -0.7247, -0.9782,  0.5166,  0.2977
 0.0478,  0.5098, -0.0723, -0.7504, -0.3750,  0.3335
 0.0090,  0.3477,  0.5403, -0.7393, -0.9542,  0.4415
-0.9748,  0.3449,  0.3736, -0.1015,  0.8296,  0.4358
 0.2887, -0.9895, -0.0311,  0.7186,  0.6608,  0.2057
 0.1570, -0.4518,  0.1211,  0.3435, -0.2951,  0.3244
 0.7117, -0.6099,  0.4946, -0.4208,  0.5476,  0.1096
-0.2929, -0.5726,  0.5346, -0.3827,  0.4665,  0.2465
 0.4889, -0.5572, -0.5718, -0.6021, -0.7150,  0.2163
-0.7782,  0.3491,  0.5996, -0.8389, -0.5366,  0.6516
-0.5847,  0.8347,  0.4226,  0.1078, -0.3910,  0.6134
 0.8469,  0.4121, -0.0439, -0.7476,  0.9521,  0.1571
-0.6803, -0.5948, -0.1376, -0.1916, -0.7065,  0.7156
 0.2878,  0.5086, -0.5785,  0.2019,  0.4979,  0.2980
 0.2764,  0.1943, -0.4090,  0.4632,  0.8906,  0.2960
-0.8877,  0.6705, -0.6155, -0.2098, -0.3998,  0.7107
-0.8398,  0.8093, -0.2597,  0.0614, -0.0118,  0.6502
-0.8476,  0.0158, -0.4769, -0.2859, -0.7839,  0.7715
 0.5751, -0.7868,  0.9714, -0.6457,  0.1448,  0.1175
 0.4802, -0.7001,  0.1022, -0.5668,  0.5184,  0.1090
 0.4458, -0.6469,  0.7239, -0.9604,  0.7205,  0.0779
 0.5175,  0.4339,  0.9747, -0.4438, -0.9924,  0.2879
 0.8678,  0.7158,  0.4577,  0.0334,  0.4139,  0.1678
 0.5406,  0.5012,  0.2264, -0.1963,  0.3946,  0.2088
-0.9938,  0.5498,  0.7928, -0.5214, -0.7585,  0.7687
 0.7661,  0.0863, -0.4266, -0.7233, -0.4197,  0.1466
 0.2277, -0.3517, -0.0853, -0.1118,  0.6563,  0.1767
 0.3499, -0.5570, -0.0655, -0.3705,  0.2537,  0.1632
 0.7547, -0.1046,  0.5689, -0.0861,  0.3125,  0.1257
 0.8186,  0.2110,  0.5335,  0.0094, -0.0039,  0.1391
 0.6858, -0.8644,  0.1465,  0.8855,  0.0357,  0.1845
-0.4967,  0.4015,  0.0805,  0.8977,  0.2487,  0.4663
 0.6760, -0.9841,  0.9787, -0.8446, -0.3557,  0.1509
-0.1203, -0.4885,  0.6054, -0.0443, -0.7313,  0.4854
 0.8557,  0.7919, -0.0169,  0.7134, -0.1628,  0.2002
 0.0115, -0.6209,  0.9300, -0.4116, -0.7931,  0.4052
-0.7114, -0.9718,  0.4319,  0.1290,  0.5892,  0.3661
 0.3915,  0.5557, -0.1870,  0.2955, -0.6404,  0.2954
-0.3564, -0.6548, -0.1827, -0.5172, -0.1862,  0.4622
 0.2392, -0.4959,  0.5857, -0.1341, -0.2850,  0.2470
-0.3394,  0.3947, -0.4627,  0.6166, -0.4094,  0.5325
 0.7107,  0.7768, -0.6312,  0.1707,  0.7964,  0.2757
-0.1078,  0.8437, -0.4420,  0.2177,  0.3649,  0.4028
-0.3139,  0.5595, -0.6505, -0.3161, -0.7108,  0.5546
 0.4335,  0.3986,  0.3770, -0.4932,  0.3847,  0.1810
-0.2562, -0.2894, -0.8847,  0.2633,  0.4146,  0.4036
 0.2272,  0.2966, -0.6601, -0.7011,  0.0284,  0.2778
-0.0743, -0.1421, -0.0054, -0.6770, -0.3151,  0.3597
-0.4762,  0.6891,  0.6007, -0.1467,  0.2140,  0.4266
-0.4061,  0.7193,  0.3432,  0.2669, -0.7505,  0.6147
-0.0588,  0.9731,  0.8966,  0.2902, -0.6966,  0.4955
-0.0627, -0.1439,  0.1985,  0.6999,  0.5022,  0.3077
 0.1587,  0.8494, -0.8705,  0.9827, -0.8940,  0.4263
-0.7850,  0.2473, -0.9040, -0.4308, -0.8779,  0.7199
 0.4070,  0.3369, -0.2428, -0.6236,  0.4940,  0.2215
-0.0242,  0.0513, -0.9430,  0.2885, -0.2987,  0.3947
-0.5416, -0.1322, -0.2351, -0.0604,  0.9590,  0.3683
 0.1055,  0.7783, -0.2901, -0.5090,  0.8220,  0.2984
-0.9129,  0.9015,  0.1128, -0.2473,  0.9901,  0.4776
-0.9378,  0.1424, -0.6391,  0.2619,  0.9618,  0.5368
 0.7498, -0.0963,  0.4169,  0.5549, -0.0103,  0.1614
-0.2612, -0.7156,  0.4538, -0.0460, -0.1022,  0.3717
 0.7720,  0.0552, -0.1818, -0.4622, -0.8560,  0.1685
-0.4177,  0.0070,  0.9319, -0.7812,  0.3461,  0.3052
-0.0001,  0.5542, -0.7128, -0.8336, -0.2016,  0.3803
 0.5356, -0.4194, -0.5662, -0.9666, -0.2027,  0.1776
-0.2378,  0.3187, -0.8582, -0.6948, -0.9668,  0.5474
-0.1947, -0.3579,  0.1158,  0.9869,  0.6690,  0.2992
 0.3992,  0.8365, -0.9205, -0.8593, -0.0520,  0.3154
-0.0209,  0.0793,  0.7905, -0.1067,  0.7541,  0.1864
-0.4928, -0.4524, -0.3433,  0.0951, -0.5597,  0.6261
-0.8118,  0.7404, -0.5263, -0.2280,  0.1431,  0.6349
 0.0516, -0.8480,  0.7483,  0.9023,  0.6250,  0.1959
-0.3212,  0.1093,  0.9488, -0.3766,  0.3376,  0.2735
-0.3481,  0.5490, -0.3484,  0.7797,  0.5034,  0.4379
-0.5785, -0.9170, -0.3563, -0.9258,  0.3877,  0.4121
 0.3407, -0.1391,  0.5356,  0.0720, -0.9203,  0.3458
-0.3287, -0.8954,  0.2102,  0.0241,  0.2349,  0.3247
-0.1353,  0.6954, -0.0919, -0.9692,  0.7461,  0.3338
 0.9036, -0.8982, -0.5299, -0.8733, -0.1567,  0.1187
 0.7277, -0.8368, -0.0538, -0.7489,  0.5458,  0.0830
 0.9049,  0.8878,  0.2279,  0.9470, -0.3103,  0.2194
 0.7957, -0.1308, -0.5284,  0.8817,  0.3684,  0.2172
 0.4647, -0.4931,  0.2010,  0.6292, -0.8918,  0.3371
-0.7390,  0.6849,  0.2367,  0.0626, -0.5034,  0.7039
-0.1567, -0.8711,  0.7940, -0.5932,  0.6525,  0.1710
 0.7635, -0.0265,  0.1969,  0.0545,  0.2496,  0.1445
 0.7675,  0.1354, -0.7698, -0.5460,  0.1920,  0.1728
-0.5211, -0.7372, -0.6763,  0.6897,  0.2044,  0.5217
 0.1913,  0.1980,  0.2314, -0.8816,  0.5006,  0.1998
 0.8964,  0.0694, -0.6149,  0.5059, -0.9854,  0.1825
 0.1767,  0.7104,  0.2093,  0.6452,  0.7590,  0.2832
-0.3580, -0.7541,  0.4426, -0.1193, -0.7465,  0.5657
-0.5996,  0.5766, -0.9758, -0.3933, -0.9572,  0.6800
 0.9950,  0.1641, -0.4132,  0.8579,  0.0142,  0.2003
-0.4717, -0.3894, -0.2567, -0.5111,  0.1691,  0.4266
 0.3917, -0.8561,  0.9422,  0.5061,  0.6123,  0.1212
-0.0366, -0.1087,  0.3449, -0.1025,  0.4086,  0.2475
 0.3633,  0.3943,  0.2372, -0.6980,  0.5216,  0.1925
-0.5325, -0.6466, -0.2178, -0.3589,  0.6310,  0.3568
 0.2271,  0.5200, -0.1447, -0.8011, -0.7699,  0.3128
 0.6415,  0.1993,  0.3777, -0.0178, -0.8237,  0.2181
-0.5298, -0.0768, -0.6028, -0.9490,  0.4588,  0.4356
 0.6870, -0.1431,  0.7294,  0.3141,  0.1621,  0.1632
-0.5985,  0.0591,  0.7889, -0.3900,  0.7419,  0.2945
 0.3661,  0.7984, -0.8486,  0.7572, -0.6183,  0.3449
 0.6995,  0.3342, -0.3113, -0.6972,  0.2707,  0.1712
 0.2565,  0.9126,  0.1798, -0.6043, -0.1413,  0.2893
-0.3265,  0.9839, -0.2395,  0.9854,  0.0376,  0.4770
 0.2690, -0.1722,  0.9818,  0.8599, -0.7015,  0.3954
-0.2102, -0.0768,  0.1219,  0.5607, -0.0256,  0.3949
 0.8216, -0.9555,  0.6422, -0.6231,  0.3715,  0.0801
-0.2896,  0.9484, -0.7545, -0.6249,  0.7789,  0.4370
-0.9985, -0.5448, -0.7092, -0.5931,  0.7926,  0.5402

Test data:

# synthetic_test_40.txt
#
 0.7462,  0.4006, -0.0590,  0.6543, -0.0083,  0.1935
 0.8495, -0.2260, -0.0142, -0.4911,  0.7699,  0.1078
-0.2335, -0.4049,  0.4352, -0.6183, -0.7636,  0.5088
 0.1810, -0.5142,  0.2465,  0.2767, -0.3449,  0.3136
-0.8650,  0.7611, -0.0801,  0.5277, -0.4922,  0.7140
-0.2358, -0.7466, -0.5115, -0.8413, -0.3943,  0.4533
 0.4834,  0.2300,  0.3448, -0.9832,  0.3568,  0.1360
-0.6502, -0.6300,  0.6885,  0.9652,  0.8275,  0.3046
-0.3053,  0.5604,  0.0929,  0.6329, -0.0325,  0.4756
-0.7995,  0.0740, -0.2680,  0.2086,  0.9176,  0.4565
-0.2144, -0.2141,  0.5813,  0.2902, -0.2122,  0.4119
-0.7278, -0.0987, -0.3312, -0.5641,  0.8515,  0.4438
 0.3793,  0.1976,  0.4933,  0.0839,  0.4011,  0.1905
-0.8568,  0.9573, -0.5272,  0.3212, -0.8207,  0.7415
-0.5785,  0.0056, -0.7901, -0.2223,  0.0760,  0.5551
 0.0735, -0.2188,  0.3925,  0.3570,  0.3746,  0.2191
 0.1230, -0.2838,  0.2262,  0.8715,  0.1938,  0.2878
 0.4792, -0.9248,  0.5295,  0.0366, -0.9894,  0.3149
-0.4456,  0.0697,  0.5359, -0.8938,  0.0981,  0.3879
 0.8629, -0.8505, -0.4464,  0.8385,  0.5300,  0.1769
 0.1995,  0.6659,  0.7921,  0.9454,  0.9970,  0.2330
-0.0249, -0.3066, -0.2927, -0.4923,  0.8220,  0.2437
 0.4513, -0.9481, -0.0770, -0.4374, -0.9421,  0.2879
-0.3405,  0.5931, -0.3507, -0.3842,  0.8562,  0.3987
 0.9538,  0.0471,  0.9039,  0.7760,  0.0361,  0.1706
-0.0887,  0.2104,  0.9808,  0.5478, -0.3314,  0.4128
-0.8220, -0.6302,  0.0537, -0.1658,  0.6013,  0.4306
-0.4123, -0.2880,  0.9074, -0.0461, -0.4435,  0.5144
 0.0060,  0.2867, -0.7775,  0.5161,  0.7039,  0.3599
-0.7968, -0.5484,  0.9426, -0.4308,  0.8148,  0.2979
 0.7811,  0.8450, -0.6877,  0.7594,  0.2640,  0.2362
-0.6802, -0.1113, -0.8325, -0.6694, -0.6056,  0.6544
 0.3821,  0.1476,  0.7466, -0.5107,  0.2592,  0.1648
 0.7265,  0.9683, -0.9803, -0.4943, -0.5523,  0.2454
-0.9049, -0.9797, -0.0196, -0.9090, -0.4433,  0.6447
-0.4607,  0.1811, -0.2389,  0.4050, -0.0078,  0.5229
 0.2664, -0.2932, -0.4259, -0.7336,  0.8742,  0.1834
-0.4507,  0.1029, -0.6294, -0.1158, -0.6294,  0.6081
 0.8948, -0.0124,  0.9278,  0.2899, -0.0314,  0.1534
-0.1323, -0.8813, -0.0146, -0.0697,  0.6135,  0.2386
Posted in Machine Learning, Scikit | Leave a comment