Implementing a StandardScaler for Machine Learning Regression Using C#

I rarely use L2 regularization for linear regression (“ridge regression”) and I never use L1 regularization for linear regression (“lasso regression”). But, in the rare situations when I use L2 ridge regression, it’s necessary (in theory) to scale the training and test X predictor data so that all model weights are treated equally by L2 weight reduction.
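
To make the scaling argument concrete, here is one common way to write the ridge regression objective. The notation is an assumption for illustration: n is the number of training rows, d is the number of predictors, w_j is the weight for predictor j, and alpha is the L2 regularization strength.

```latex
\[
\text{loss}(w, b) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 + \alpha \sum_{j=1}^{d} w_j^2
\]
```

The penalty term depends only on the raw magnitude of each weight. A predictor with large values tends to get a small weight and so is penalized lightly, while a predictor with small values tends to get a large weight and is penalized heavily. Putting every column on the same scale removes that imbalance.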

One morning I realized it had been many months since I used a C# standard scaler, so I implemented one. A standard scaler converts each column of training data into a new column where the values have a mean of 0.0 (“centered”) and variance of 1.0 (“unit variance”).

This is done using the computation xij’ = (xij – u) / sd, where xij’ is the scaled value at row i col j, xij is the unscaled value at row i col j, u is the mean of the column j values, and sd is the (population) standard deviation of the column j values. Population here means the sum of squared deviations is divided by n, not n-1.
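
As a quick sanity check of the formula, here is a minimal sketch that scales one small made-up column and then recomputes its mean and variance. The class name ScaleOneColumn and the five input values are assumptions for illustration and are not part of the demo program.

```csharp
using System;

public static class ScaleOneColumn
{
  // mean of the column values
  public static double Mean(double[] col)
  {
    double sum = 0.0;
    for (int i = 0; i < col.Length; ++i)
      sum += col[i];
    return sum / col.Length;
  }

  // population variance: divide by n, not n-1
  public static double Variance(double[] col, double u)
  {
    double sum = 0.0;
    for (int i = 0; i < col.Length; ++i)
      sum += (col[i] - u) * (col[i] - u);
    return sum / col.Length;
  }

  // x' = (x - u) / sd for every value in the column
  public static double[] Scale(double[] col)
  {
    double u = Mean(col);
    double sd = Math.Sqrt(Variance(col, u));
    double[] result = new double[col.Length];
    for (int i = 0; i < col.Length; ++i)
      result[i] = (col[i] - u) / sd;
    return result;
  }

  public static void Main()
  {
    // made-up column: mean = 3.0, population variance = 2.0
    double[] col = new double[] { 1.0, 2.0, 3.0, 4.0, 5.0 };
    double[] scaled = Scale(col);
    double m = Mean(scaled);
    double v = Variance(scaled, m);
    Console.WriteLine("scaled mean     = " + m.ToString("F4"));
    Console.WriteLine("scaled variance = " + v.ToString("F4"));
  }
}
```

The scaled column should have a mean of roughly 0.0 and a variance of roughly 1.0, up to floating point rounding.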

The output of my demo:

Begin StandardScaler with C# demo

Loading synthetic train (50) data

First three train X:
 -0.1660  0.4406 -0.9998 -0.3953 -0.7065
  0.0776 -0.1616  0.3704 -0.5911  0.7562
 -0.9452  0.3409 -0.1654  0.1174 -0.7192

First three train y:
  0.4840
  0.1568
  0.8054

Creating standard scaler for trainX
Done

Column means:
 -0.0217  0.0201 -0.0283  0.0532  0.1676
Column variances:
  0.3665  0.3349  0.3905  0.2815  0.3713

Transforming data
Done

First three scaled train X:
 -0.2383  0.7266 -1.5546 -0.8454 -1.4346
  0.1640 -0.3140  0.6380 -1.2145  0.9661
 -1.5254  0.5543 -0.2194  0.1209 -1.4554

Unscaling scaled data

First three original train X:
 -0.1660  0.4406 -0.9998 -0.3953 -0.7065
  0.0776 -0.1616  0.3704 -0.5911  0.7562
 -0.9452  0.3409 -0.1654  0.1174 -0.7192

End demo

I used the scikit-learn library StandardScaler as a design guide for my C# class. My class stores column variances rather than column standard deviations and takes the square root when scaling. The scikit-learn scaler exposes both (as var_ and scale_), so either choice works and the difference isn’t important.

The key calling statements are:

// read trainX matrix from file
StandardScaler scaler = new StandardScaler();
scaler.Fit(trainX); 
double[][] scaledTrainX = scaler.Transform(trainX);

In a non-demo scenario, you’d also scale the test data using the scaler that was fitted to the training data, so that there’s no data leakage.
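
The pattern can be sketched as follows. This is a minimal, self-contained illustration and not the demo program: MiniScaler is a hypothetical compact stand-in for the StandardScaler class, and the train and test values are made up.

```csharp
using System;

// Hypothetical compact stand-in for the demo's StandardScaler
public class MiniScaler
{
  public double[] means = new double[0];
  public double[] variances = new double[0];

  // compute per-column mean and population variance
  public void Fit(double[][] X)
  {
    int n = X.Length;
    int dim = X[0].Length;
    means = new double[dim];
    variances = new double[dim];
    for (int j = 0; j < dim; ++j)
    {
      double sum = 0.0;
      for (int i = 0; i < n; ++i)
        sum += X[i][j];
      means[j] = sum / n;

      double ss = 0.0;
      for (int i = 0; i < n; ++i)
        ss += (X[i][j] - means[j]) * (X[i][j] - means[j]);
      variances[j] = ss / n;
    }
  }

  // apply the stored statistics: x' = (x - u) / sd
  public double[][] Transform(double[][] X)
  {
    double[][] result = new double[X.Length][];
    for (int i = 0; i < X.Length; ++i)
    {
      result[i] = new double[X[i].Length];
      for (int j = 0; j < X[i].Length; ++j)
        result[i][j] =
          (X[i][j] - means[j]) / Math.Sqrt(variances[j]);
    }
    return result;
  }
}

public static class TrainTestDemo
{
  public static void Main()
  {
    // made-up values, for illustration only
    double[][] trainX = new double[][] {
      new double[] { 1.0, 10.0 },
      new double[] { 3.0, 30.0 } };
    double[][] testX = new double[][] {
      new double[] { 2.0, 40.0 } };

    MiniScaler scaler = new MiniScaler();
    scaler.Fit(trainX);  // statistics come from training data only
    double[][] scaledTrainX = scaler.Transform(trainX);
    double[][] scaledTestX = scaler.Transform(testX);  // no second Fit()

    Console.WriteLine(scaledTestX[0][0].ToString("F4") + " " +
      scaledTestX[0][1].ToString("F4"));
  }
}
```

Because the test row is scaled with the training means (2.0 and 20.0) and training standard deviations (1.0 and 10.0), it maps to (0.0, 2.0). Scaled test values are not expected to have mean 0 and variance 1 on their own.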

Good fun.



Galaxy Science Fiction was published from 1950 to 1980. It was the leading science fiction magazine of its time. The magazine had a standardized cover format with beautiful art by a few dozen different artists. Here are three nice covers by artist Mel Hunter (1927-2004).


Demo program. My blog editor chokes on symbols, so if any “lt” (less than), “gt”, “lte”, “gte” placeholders appear in the code, replace them with the corresponding operator symbols.

using System;
using System.Collections.Generic;
using System.IO;

namespace Scaler
{
  internal class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin StandardScaler" +
        " with C# demo ");

      Console.WriteLine("\nLoading synthetic train" +
        " (50) data");
      string trainFile =
        "..\\..\\..\\Data\\synthetic_train_50.txt";
      int[] colsX = new int[] { 0, 1, 2, 3, 4 };
      double[][] trainX =
        MatLoad(trainFile, colsX, ',', "#");
      double[] trainY =
        MatToVec(MatLoad(trainFile,
        new int[] { 5 }, ',', "#"));

      Console.WriteLine("\nFirst three train X: ");
      for (int i = 0; i < 3; ++i)
        VecShow(trainX[i], 4, 8);

      Console.WriteLine("\nFirst three train y: ");
      for (int i = 0; i < 3; ++i)
        Console.WriteLine(trainY[i].ToString("F4").
          PadLeft(8));

      Console.WriteLine("\nCreating standard " +
        "scaler for trainX ");
      StandardScaler scaler = new StandardScaler();
      scaler.Fit(trainX);
      Console.WriteLine("Done ");

      Console.WriteLine("\nColumn means: ");
      VecShow(scaler.means, 4, 8);
      Console.WriteLine("Column variances: ");
      VecShow(scaler.variances, 4, 8);

      Console.WriteLine("\nTransforming data ");
      double[][] scaledTrainX = scaler.Transform(trainX);
      Console.WriteLine("Done ");

      Console.WriteLine("\nFirst three scaled train X: ");
      for (int i = 0; i < 3; ++i)
        VecShow(scaledTrainX[i], 4, 8);

      Console.WriteLine("\nUnscaling scaled data ");
      double[][] origX = 
        scaler.InverseTransform(scaledTrainX);
     
      Console.WriteLine("\nFirst three original train X: ");
      for (int i = 0; i < 3; ++i)
        VecShow(origX[i], 4, 8);

      Console.WriteLine("\nEnd demo ");
      Console.ReadLine();
    } // Main

    // ------------------------------------------------------
    // helpers for Main()
    // ------------------------------------------------------

    static double[][] MatLoad(string fn, int[] usecols,
      char sep, string comment)
    {
      List<double[]> result =
        new List<double[]>();
      string line = "";
      FileStream ifs = new FileStream(fn, FileMode.Open);
      StreamReader sr = new StreamReader(ifs);
      while ((line = sr.ReadLine()) != null)
      {
        // skip comment lines and blank lines
        if (line.StartsWith(comment) == true ||
          line.Trim().Length == 0)
          continue;
        string[] tokens = line.Split(sep);
        List<double> lst = new List<double>();
        for (int j = 0; j < usecols.Length; ++j)
          lst.Add(double.Parse(tokens[usecols[j]]));
        double[] row = lst.ToArray();
        result.Add(row);
      }
      sr.Close(); ifs.Close();
      return result.ToArray();
    }

    static double[] MatToVec(double[][] M)
    {
      int nRows = M.Length;
      int nCols = M[0].Length;
      double[] result = new double[nRows * nCols];
      int k = 0;
      for (int i = 0; i < nRows; ++i)
        for (int j = 0; j < nCols; ++j)
          result[k++] = M[i][j];
      return result;
    }

    static void VecShow(double[] vec, int dec, int wid)
    {
      for (int i = 0; i < vec.Length; ++i)
        Console.Write(vec[i].ToString("F" + dec).
          PadLeft(wid));
      Console.WriteLine("");
    }

  } // Program

  // ========================================================

  public class StandardScaler
  {
    public double[] means;
    public double[] variances;

    public StandardScaler()
    {
      this.means = new double[0];
      this.variances = new double[0];
    }

    // ------------------------------------------------------

    public void Fit(double[][] dataX)
    {
      int n = dataX.Length;
      int dim = dataX[0].Length;

      this.means = new double[dim];
      this.variances = new double[dim];

      // column means
      for (int j = 0; j < dim; ++j) // each col
      {
        double sum = 0.0;
        for (int i = 0; i < n; ++i)
          sum += dataX[i][j];
        this.means[j] = sum / n;
      }

      // column population variances (divide by n)
      for (int j = 0; j < dim; ++j) // each col
      {
        double sum = 0.0;
        for (int i = 0; i < n; ++i)
          sum += (dataX[i][j] - this.means[j]) *
            (dataX[i][j] - this.means[j]);
        this.variances[j] = sum / n;
      }
    }

    // ------------------------------------------------------

    public double[][] Transform(double[][] dataX)
    {
      // x' = (x - u) / sd
      int n = dataX.Length;
      int dim = dataX[0].Length;

      double[][] result = new double[n][];
      for (int i = 0; i < n; ++i)
        result[i] = new double[dim];

      for (int j = 0; j < dim; ++j)
      {
        double u = this.means[j];
        double sd = Math.Sqrt(this.variances[j]);
        if (sd == 0.0) sd = 1.0; // constant column: avoid 0/0
        for (int i = 0; i < n; ++i)
          result[i][j] = (dataX[i][j] - u) / sd;
      }
      return result;
    }

    // ------------------------------------------------------

    public double[][] InverseTransform(double[][] scaledX)
    {
      // x = (x' * sd) + u

      int n = scaledX.Length;
      int dim = scaledX[0].Length;

      double[][] result = new double[n][];
      for (int i = 0; i < n; ++i)
        result[i] = new double[dim];

      for (int j = 0; j < dim; ++j)
      {
        double u = this.means[j];
        double sd = Math.Sqrt(this.variances[j]);
        if (sd == 0.0) sd = 1.0; // constant column: centered only
        for (int i = 0; i < n; ++i)
          result[i][j] = (scaledX[i][j] * sd) + u;
      }
      return result;
    }

    // ------------------------------------------------------

  } // class StandardScaler

  // ========================================================

} // ns

Demo training data:

# synthetic_train_50.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399