Gradient Boosting Regression From Scratch Using Python

Gradient boosting regression (GBR) is a common machine learning technique to predict a single numeric value. The technique combines many small decision trees (typically 100 or 200) to produce a final prediction. One morning before work, I figured I’d put together a quick demo of GBR from scratch using Python. By “scratch” I mean the both the top-level MyGradientBoostingRegressor class and the helper MyDecisionTreeRegressor class are implemented without any external dependencies.

My implementation is based mostly on some Google search AI-generated code. That AI code appears to be closely related to an old blog post by a guy named Tomonori Masui, and his blog post appears to be closely related to a now-vanished blog post by a guy named Matt Bowers.

Gradient boosting is simple but very subtle. As each tree in the collection is constructed, it predicts the residuals of the previous tree. A residual is the difference between actual y and predicted y. Mathematically, a residual is (almost) a gradient, hence the name of the technique. For a given input, a prediction starts out as the average of the target y values. The predicted residuals of each tree — some will be negative and some will be positive — are accumulated, and the final sum is the predicted y value. The idea is very subtle.

For my demo, I used a set of synthetic data that I generated using a neural network with random weights and biases. The data looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065, 0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562, 0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192, 0.8054
. . .

The first five values on each line are the predictors. The sixth value is the target to predict. All predictor values are between -1.0 and 1.0. Normalizing the predictor values is not necessary but is helpful when using the data with other regression techniques that require normalization (such as k-nearest neighbors regression). There are 200 items in the training data and 40 items in the test data.

The output of the from-scratch gradient boosting regression demo program is:

Begin gradient boost regression from scratch

Loading synthetic train (200), test (40) data
Done

First three X predictors:
[-0.1660  0.4406 -0.9998 -0.3953 -0.7065]
[ 0.0776 -0.1616  0.3704 -0.5911  0.7562]
[-0.9452  0.3409 -0.1654  0.1174 -0.7192]

First three y targets:
0.4840
0.1568
0.8054

Setting n_estimators = 200
Setting max_depth = 3
Setting min_samples = 2
Setting lrn_rate = 0.0500

Training model
Done

Accuracy train (within 0.10): 0.9850
Accuracy test (within 0.10): 0.7000

MSE train: 0.0001
MSE test: 0.0011

Predicting for:
[[-0.1660  0.4406 -0.9998 -0.3953 -0.7065]]
Predicted y = 0.4862
==============================

Using scikit GradientBoostingRegressor:

Accuracy train (within 0.10): 0.9850
Accuracy test (within 0.10): 0.6500

MSE train: 0.0001
MSE test: 0.0012
Predicting for:
[[-0.1660  0.4406 -0.9998 -0.3953 -0.7065]]
Predicted y = 0.4862

End demo

I implemented an accuracy() function that scores a prediction as correct if it’s within a specified percentage of the true target y value. I specified 10% as an arbitrary percentage to use.

I ran the synthetic data through the built-in scikit GradientBoostingRegressor module. Using the same model parameters — n_estimators = 200, max_depth = 3, min_samples = 2, lrn_rate = 0.0500 — the built-in scikit module produced very similar results. (The difference is due to the way the underlying decision trees work when computing splits).

There are several advanced variations of gradient boosting regression libraries available. XGBoost (“Extreme Gradient Boost”) and LightGBM are two I’ve used.

Good fun and a good way to start my day.

I’ve always been fascinated by mathematical predictive models. I stumbled across a person who made a fantastic 1/12 scale model of the 221B Baker Street home of Sherlock Holmes. Absolutely beautiful. These images don’t do justice to the incredible detail. Click to enlarge.

Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols. (My blog editor chokes on symbols).

# gradient_boost_regression_scratch.py

import numpy as np
# from sklearn.tree import DecisionTreeRegressor  # validation

# ===========================================================

class MyGradientBoostingRegressor:
  def __init__(self, lrn_rate, n_estimators,
    max_depth=2, min_samples=2, seed=0):
    self.lrn_rate = lrn_rate
    self.n_estimators = n_estimators
    self.max_depth = max_depth
    self.min_samples = min_samples
    self.trees = []
    self.pred_0 = 0.0
    self.seed = seed  # passed to base decision tree
        
  def fit(self, X, y):
    self.pred_0 = y.mean()  # initial prediction(s)
    preds = np.full(len(X), self.pred_0)
    for t in range(self.n_estimators):  # each tree
      gradients = preds - y

      # using scikit trees
      # dtr = DecisionTreeRegressor(max_depth=self.max_depth,
      #   min_samples_split=self.min_samples, random_state=0)

      # using from-scratch trees
      dtr = MyDecisionTreeRegressor(max_depth=self.max_depth,
        min_samples=self.min_samples, seed=self.seed)

      dtr.fit(X, gradients)
      gammas = dtr.predict(X)  # update vals
      preds -= self.lrn_rate * gammas
      self.trees.append(dtr)
            
  def predict(self, X):
    results = np.full(len(X), self.pred_0)
    for i in range(self.n_estimators):
      results -= self.lrn_rate * self.trees[i].predict(X)
    return results  # sum of predictions on residuals

# ===========================================================
# ===========================================================

class MyDecisionTreeRegressor:  # avoid scikit name collision
  # if max_depth = n, tree has at most 2^(n+1) - 1 nodes.

  def __init__(self, max_depth=3, min_samples=2,
    n_split_cols=-1, seed=0):
    self.max_depth = max_depth
    self.min_samples = min_samples # aka min_samples_split
    self.n_split_cols = n_split_cols  # mostly random forest
    self.root = None
    self.rnd = np.random.RandomState(seed) # split col order

  # ===============================================

  class Node:
    def __init__(self, id=0, col_idx=-1, thresh=0.0,
        left=None, right=None, value=0.0, is_leaf=False):
      self.id = id  # useful for debugging
      self.col_idx = col_idx
      self.thresh = thresh
      self.left = left
      self.right = right
      self.value = value
      self.is_leaf = is_leaf  # False for an in-node

  # ===============================================

  def best_split(self, X, y):
    best_col_idx = -1  # indicates a bad split
    best_thresh = 0.0
    best_mse = np.inf  # smaller is better
    n_rows, n_cols = X.shape

    rnd_cols = np.arange(n_cols)
    self.rnd.shuffle(rnd_cols)
    if self.n_split_cols != -1:  # just use some cols
      rnd_cols = rnd_cols[0:self.n_split_cols]

    for j in range(len(rnd_cols)):
      col_idx = rnd_cols[j]
      examined_threshs = set()
      for i in range(n_rows):
        thresh = X[i][col_idx]  # candidate threshold value

        if thresh in examined_threshs == True:
          continue
        examined_threshs.add(thresh)

        # get rows where x is lte, gt thresh
        left_idxs = np.where(X[:,col_idx] "lte" thresh)[0]
        right_idxs = np.where(X[:,col_idx] "gt" thresh)[0]

        # check proposed split
        if len(left_idxs) == 0 or \
          len(right_idxs) == 0:
          continue

        # get left and right y values
        left_y_vals = y[left_idxs]  # not empty
        right_y_vals = y[right_idxs]  # not empty

        # compute proposed split MSE
        mse_left = self.vector_mse(left_y_vals)
        mse_right = self.vector_mse(right_y_vals)
        split_mse = (len(left_y_vals) * mse_left + \
          len(right_y_vals) * mse_right) / n_rows

        if split_mse "lt" best_mse:
          best_col_idx = col_idx
          best_thresh = thresh
          best_mse = split_mse          

    return best_col_idx, best_thresh  # -1 is bad/no split

  # ---------------------------------------------------------

  def vector_mse(self, y):  # variance but called MSE
    if len(y) == 0:
      return 0.0  # should never get here
    return np.var(y)

  # ---------------------------------------------------------

  def make_tree(self, X, y):
    root = self.Node()  # is_leaf is False
    stack = [(root, X, y, 0)]  # curr depth = 0

    while (len(stack) "gt" 0):
      curr_node, curr_X, curr_y, curr_depth = stack.pop()

      if curr_depth == self.max_depth or \
        len(curr_y) "lt" self.min_samples:
        curr_node.value = np.mean(curr_y)
        curr_node.is_leaf = True
        continue

      col_idx, thresh = self.best_split(curr_X, curr_y) 

      if col_idx == -1:  # cannot split
        curr_node.value = np.mean(curr_y)
        curr_node.is_leaf = True
        continue
      
      # got a good split so at an internal, non-leaf node
      curr_node.col_idx = col_idx
      curr_node.thresh = thresh

      # create and attach child nodes
      left_idxs = np.where(curr_X[:,col_idx] "lte" thresh)[0]
      right_idxs = np.where(curr_X[:,col_idx] "gt" thresh)[0]

      left_X = curr_X[left_idxs,:]
      left_y = curr_y[left_idxs]
      right_X = curr_X[right_idxs,:]
      right_y = curr_y[right_idxs]

      curr_node.left = self.Node(id=2*curr_node.id+1)
      stack.append((curr_node.left,
        left_X, left_y, curr_depth+1))

      curr_node.right = self.Node(id=2*curr_node.id+2)
      stack.append((curr_node.right,
        right_X, right_y, curr_depth+1))
      
    return root

  # ---------------------------------------------------------

  def fit(self, X, y):
    self.root = self.make_tree(X, y)

  # ---------------------------------------------------------

  def predict_one(self, x):
    curr = self.root
    while curr.is_leaf == False:
      if x[curr.col_idx] "lte" curr.thresh:
        curr = curr.left
      else:
        curr = curr.right
    return curr.value

  def predict(self, X):  # scikit always uses a matrix input
    result = np.zeros(len(X), dtype=np.float64)
    for i in range(len(X)):
      result[i] = self.predict_one(X[i])
    return result

  # ---------------------------------------------------------

# ===========================================================
# ===========================================================

# -----------------------------------------------------------

def accuracy(model, data_X, data_y, pct_close):
  n = len(data_X)
  n_correct = 0; n_wrong = 0
  for i in range(n):
    x = data_X[i].reshape(1,-1)
    y = data_y[i]
    y_pred = model.predict(x)

    if np.abs(y - y_pred) "lt" np.abs(y * pct_close):
      n_correct += 1
    else: 
      n_wrong += 1
  # print("Correct = " + str(n_correct))
  # print("Wrong   = " + str(n_wrong))
  return n_correct / (n_correct + n_wrong)

# -----------------------------------------------------------

def MSE(model, data_X, data_y):
  n = len(data_X)
  sum = 0.0
  for i in range(n):
    x = data_X[i].reshape(1,-1)
    y = data_y[i]
    y_pred = model.predict(x)
    sum += (y - y_pred) * (y - y_pred)

  return sum / n

# -----------------------------------------------------------

def main():
  print("\nBegin gradient boost regression from scratch ")
  np.set_printoptions(precision=4, suppress=True,
    floatmode='fixed')
  np.random.seed(0)  # not used this version

  # 1. load data
  print("\nLoading synthetic train (200), test (40) data ")
  train_file = ".\\Data\\synthetic_train_200.txt"
  # -0.1660,0.4406,-0.9998,-0.3953,-0.7065,0.4840
  #  0.0776,-0.1616,0.3704,-0.5911,0.7562,0.1568
  # -0.9452,0.3409,-0.1654,0.1174,-0.7192,0.8054
  # . . .

  train_X = np.loadtxt(train_file, comments="#",
    usecols=[0,1,2,3,4],
    delimiter=",",  dtype=np.float64)
  train_y = np.loadtxt(train_file, comments="#", usecols=5,
    delimiter=",",  dtype=np.float64)

  test_file = ".\\Data\\synthetic_test_40.txt"
  test_X = np.loadtxt(test_file, comments="#",
    usecols=[0,1,2,3,4],
    delimiter=",",  dtype=np.float64)
  test_y = np.loadtxt(test_file, comments="#", usecols=5,
    delimiter=",",  dtype=np.float64)
  print("Done ")

  print("\nFirst three X predictors: ")
  for i in range(3):
    print(train_X[i])
  print("\nFirst three y targets: ")
  for i in range(3):
    print("%0.4f" % train_y[i])

  # 2. create and train model
  n_estimators = 200
  max_depth = 3
  min_samples = 2
  lrn_rate = 0.05

  print("\nSetting n_estimators = " + str(n_estimators))
  print("Setting max_depth = " + str(max_depth))
  print("Setting min_samples = " + str(min_samples))
  print("Setting lrn_rate = %0.4f " % lrn_rate)
  print("\nTraining model ")

  model = MyGradientBoostingRegressor(lrn_rate,
    n_estimators, max_depth, min_samples) 
  model.fit(train_X, train_y)
  print("Done ")

  # 3. evaluate model
  acc_train = accuracy(model, train_X, train_y, 0.10)
  print("\nAccuracy train (within 0.10): %0.4f " % acc_train)
  acc_test = accuracy(model, test_X, test_y, 0.10)
  print("Accuracy test (within 0.10): %0.4f " % acc_test)

  mse_train = MSE(model, train_X, train_y)
  print("\nMSE train: %0.4f " % mse_train)
  mse_test = MSE(model, test_X, test_y)
  print("MSE test: %0.4f " % mse_test)

  # 4. use model
  x = train_X[0].reshape(1,-1)
  print("\nPredicting for: ")
  print(x)
  y_pred = model.predict(x)
  print("Predicted y = %0.4f " % y_pred)

  print("\n============================== ")

  print("\nUsing scikit GradientBoostingRegressor: ")
  from sklearn.ensemble import GradientBoostingRegressor
  sk_gbm = GradientBoostingRegressor(n_estimators=200, 
    learning_rate=0.05, max_depth=max_depth,
    min_samples_split=min_samples, random_state=0)
  sk_gbm.fit(train_X, train_y)

  acc_train = accuracy(sk_gbm, train_X, train_y, 0.10)
  print("\nAccuracy train (within 0.10): %0.4f " % acc_train)
  acc_test = accuracy(sk_gbm, test_X, test_y, 0.10)
  print("Accuracy test (within 0.10): %0.4f " % acc_test)

  mse_train = MSE(sk_gbm, train_X, train_y)
  print("\nMSE train: %0.4f " % mse_train)
  mse_test = MSE(sk_gbm, test_X, test_y)
  print("MSE test: %0.4f " % mse_test)

  print("\nPredicting for: ")
  print(x)
  y_pred = sk_gbm.predict(x)
  print("Predicted y = %0.4f " % y_pred)

  print("\nEnd demo ")

if __name__ == "__main__":
  main()

Training data:

# synthetic_train_200.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399
-0.4858, -0.7782, -0.6141, -0.0008,  0.4572,  0.4197
 0.7033, -0.1683,  0.2334, -0.5327, -0.7961,  0.1776
 0.0317, -0.0457, -0.6947,  0.2436,  0.0880,  0.3345
 0.5031, -0.5559,  0.0387,  0.5706, -0.9553,  0.3107
-0.3513,  0.7458,  0.6894,  0.0769,  0.7332,  0.3170
 0.2205,  0.5992, -0.9309,  0.5405,  0.4635,  0.3532
-0.4806, -0.4859,  0.2646, -0.3094,  0.5932,  0.3202
 0.9809, -0.3995, -0.7140,  0.8026,  0.0831,  0.1600
 0.9495,  0.2732,  0.9878,  0.0921,  0.0529,  0.1289
-0.9476, -0.6792,  0.4913, -0.9392, -0.2669,  0.5966
 0.7247,  0.3854,  0.3819, -0.6227, -0.1162,  0.1550
-0.5922, -0.5045, -0.4757,  0.5003, -0.0860,  0.5863
-0.8861,  0.0170, -0.5761,  0.5972, -0.4053,  0.7301
 0.6877, -0.2380,  0.4997,  0.0223,  0.0819,  0.1404
 0.9189,  0.6079, -0.9354,  0.4188, -0.0700,  0.1907
-0.1428, -0.7820,  0.2676,  0.6059,  0.3936,  0.2790
 0.5324, -0.3151,  0.6917, -0.1425,  0.6480,  0.1071
-0.8432, -0.9633, -0.8666, -0.0828, -0.7733,  0.7784
-0.9444,  0.5097, -0.2103,  0.4939, -0.0952,  0.6787
-0.0520,  0.6063, -0.1952,  0.8094, -0.9259,  0.4836
 0.5477, -0.7487,  0.2370, -0.9793,  0.0773,  0.1241
 0.2450,  0.8116,  0.9799,  0.4222,  0.4636,  0.2355
 0.8186, -0.1983, -0.5003, -0.6531, -0.7611,  0.1511
-0.4714,  0.6382, -0.3788,  0.9648, -0.4667,  0.5950
 0.0673, -0.3711,  0.8215, -0.2669, -0.1328,  0.2677
-0.9381,  0.4338,  0.7820, -0.9454,  0.0441,  0.5518
-0.3480,  0.7190,  0.1170,  0.3805, -0.0943,  0.4724
-0.9813,  0.1535, -0.3771,  0.0345,  0.8328,  0.5438
-0.1471, -0.5052, -0.2574,  0.8637,  0.8737,  0.3042
-0.5454, -0.3712, -0.6505,  0.2142, -0.1728,  0.5783
 0.6327, -0.6297,  0.4038, -0.5193,  0.1484,  0.1153
-0.5424,  0.3282, -0.0055,  0.0380, -0.6506,  0.6613
 0.1414,  0.9935,  0.6337,  0.1887,  0.9520,  0.2540
-0.9351, -0.8128, -0.8693, -0.0965, -0.2491,  0.7353
 0.9507, -0.6640,  0.9456,  0.5349,  0.6485,  0.1059
-0.0462, -0.9737, -0.2940, -0.0159,  0.4602,  0.2606
-0.0627, -0.0852, -0.7247, -0.9782,  0.5166,  0.2977
 0.0478,  0.5098, -0.0723, -0.7504, -0.3750,  0.3335
 0.0090,  0.3477,  0.5403, -0.7393, -0.9542,  0.4415
-0.9748,  0.3449,  0.3736, -0.1015,  0.8296,  0.4358
 0.2887, -0.9895, -0.0311,  0.7186,  0.6608,  0.2057
 0.1570, -0.4518,  0.1211,  0.3435, -0.2951,  0.3244
 0.7117, -0.6099,  0.4946, -0.4208,  0.5476,  0.1096
-0.2929, -0.5726,  0.5346, -0.3827,  0.4665,  0.2465
 0.4889, -0.5572, -0.5718, -0.6021, -0.7150,  0.2163
-0.7782,  0.3491,  0.5996, -0.8389, -0.5366,  0.6516
-0.5847,  0.8347,  0.4226,  0.1078, -0.3910,  0.6134
 0.8469,  0.4121, -0.0439, -0.7476,  0.9521,  0.1571
-0.6803, -0.5948, -0.1376, -0.1916, -0.7065,  0.7156
 0.2878,  0.5086, -0.5785,  0.2019,  0.4979,  0.2980
 0.2764,  0.1943, -0.4090,  0.4632,  0.8906,  0.2960
-0.8877,  0.6705, -0.6155, -0.2098, -0.3998,  0.7107
-0.8398,  0.8093, -0.2597,  0.0614, -0.0118,  0.6502
-0.8476,  0.0158, -0.4769, -0.2859, -0.7839,  0.7715
 0.5751, -0.7868,  0.9714, -0.6457,  0.1448,  0.1175
 0.4802, -0.7001,  0.1022, -0.5668,  0.5184,  0.1090
 0.4458, -0.6469,  0.7239, -0.9604,  0.7205,  0.0779
 0.5175,  0.4339,  0.9747, -0.4438, -0.9924,  0.2879
 0.8678,  0.7158,  0.4577,  0.0334,  0.4139,  0.1678
 0.5406,  0.5012,  0.2264, -0.1963,  0.3946,  0.2088
-0.9938,  0.5498,  0.7928, -0.5214, -0.7585,  0.7687
 0.7661,  0.0863, -0.4266, -0.7233, -0.4197,  0.1466
 0.2277, -0.3517, -0.0853, -0.1118,  0.6563,  0.1767
 0.3499, -0.5570, -0.0655, -0.3705,  0.2537,  0.1632
 0.7547, -0.1046,  0.5689, -0.0861,  0.3125,  0.1257
 0.8186,  0.2110,  0.5335,  0.0094, -0.0039,  0.1391
 0.6858, -0.8644,  0.1465,  0.8855,  0.0357,  0.1845
-0.4967,  0.4015,  0.0805,  0.8977,  0.2487,  0.4663
 0.6760, -0.9841,  0.9787, -0.8446, -0.3557,  0.1509
-0.1203, -0.4885,  0.6054, -0.0443, -0.7313,  0.4854
 0.8557,  0.7919, -0.0169,  0.7134, -0.1628,  0.2002
 0.0115, -0.6209,  0.9300, -0.4116, -0.7931,  0.4052
-0.7114, -0.9718,  0.4319,  0.1290,  0.5892,  0.3661
 0.3915,  0.5557, -0.1870,  0.2955, -0.6404,  0.2954
-0.3564, -0.6548, -0.1827, -0.5172, -0.1862,  0.4622
 0.2392, -0.4959,  0.5857, -0.1341, -0.2850,  0.2470
-0.3394,  0.3947, -0.4627,  0.6166, -0.4094,  0.5325
 0.7107,  0.7768, -0.6312,  0.1707,  0.7964,  0.2757
-0.1078,  0.8437, -0.4420,  0.2177,  0.3649,  0.4028
-0.3139,  0.5595, -0.6505, -0.3161, -0.7108,  0.5546
 0.4335,  0.3986,  0.3770, -0.4932,  0.3847,  0.1810
-0.2562, -0.2894, -0.8847,  0.2633,  0.4146,  0.4036
 0.2272,  0.2966, -0.6601, -0.7011,  0.0284,  0.2778
-0.0743, -0.1421, -0.0054, -0.6770, -0.3151,  0.3597
-0.4762,  0.6891,  0.6007, -0.1467,  0.2140,  0.4266
-0.4061,  0.7193,  0.3432,  0.2669, -0.7505,  0.6147
-0.0588,  0.9731,  0.8966,  0.2902, -0.6966,  0.4955
-0.0627, -0.1439,  0.1985,  0.6999,  0.5022,  0.3077
 0.1587,  0.8494, -0.8705,  0.9827, -0.8940,  0.4263
-0.7850,  0.2473, -0.9040, -0.4308, -0.8779,  0.7199
 0.4070,  0.3369, -0.2428, -0.6236,  0.4940,  0.2215
-0.0242,  0.0513, -0.9430,  0.2885, -0.2987,  0.3947
-0.5416, -0.1322, -0.2351, -0.0604,  0.9590,  0.3683
 0.1055,  0.7783, -0.2901, -0.5090,  0.8220,  0.2984
-0.9129,  0.9015,  0.1128, -0.2473,  0.9901,  0.4776
-0.9378,  0.1424, -0.6391,  0.2619,  0.9618,  0.5368
 0.7498, -0.0963,  0.4169,  0.5549, -0.0103,  0.1614
-0.2612, -0.7156,  0.4538, -0.0460, -0.1022,  0.3717
 0.7720,  0.0552, -0.1818, -0.4622, -0.8560,  0.1685
-0.4177,  0.0070,  0.9319, -0.7812,  0.3461,  0.3052
-0.0001,  0.5542, -0.7128, -0.8336, -0.2016,  0.3803
 0.5356, -0.4194, -0.5662, -0.9666, -0.2027,  0.1776
-0.2378,  0.3187, -0.8582, -0.6948, -0.9668,  0.5474
-0.1947, -0.3579,  0.1158,  0.9869,  0.6690,  0.2992
 0.3992,  0.8365, -0.9205, -0.8593, -0.0520,  0.3154
-0.0209,  0.0793,  0.7905, -0.1067,  0.7541,  0.1864
-0.4928, -0.4524, -0.3433,  0.0951, -0.5597,  0.6261
-0.8118,  0.7404, -0.5263, -0.2280,  0.1431,  0.6349
 0.0516, -0.8480,  0.7483,  0.9023,  0.6250,  0.1959
-0.3212,  0.1093,  0.9488, -0.3766,  0.3376,  0.2735
-0.3481,  0.5490, -0.3484,  0.7797,  0.5034,  0.4379
-0.5785, -0.9170, -0.3563, -0.9258,  0.3877,  0.4121
 0.3407, -0.1391,  0.5356,  0.0720, -0.9203,  0.3458
-0.3287, -0.8954,  0.2102,  0.0241,  0.2349,  0.3247
-0.1353,  0.6954, -0.0919, -0.9692,  0.7461,  0.3338
 0.9036, -0.8982, -0.5299, -0.8733, -0.1567,  0.1187
 0.7277, -0.8368, -0.0538, -0.7489,  0.5458,  0.0830
 0.9049,  0.8878,  0.2279,  0.9470, -0.3103,  0.2194
 0.7957, -0.1308, -0.5284,  0.8817,  0.3684,  0.2172
 0.4647, -0.4931,  0.2010,  0.6292, -0.8918,  0.3371
-0.7390,  0.6849,  0.2367,  0.0626, -0.5034,  0.7039
-0.1567, -0.8711,  0.7940, -0.5932,  0.6525,  0.1710
 0.7635, -0.0265,  0.1969,  0.0545,  0.2496,  0.1445
 0.7675,  0.1354, -0.7698, -0.5460,  0.1920,  0.1728
-0.5211, -0.7372, -0.6763,  0.6897,  0.2044,  0.5217
 0.1913,  0.1980,  0.2314, -0.8816,  0.5006,  0.1998
 0.8964,  0.0694, -0.6149,  0.5059, -0.9854,  0.1825
 0.1767,  0.7104,  0.2093,  0.6452,  0.7590,  0.2832
-0.3580, -0.7541,  0.4426, -0.1193, -0.7465,  0.5657
-0.5996,  0.5766, -0.9758, -0.3933, -0.9572,  0.6800
 0.9950,  0.1641, -0.4132,  0.8579,  0.0142,  0.2003
-0.4717, -0.3894, -0.2567, -0.5111,  0.1691,  0.4266
 0.3917, -0.8561,  0.9422,  0.5061,  0.6123,  0.1212
-0.0366, -0.1087,  0.3449, -0.1025,  0.4086,  0.2475
 0.3633,  0.3943,  0.2372, -0.6980,  0.5216,  0.1925
-0.5325, -0.6466, -0.2178, -0.3589,  0.6310,  0.3568
 0.2271,  0.5200, -0.1447, -0.8011, -0.7699,  0.3128
 0.6415,  0.1993,  0.3777, -0.0178, -0.8237,  0.2181
-0.5298, -0.0768, -0.6028, -0.9490,  0.4588,  0.4356
 0.6870, -0.1431,  0.7294,  0.3141,  0.1621,  0.1632
-0.5985,  0.0591,  0.7889, -0.3900,  0.7419,  0.2945
 0.3661,  0.7984, -0.8486,  0.7572, -0.6183,  0.3449
 0.6995,  0.3342, -0.3113, -0.6972,  0.2707,  0.1712
 0.2565,  0.9126,  0.1798, -0.6043, -0.1413,  0.2893
-0.3265,  0.9839, -0.2395,  0.9854,  0.0376,  0.4770
 0.2690, -0.1722,  0.9818,  0.8599, -0.7015,  0.3954
-0.2102, -0.0768,  0.1219,  0.5607, -0.0256,  0.3949
 0.8216, -0.9555,  0.6422, -0.6231,  0.3715,  0.0801
-0.2896,  0.9484, -0.7545, -0.6249,  0.7789,  0.4370
-0.9985, -0.5448, -0.7092, -0.5931,  0.7926,  0.5402

Test data:

# synthetic_test_40.txt
#
 0.7462,  0.4006, -0.0590,  0.6543, -0.0083,  0.1935
 0.8495, -0.2260, -0.0142, -0.4911,  0.7699,  0.1078
-0.2335, -0.4049,  0.4352, -0.6183, -0.7636,  0.5088
 0.1810, -0.5142,  0.2465,  0.2767, -0.3449,  0.3136
-0.8650,  0.7611, -0.0801,  0.5277, -0.4922,  0.7140
-0.2358, -0.7466, -0.5115, -0.8413, -0.3943,  0.4533
 0.4834,  0.2300,  0.3448, -0.9832,  0.3568,  0.1360
-0.6502, -0.6300,  0.6885,  0.9652,  0.8275,  0.3046
-0.3053,  0.5604,  0.0929,  0.6329, -0.0325,  0.4756
-0.7995,  0.0740, -0.2680,  0.2086,  0.9176,  0.4565
-0.2144, -0.2141,  0.5813,  0.2902, -0.2122,  0.4119
-0.7278, -0.0987, -0.3312, -0.5641,  0.8515,  0.4438
 0.3793,  0.1976,  0.4933,  0.0839,  0.4011,  0.1905
-0.8568,  0.9573, -0.5272,  0.3212, -0.8207,  0.7415
-0.5785,  0.0056, -0.7901, -0.2223,  0.0760,  0.5551
 0.0735, -0.2188,  0.3925,  0.3570,  0.3746,  0.2191
 0.1230, -0.2838,  0.2262,  0.8715,  0.1938,  0.2878
 0.4792, -0.9248,  0.5295,  0.0366, -0.9894,  0.3149
-0.4456,  0.0697,  0.5359, -0.8938,  0.0981,  0.3879
 0.8629, -0.8505, -0.4464,  0.8385,  0.5300,  0.1769
 0.1995,  0.6659,  0.7921,  0.9454,  0.9970,  0.2330
-0.0249, -0.3066, -0.2927, -0.4923,  0.8220,  0.2437
 0.4513, -0.9481, -0.0770, -0.4374, -0.9421,  0.2879
-0.3405,  0.5931, -0.3507, -0.3842,  0.8562,  0.3987
 0.9538,  0.0471,  0.9039,  0.7760,  0.0361,  0.1706
-0.0887,  0.2104,  0.9808,  0.5478, -0.3314,  0.4128
-0.8220, -0.6302,  0.0537, -0.1658,  0.6013,  0.4306
-0.4123, -0.2880,  0.9074, -0.0461, -0.4435,  0.5144
 0.0060,  0.2867, -0.7775,  0.5161,  0.7039,  0.3599
-0.7968, -0.5484,  0.9426, -0.4308,  0.8148,  0.2979
 0.7811,  0.8450, -0.6877,  0.7594,  0.2640,  0.2362
-0.6802, -0.1113, -0.8325, -0.6694, -0.6056,  0.6544
 0.3821,  0.1476,  0.7466, -0.5107,  0.2592,  0.1648
 0.7265,  0.9683, -0.9803, -0.4943, -0.5523,  0.2454
-0.9049, -0.9797, -0.0196, -0.9090, -0.4433,  0.6447
-0.4607,  0.1811, -0.2389,  0.4050, -0.0078,  0.5229
 0.2664, -0.2932, -0.4259, -0.7336,  0.8742,  0.1834
-0.4507,  0.1029, -0.6294, -0.1158, -0.6294,  0.6081
 0.8948, -0.0124,  0.9278,  0.2899, -0.0314,  0.1534
-0.1323, -0.8813, -0.0146, -0.0697,  0.6135,  0.2386