Bagging Tree Regression from Scratch Using Python

Naive decision tree regression prediction models usually overfit the training data. The model is accurate on the training data, but has poor accuracy and MSE on new, previously unseen data.

One of several ways to deal with decision tree overfitting is to create a collection of trees, train each tree on a subset of the full training data. Then to predict, compute the average of the predictions of all the trees. Simple.

There are two closely related tree ensemble techniques: bagging (“bootstrap aggregation”) tree regression, and random forest regression. In bagging tree regression, training looks like:

create an empty collection of decision trees
loop each number of trees times
  create a rows-subset of training data
  train curr tree using subset
  add trained tree to collection of trees
end loop

The rows-subset can have the same number of rows as the source training data, or fewer rows. When rows are randomly selected, the selection is done “with replacement” so a specific row might be included in the subset more than once, and some rows might not be included at all. The training data subset uses all columns.

In random forest regression, during training, when each split is computed, a different columns-subset of the rows-subset of the training data is used:

create an empty collection of decision trees
loop each number of trees times
  create a rows-subset of training data
  train curr tree using rows-subset, but
    with random columns used for each training split
  add trained tree to collection of trees
end loop

In other words, bagging and random forest are the same except that random forest uses only some of the columns of training data when finding split values, while bagging uses all the columns. Put another way, bagging regression is a type of random forest regression.

It turns out that bagging tree regression was invented first (Breiman, 1996) but then he realized the random forest was better (2001) but it was too late to drop the now-redundant “bagging tree” terminology.

Here’s my demo code method that trains a bagging tree regressor:

  def fit(self, X, y):
    for i in range(self.n_trees):
      curr_tree = \
        MyDecisionTreeRegressor(max_depth=self.max_depth,
          min_samples=self.min_samples, seed=0)

      # create random rows-subset of training data
      rnd_rows = self.rnd.choice(self.n_rows, 
        size=(self.n_rows), replace=True)
      subset_X = X[rnd_rows,:]
      subset_y = y[rnd_rows]

      # train tree on subset and add to list
      curr_tree.fit(subset_X, subset_y)
      self.trees.append(curr_tree)

For my demo, I used a set of synthetic data that I generated using a neural network with random weights and biases. The data looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065, 0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562, 0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192, 0.8054
. . .

The first five values on each line are the predictors. The sixth value is the target to predict. All predictor values are between -1.0 and 1.0. Normalizing the predictor values is not necessary but is helpful when using the data with other regression techniques that require normalization (such as k-nearest neighbors regression). There are 200 items in the training data and 40 items in the test data.

The output of the from-scratch bagging tree regression demo program is:

Begin bagging tree regression scratch Python

Loading synthetic train (200), test (40) data
Done

First three X predictors:
[[-0.1660  0.4406 -0.9998 -0.3953 -0.7065]
 [ 0.0776 -0.1616  0.3704 -0.5911  0.7562]
 [-0.9452  0.3409 -0.1654  0.1174 -0.7192]]

First three y targets:
0.4840
0.1568
0.8054

Setting num_trees = 10
Setting num_rows = 200
Setting max_depth = 6
Setting min_samples = 2

Creating and training bagging tree model
Done

Accuracy train (within 0.10): 0.7950
Accuracy test (within 0.10): 0.5250

MSE train: 0.0005
MSE test: 0.0027

End bagging tree regression scratch Python demo

====================

Using scikit BaggingRegressor:

Setting n_estimators = 10

Creating and training scikit bagging tree model
Done

Accuracy train (within 0.10): 0.7850
Accuracy test (within 0.10): 0.6250

MSE train: 0.0005
MSE test: 0.0013

I validated my demo by running the data through the scikit-learn Python language library BaggingRegressor module. The results were essentially the same. But for both systems, the prediction model was overfitted. The effectiveness of bagging tree regression depends on the data — sometimes bagging tree works well, sometimes not.

My demo works correctly, but there are several things in the MyDecisionTreeRegressor class that I’m not at all happy with. I’ll revamp this code when I get the chance. . .

Good fun.

I’ve never really liked the term “bagging” as a shortcut for “bootstrap aggregation”. When I think of bagging, I usually think about the time-honored tradition of sports fans who are dissatisfied with their team. Left: This Detroit Lions fan has a nice knitted cap accessory to add some style to his bag. Center: This Cleveland Brows fan appears to be doubly-disappointed. Right: This New York Jets fan did not do a very good job with his bag head design.

Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols. (My blog editor chokes on symbols).

# bagging_tree_regression_scratch.py

# each tree is trained on a subset of the data, with some
# rows possibly duplicated, and some rows possibly not used.
# subset uses all columns.

import numpy as np

# ===========================================================

class BaggingTreeRegressor:
  def __init__(self, n_trees, n_rows, max_depth=3,
    min_samples=2, seed=1):
    self.n_trees = n_trees
    self.n_rows = n_rows
    self.max_depth = max_depth
    self.min_samples = min_samples
    self.trees = []
    self.rnd = np.random.RandomState(seed)

  def fit(self, X, y):
    for i in range(self.n_trees):
      curr_tree = \
        MyDecisionTreeRegressor(max_depth=self.max_depth,
          min_samples=self.min_samples, seed=0)

      # create random rows-subset of training data
      rnd_rows = self.rnd.choice(self.n_rows, 
        size=(self.n_rows), replace=True)
      subset_X = X[rnd_rows,:]
      subset_y = y[rnd_rows]

      # train tree on subset and add to list
      curr_tree.fit(subset_X, subset_y)
      self.trees.append(curr_tree)

  def predict_one(self, x):
    sum = 0.0
    for i in range(self.n_trees):
      pred_y = self.trees[i].predict_one(x)
      sum += pred_y
    return sum / self.n_trees

  def predict(self, X):
    result = np.zeros(len(X), dtype=np.float64)
    for i in range(len(X)):
      result[i] = self.predict_one(X[i])
    return result    

# ===========================================================

# ===========================================================

class MyDecisionTreeRegressor:  # avoid scikit name collision
  # if max_depth = n, tree has at most 2^(n+1) - 1 nodes.

  def __init__(self, max_depth=3, min_samples=2,
    n_split_cols=-1, seed=0):
    self.max_depth = max_depth
    self.min_samples = min_samples # aka min_samples_split
    self.n_split_cols = n_split_cols  # mostly random forest
    self.root = None
    self.rnd = np.random.RandomState(seed) # split col order

  # ===============================================

  class Node:
    def __init__(self, id=0, col_idx=-1, thresh=0.0,
        left=None, right=None, value=0.0, is_leaf=False):
      self.id = id  # useful for debugging
      self.col_idx = col_idx
      self.thresh = thresh
      self.left = left
      self.right = right
      self.value = value
      self.is_leaf = is_leaf  # False for an in-node

  # ===============================================

  def best_split(self, X, y):
    best_col_idx = -1  # indicates a bad split
    best_thresh = 0.0
    best_mse = np.inf  # smaller is better
    n_rows, n_cols = X.shape

    rnd_cols = np.arange(n_cols)
    self.rnd.shuffle(rnd_cols)
    if self.n_split_cols != -1:  # just use some cols
      rnd_cols = rnd_cols[0:self.n_split_cols]

    for j in range(len(rnd_cols)):
      col_idx = rnd_cols[j]
      examined_threshs = set()
      for i in range(n_rows):
        thresh = X[i][col_idx]  # candidate threshold value

        if thresh in examined_threshs == True:
          continue
        examined_threshs.add(thresh)

        # get rows where x is lte, gt thresh
        left_idxs = np.where(X[:,col_idx] "lte" thresh)[0]
        right_idxs = np.where(X[:,col_idx] "gt" thresh)[0]

        # check proposed split
        if len(left_idxs) == 0 or \
          len(right_idxs) == 0:
          continue

        # get left and right y values
        left_y_vals = y[left_idxs]  # not empty
        right_y_vals = y[right_idxs]  # not empty

        # compute proposed split MSE
        mse_left = self.vector_mse(left_y_vals)
        mse_right = self.vector_mse(right_y_vals)
        split_mse = (len(left_y_vals) * mse_left + \
          len(right_y_vals) * mse_right) / n_rows

        if split_mse "lt" best_mse:
          best_col_idx = col_idx
          best_thresh = thresh
          best_mse = split_mse          

    return best_col_idx, best_thresh  # -1 is bad/no split

  # ---------------------------------------------------------

  def vector_mse(self, y):  # variance but called MSE
    if len(y) == 0:
      return 0.0  # should never get here
    # return np.mean((y - np.mean(y)) ** 2)
    return np.var(y)

  # ---------------------------------------------------------

  def make_tree(self, X, y):
    root = self.Node()  # is_leaf is False
    stack = [(root, X, y, 0)]  # curr depth = 0

    while (len(stack) "gt" 0):
      curr_node, curr_X, curr_y, curr_depth = stack.pop()

      if curr_depth == self.max_depth or \
        len(curr_y) "lt" self.min_samples:
        curr_node.value = np.mean(curr_y)
        curr_node.is_leaf = True
        continue

      col_idx, thresh = self.best_split(curr_X, curr_y) 

      if col_idx == -1:  # cannot split
        curr_node.value = np.mean(curr_y)
        curr_node.is_leaf = True
        continue
      
      # got a good split so at an internal, non-leaf node
      curr_node.col_idx = col_idx
      curr_node.thresh = thresh

      # create and attach child nodes
      left_idxs = np.where(curr_X[:,col_idx] "lte" thresh)[0]
      right_idxs = np.where(curr_X[:,col_idx] "gt" thresh)[0]

      left_X = curr_X[left_idxs,:]
      left_y = curr_y[left_idxs]
      right_X = curr_X[right_idxs,:]
      right_y = curr_y[right_idxs]

      curr_node.left = self.Node(id=2*curr_node.id+1)
      stack.append((curr_node.left,
        left_X, left_y, curr_depth+1))

      curr_node.right = self.Node(id=2*curr_node.id+2)
      stack.append((curr_node.right,
        right_X, right_y, curr_depth+1))
      
    return root

  # ---------------------------------------------------------      

  def fit(self, X, y):
    self.root = self.make_tree(X, y)

  # ---------------------------------------------------------

  def predict_one(self, x):
    curr = self.root
    while curr.is_leaf == False:
      if x[curr.col_idx] "lte" curr.thresh:
        curr = curr.left
      else:
        curr = curr.right
    return curr.value

  def predict(self, X):  # scikit always uses a matrix input
    result = np.zeros(len(X), dtype=np.float64)
    for i in range(len(X)):
      result[i] = self.predict_one(X[i])
    return result

  # ---------------------------------------------------------

# ===========================================================



# ===========================================================

# -----------------------------------------------------------

def accuracy(model, data_X, data_y, pct_close):
  # assumes model has a predict(X)
  n = len(data_X)
  n_correct = 0; n_wrong = 0
  for i in range(n):
    x = data_X[i].reshape(1,-1)  # make it a matrix
    y = data_y[i]
    y_pred = model.predict(x)  # predict() expects 2D

    if np.abs(y - y_pred) "lt" np.abs(y * pct_close):
      n_correct += 1
    else: 
      n_wrong += 1
  # print("Correct = " + str(n_correct))
  # print("Wrong   = " + str(n_wrong))
  return n_correct / (n_correct + n_wrong)

# -----------------------------------------------------------

def MSE(model, data_X, data_y):
  n = len(data_X)
  sum = 0.0
  for i in range(n):
    x = data_X[i].reshape(1,-1)
    y = data_y[i]
    y_pred = model.predict(x)
    sum += (y - y_pred) * (y - y_pred)

  return sum / n

# -----------------------------------------------------------

def main():
  print("\nBegin bagging tree regression scratch Python ")

  np.set_printoptions(precision=4, suppress=True,
    floatmode='fixed')
  np.random.seed(0)  # not used this version

  # 1. load data
  print("\nLoading synthetic train (200), test (40) data ")
  train_file = ".\\Data\\synthetic_train_200.txt"
  # -0.1660,0.4406,-0.9998,-0.3953,-0.7065,0.4840
  #  0.0776,-0.1616,0.3704,-0.5911,0.7562,0.1568
  # -0.9452,0.3409,-0.1654,0.1174,-0.7192,0.8054
  # . . .

  train_X = np.loadtxt(train_file, comments="#",
    usecols=[0,1,2,3,4],
    delimiter=",",  dtype=np.float64)
  train_y = np.loadtxt(train_file, comments="#", usecols=5,
    delimiter=",",  dtype=np.float64)

  test_file = ".\\Data\\synthetic_test_40.txt"
  test_X = np.loadtxt(test_file, comments="#",
    usecols=[0,1,2,3,4],
    delimiter=",",  dtype=np.float64)
  test_y = np.loadtxt(test_file, comments="#", usecols=5,
    delimiter=",",  dtype=np.float64)
  print("Done ")

  print("\nFirst three X predictors: ")
  print(train_X[0:3,:])
  print("\nFirst three y targets: ")
  for i in range(3):
    print("%0.4f" % train_y[i])

  # 2. create and train model
  nt = 10  # number trees
  nr = 200  # number rows
  md = 6  # max_depth
  ms = 2  # min_samples to consider a split

  print("\nSetting num_trees = " + str(nt))
  print("Setting num_rows = " + str(nr))
  print("Setting max_depth = " + str(md))
  print("Setting min_samples = " + str(ms))

  print("\nCreating and training bagging tree model ")
  model = BaggingTreeRegressor(n_trees=nt, n_rows=nr,
    max_depth=md, min_samples=ms, seed=0)
  model.fit(train_X, train_y)
  print("Done ")

  # 3. evaluate model
  acc_train = accuracy(model, train_X, train_y, 0.10)
  print("\nAccuracy train (within 0.10): %0.4f " % acc_train)
  acc_test = accuracy(model, test_X, test_y, 0.10)
  print("Accuracy test (within 0.10): %0.4f " % acc_test)

  mse_train = MSE(model, train_X, train_y)
  print("\nMSE train: %0.4f " % mse_train)
  mse_test = MSE(model, test_X, test_y)
  print("MSE test: %0.4f " % mse_test)

  # 4. use model
  x = train_X[0].reshape(1,-1)
  print("\nPredicting for: ")
  print(x)
  y_pred = model.predict(x)
  print("Predicted y = %0.4f " % y_pred)

  print("\nEnd bagging tree regression scratch Python demo ")

  print("\n==================== ")

  print("\nUsing scikit BaggingRegressor: ")

  from sklearn.ensemble import BaggingRegressor
  from sklearn.tree import DecisionTreeRegressor

  n_estimators = 10
  print("\nSetting n_estimators = " + str(n_estimators))
  print("\nCreating and training scikit bagging tree model ")
  bt = \
    BaggingRegressor(estimator=\
      DecisionTreeRegressor(max_depth=md,
      min_samples_split=ms),
      n_estimators=10, random_state=0)
  bt.fit(train_X, train_y)
  print("Done ")

  acc_train = accuracy(bt, train_X, train_y, 0.10)
  print("\nAccuracy train (within 0.10): %0.4f " % acc_train)
  acc_test = accuracy(bt, test_X, test_y, 0.10)
  print("Accuracy test (within 0.10): %0.4f " % acc_test)

  mse_train = MSE(bt, train_X, train_y)
  print("\nMSE train: %0.4f " % mse_train)
  mse_test = MSE(bt, test_X, test_y)
  print("MSE test: %0.4f " % mse_test)

  x = train_X[0].reshape(1,-1)
  print("\nPredicting for: ")
  print(x)
  y_pred = bt.predict(x)
  print("Predicted y = %0.4f " % y_pred)

if __name__ == "__main__":
  main()

Training data:

# synthetic_train_200.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399
-0.4858, -0.7782, -0.6141, -0.0008,  0.4572,  0.4197
 0.7033, -0.1683,  0.2334, -0.5327, -0.7961,  0.1776
 0.0317, -0.0457, -0.6947,  0.2436,  0.0880,  0.3345
 0.5031, -0.5559,  0.0387,  0.5706, -0.9553,  0.3107
-0.3513,  0.7458,  0.6894,  0.0769,  0.7332,  0.3170
 0.2205,  0.5992, -0.9309,  0.5405,  0.4635,  0.3532
-0.4806, -0.4859,  0.2646, -0.3094,  0.5932,  0.3202
 0.9809, -0.3995, -0.7140,  0.8026,  0.0831,  0.1600
 0.9495,  0.2732,  0.9878,  0.0921,  0.0529,  0.1289
-0.9476, -0.6792,  0.4913, -0.9392, -0.2669,  0.5966
 0.7247,  0.3854,  0.3819, -0.6227, -0.1162,  0.1550
-0.5922, -0.5045, -0.4757,  0.5003, -0.0860,  0.5863
-0.8861,  0.0170, -0.5761,  0.5972, -0.4053,  0.7301
 0.6877, -0.2380,  0.4997,  0.0223,  0.0819,  0.1404
 0.9189,  0.6079, -0.9354,  0.4188, -0.0700,  0.1907
-0.1428, -0.7820,  0.2676,  0.6059,  0.3936,  0.2790
 0.5324, -0.3151,  0.6917, -0.1425,  0.6480,  0.1071
-0.8432, -0.9633, -0.8666, -0.0828, -0.7733,  0.7784
-0.9444,  0.5097, -0.2103,  0.4939, -0.0952,  0.6787
-0.0520,  0.6063, -0.1952,  0.8094, -0.9259,  0.4836
 0.5477, -0.7487,  0.2370, -0.9793,  0.0773,  0.1241
 0.2450,  0.8116,  0.9799,  0.4222,  0.4636,  0.2355
 0.8186, -0.1983, -0.5003, -0.6531, -0.7611,  0.1511
-0.4714,  0.6382, -0.3788,  0.9648, -0.4667,  0.5950
 0.0673, -0.3711,  0.8215, -0.2669, -0.1328,  0.2677
-0.9381,  0.4338,  0.7820, -0.9454,  0.0441,  0.5518
-0.3480,  0.7190,  0.1170,  0.3805, -0.0943,  0.4724
-0.9813,  0.1535, -0.3771,  0.0345,  0.8328,  0.5438
-0.1471, -0.5052, -0.2574,  0.8637,  0.8737,  0.3042
-0.5454, -0.3712, -0.6505,  0.2142, -0.1728,  0.5783
 0.6327, -0.6297,  0.4038, -0.5193,  0.1484,  0.1153
-0.5424,  0.3282, -0.0055,  0.0380, -0.6506,  0.6613
 0.1414,  0.9935,  0.6337,  0.1887,  0.9520,  0.2540
-0.9351, -0.8128, -0.8693, -0.0965, -0.2491,  0.7353
 0.9507, -0.6640,  0.9456,  0.5349,  0.6485,  0.1059
-0.0462, -0.9737, -0.2940, -0.0159,  0.4602,  0.2606
-0.0627, -0.0852, -0.7247, -0.9782,  0.5166,  0.2977
 0.0478,  0.5098, -0.0723, -0.7504, -0.3750,  0.3335
 0.0090,  0.3477,  0.5403, -0.7393, -0.9542,  0.4415
-0.9748,  0.3449,  0.3736, -0.1015,  0.8296,  0.4358
 0.2887, -0.9895, -0.0311,  0.7186,  0.6608,  0.2057
 0.1570, -0.4518,  0.1211,  0.3435, -0.2951,  0.3244
 0.7117, -0.6099,  0.4946, -0.4208,  0.5476,  0.1096
-0.2929, -0.5726,  0.5346, -0.3827,  0.4665,  0.2465
 0.4889, -0.5572, -0.5718, -0.6021, -0.7150,  0.2163
-0.7782,  0.3491,  0.5996, -0.8389, -0.5366,  0.6516
-0.5847,  0.8347,  0.4226,  0.1078, -0.3910,  0.6134
 0.8469,  0.4121, -0.0439, -0.7476,  0.9521,  0.1571
-0.6803, -0.5948, -0.1376, -0.1916, -0.7065,  0.7156
 0.2878,  0.5086, -0.5785,  0.2019,  0.4979,  0.2980
 0.2764,  0.1943, -0.4090,  0.4632,  0.8906,  0.2960
-0.8877,  0.6705, -0.6155, -0.2098, -0.3998,  0.7107
-0.8398,  0.8093, -0.2597,  0.0614, -0.0118,  0.6502
-0.8476,  0.0158, -0.4769, -0.2859, -0.7839,  0.7715
 0.5751, -0.7868,  0.9714, -0.6457,  0.1448,  0.1175
 0.4802, -0.7001,  0.1022, -0.5668,  0.5184,  0.1090
 0.4458, -0.6469,  0.7239, -0.9604,  0.7205,  0.0779
 0.5175,  0.4339,  0.9747, -0.4438, -0.9924,  0.2879
 0.8678,  0.7158,  0.4577,  0.0334,  0.4139,  0.1678
 0.5406,  0.5012,  0.2264, -0.1963,  0.3946,  0.2088
-0.9938,  0.5498,  0.7928, -0.5214, -0.7585,  0.7687
 0.7661,  0.0863, -0.4266, -0.7233, -0.4197,  0.1466
 0.2277, -0.3517, -0.0853, -0.1118,  0.6563,  0.1767
 0.3499, -0.5570, -0.0655, -0.3705,  0.2537,  0.1632
 0.7547, -0.1046,  0.5689, -0.0861,  0.3125,  0.1257
 0.8186,  0.2110,  0.5335,  0.0094, -0.0039,  0.1391
 0.6858, -0.8644,  0.1465,  0.8855,  0.0357,  0.1845
-0.4967,  0.4015,  0.0805,  0.8977,  0.2487,  0.4663
 0.6760, -0.9841,  0.9787, -0.8446, -0.3557,  0.1509
-0.1203, -0.4885,  0.6054, -0.0443, -0.7313,  0.4854
 0.8557,  0.7919, -0.0169,  0.7134, -0.1628,  0.2002
 0.0115, -0.6209,  0.9300, -0.4116, -0.7931,  0.4052
-0.7114, -0.9718,  0.4319,  0.1290,  0.5892,  0.3661
 0.3915,  0.5557, -0.1870,  0.2955, -0.6404,  0.2954
-0.3564, -0.6548, -0.1827, -0.5172, -0.1862,  0.4622
 0.2392, -0.4959,  0.5857, -0.1341, -0.2850,  0.2470
-0.3394,  0.3947, -0.4627,  0.6166, -0.4094,  0.5325
 0.7107,  0.7768, -0.6312,  0.1707,  0.7964,  0.2757
-0.1078,  0.8437, -0.4420,  0.2177,  0.3649,  0.4028
-0.3139,  0.5595, -0.6505, -0.3161, -0.7108,  0.5546
 0.4335,  0.3986,  0.3770, -0.4932,  0.3847,  0.1810
-0.2562, -0.2894, -0.8847,  0.2633,  0.4146,  0.4036
 0.2272,  0.2966, -0.6601, -0.7011,  0.0284,  0.2778
-0.0743, -0.1421, -0.0054, -0.6770, -0.3151,  0.3597
-0.4762,  0.6891,  0.6007, -0.1467,  0.2140,  0.4266
-0.4061,  0.7193,  0.3432,  0.2669, -0.7505,  0.6147
-0.0588,  0.9731,  0.8966,  0.2902, -0.6966,  0.4955
-0.0627, -0.1439,  0.1985,  0.6999,  0.5022,  0.3077
 0.1587,  0.8494, -0.8705,  0.9827, -0.8940,  0.4263
-0.7850,  0.2473, -0.9040, -0.4308, -0.8779,  0.7199
 0.4070,  0.3369, -0.2428, -0.6236,  0.4940,  0.2215
-0.0242,  0.0513, -0.9430,  0.2885, -0.2987,  0.3947
-0.5416, -0.1322, -0.2351, -0.0604,  0.9590,  0.3683
 0.1055,  0.7783, -0.2901, -0.5090,  0.8220,  0.2984
-0.9129,  0.9015,  0.1128, -0.2473,  0.9901,  0.4776
-0.9378,  0.1424, -0.6391,  0.2619,  0.9618,  0.5368
 0.7498, -0.0963,  0.4169,  0.5549, -0.0103,  0.1614
-0.2612, -0.7156,  0.4538, -0.0460, -0.1022,  0.3717
 0.7720,  0.0552, -0.1818, -0.4622, -0.8560,  0.1685
-0.4177,  0.0070,  0.9319, -0.7812,  0.3461,  0.3052
-0.0001,  0.5542, -0.7128, -0.8336, -0.2016,  0.3803
 0.5356, -0.4194, -0.5662, -0.9666, -0.2027,  0.1776
-0.2378,  0.3187, -0.8582, -0.6948, -0.9668,  0.5474
-0.1947, -0.3579,  0.1158,  0.9869,  0.6690,  0.2992
 0.3992,  0.8365, -0.9205, -0.8593, -0.0520,  0.3154
-0.0209,  0.0793,  0.7905, -0.1067,  0.7541,  0.1864
-0.4928, -0.4524, -0.3433,  0.0951, -0.5597,  0.6261
-0.8118,  0.7404, -0.5263, -0.2280,  0.1431,  0.6349
 0.0516, -0.8480,  0.7483,  0.9023,  0.6250,  0.1959
-0.3212,  0.1093,  0.9488, -0.3766,  0.3376,  0.2735
-0.3481,  0.5490, -0.3484,  0.7797,  0.5034,  0.4379
-0.5785, -0.9170, -0.3563, -0.9258,  0.3877,  0.4121
 0.3407, -0.1391,  0.5356,  0.0720, -0.9203,  0.3458
-0.3287, -0.8954,  0.2102,  0.0241,  0.2349,  0.3247
-0.1353,  0.6954, -0.0919, -0.9692,  0.7461,  0.3338
 0.9036, -0.8982, -0.5299, -0.8733, -0.1567,  0.1187
 0.7277, -0.8368, -0.0538, -0.7489,  0.5458,  0.0830
 0.9049,  0.8878,  0.2279,  0.9470, -0.3103,  0.2194
 0.7957, -0.1308, -0.5284,  0.8817,  0.3684,  0.2172
 0.4647, -0.4931,  0.2010,  0.6292, -0.8918,  0.3371
-0.7390,  0.6849,  0.2367,  0.0626, -0.5034,  0.7039
-0.1567, -0.8711,  0.7940, -0.5932,  0.6525,  0.1710
 0.7635, -0.0265,  0.1969,  0.0545,  0.2496,  0.1445
 0.7675,  0.1354, -0.7698, -0.5460,  0.1920,  0.1728
-0.5211, -0.7372, -0.6763,  0.6897,  0.2044,  0.5217
 0.1913,  0.1980,  0.2314, -0.8816,  0.5006,  0.1998
 0.8964,  0.0694, -0.6149,  0.5059, -0.9854,  0.1825
 0.1767,  0.7104,  0.2093,  0.6452,  0.7590,  0.2832
-0.3580, -0.7541,  0.4426, -0.1193, -0.7465,  0.5657
-0.5996,  0.5766, -0.9758, -0.3933, -0.9572,  0.6800
 0.9950,  0.1641, -0.4132,  0.8579,  0.0142,  0.2003
-0.4717, -0.3894, -0.2567, -0.5111,  0.1691,  0.4266
 0.3917, -0.8561,  0.9422,  0.5061,  0.6123,  0.1212
-0.0366, -0.1087,  0.3449, -0.1025,  0.4086,  0.2475
 0.3633,  0.3943,  0.2372, -0.6980,  0.5216,  0.1925
-0.5325, -0.6466, -0.2178, -0.3589,  0.6310,  0.3568
 0.2271,  0.5200, -0.1447, -0.8011, -0.7699,  0.3128
 0.6415,  0.1993,  0.3777, -0.0178, -0.8237,  0.2181
-0.5298, -0.0768, -0.6028, -0.9490,  0.4588,  0.4356
 0.6870, -0.1431,  0.7294,  0.3141,  0.1621,  0.1632
-0.5985,  0.0591,  0.7889, -0.3900,  0.7419,  0.2945
 0.3661,  0.7984, -0.8486,  0.7572, -0.6183,  0.3449
 0.6995,  0.3342, -0.3113, -0.6972,  0.2707,  0.1712
 0.2565,  0.9126,  0.1798, -0.6043, -0.1413,  0.2893
-0.3265,  0.9839, -0.2395,  0.9854,  0.0376,  0.4770
 0.2690, -0.1722,  0.9818,  0.8599, -0.7015,  0.3954
-0.2102, -0.0768,  0.1219,  0.5607, -0.0256,  0.3949
 0.8216, -0.9555,  0.6422, -0.6231,  0.3715,  0.0801
-0.2896,  0.9484, -0.7545, -0.6249,  0.7789,  0.4370
-0.9985, -0.5448, -0.7092, -0.5931,  0.7926,  0.5402

Test data:

# synthetic_test_40.txt
#
 0.7462,  0.4006, -0.0590,  0.6543, -0.0083,  0.1935
 0.8495, -0.2260, -0.0142, -0.4911,  0.7699,  0.1078
-0.2335, -0.4049,  0.4352, -0.6183, -0.7636,  0.5088
 0.1810, -0.5142,  0.2465,  0.2767, -0.3449,  0.3136
-0.8650,  0.7611, -0.0801,  0.5277, -0.4922,  0.7140
-0.2358, -0.7466, -0.5115, -0.8413, -0.3943,  0.4533
 0.4834,  0.2300,  0.3448, -0.9832,  0.3568,  0.1360
-0.6502, -0.6300,  0.6885,  0.9652,  0.8275,  0.3046
-0.3053,  0.5604,  0.0929,  0.6329, -0.0325,  0.4756
-0.7995,  0.0740, -0.2680,  0.2086,  0.9176,  0.4565
-0.2144, -0.2141,  0.5813,  0.2902, -0.2122,  0.4119
-0.7278, -0.0987, -0.3312, -0.5641,  0.8515,  0.4438
 0.3793,  0.1976,  0.4933,  0.0839,  0.4011,  0.1905
-0.8568,  0.9573, -0.5272,  0.3212, -0.8207,  0.7415
-0.5785,  0.0056, -0.7901, -0.2223,  0.0760,  0.5551
 0.0735, -0.2188,  0.3925,  0.3570,  0.3746,  0.2191
 0.1230, -0.2838,  0.2262,  0.8715,  0.1938,  0.2878
 0.4792, -0.9248,  0.5295,  0.0366, -0.9894,  0.3149
-0.4456,  0.0697,  0.5359, -0.8938,  0.0981,  0.3879
 0.8629, -0.8505, -0.4464,  0.8385,  0.5300,  0.1769
 0.1995,  0.6659,  0.7921,  0.9454,  0.9970,  0.2330
-0.0249, -0.3066, -0.2927, -0.4923,  0.8220,  0.2437
 0.4513, -0.9481, -0.0770, -0.4374, -0.9421,  0.2879
-0.3405,  0.5931, -0.3507, -0.3842,  0.8562,  0.3987
 0.9538,  0.0471,  0.9039,  0.7760,  0.0361,  0.1706
-0.0887,  0.2104,  0.9808,  0.5478, -0.3314,  0.4128
-0.8220, -0.6302,  0.0537, -0.1658,  0.6013,  0.4306
-0.4123, -0.2880,  0.9074, -0.0461, -0.4435,  0.5144
 0.0060,  0.2867, -0.7775,  0.5161,  0.7039,  0.3599
-0.7968, -0.5484,  0.9426, -0.4308,  0.8148,  0.2979
 0.7811,  0.8450, -0.6877,  0.7594,  0.2640,  0.2362
-0.6802, -0.1113, -0.8325, -0.6694, -0.6056,  0.6544
 0.3821,  0.1476,  0.7466, -0.5107,  0.2592,  0.1648
 0.7265,  0.9683, -0.9803, -0.4943, -0.5523,  0.2454
-0.9049, -0.9797, -0.0196, -0.9090, -0.4433,  0.6447
-0.4607,  0.1811, -0.2389,  0.4050, -0.0078,  0.5229
 0.2664, -0.2932, -0.4259, -0.7336,  0.8742,  0.1834
-0.4507,  0.1029, -0.6294, -0.1158, -0.6294,  0.6081
 0.8948, -0.0124,  0.9278,  0.2899, -0.0314,  0.1534
-0.1323, -0.8813, -0.0146, -0.0697,  0.6135,  0.2386