Random Forest Regression from Scratch Using Python

Naive decision tree regression models usually overfit the training data. The model is accurate on the training data, but has poor accuracy and high MSE (mean squared error) on new, previously unseen data.

One of several ways to deal with decision tree overfitting is to create a collection of trees, where each tree is trained on a different subset of the full training data. Then, to predict, compute the average of the predictions of all the trees. Simple.
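
For example, prediction with a trained collection of trees is just the mean of the individual tree predictions. A minimal sketch, assuming each tree object exposes a predict_one(x) method like the one in the demo code presented below:

def ensemble_predict_one(trees, x):
  # average the predictions of all the trees in the collection
  total = 0.0
  for tree in trees:
    total += tree.predict_one(x)
  return total / len(trees)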

There are two closely related simple tree ensemble techniques: random forest regression and bagging ("bootstrap aggregation") tree regression. (More sophisticated ensemble techniques are adaptive boosting and gradient boosting.) In pseudo-code, the training process for both is:

create an empty collection of decision trees
loop n_trees times
  create a rows-subset of training data
  train curr tree using rows-subset, but
    with random columns used for each training split
  add trained tree to collection of trees
end loop

The rows-subset can have the same number of rows as the source training data, or fewer rows. When rows are randomly selected, the selection is done "with replacement," so a specific row might be included in the subset more than once, and some rows might not be included at all. The rows-subset uses all of the columns. But in random forest regression, during training, each time a split is computed, a different randomly selected columns-subset of the rows-subset is used.
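
A minimal sketch of sampling rows "with replacement" using the NumPy choice() function. Because of the replacement, a bootstrap sample of size n contains, on average, only about 63 percent of the distinct source rows:

import numpy as np

rnd = np.random.RandomState(0)
n_rows = 200
# sample 200 row indices with replacement: duplicates are allowed
# and some of the 200 source rows are not selected at all
rnd_rows = rnd.choice(n_rows, size=n_rows, replace=True)
print(len(np.unique(rnd_rows)))  # typically about 126 distinct rows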

In bagging tree regression, everything is the same except there is no selection of random columns during splits — all columns are used.
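
Using the demo classes presented below, bagging falls out of the same code by simply not restricting the columns. Passing n_split_cols = -1 is interpreted as "use all columns during each split":

# bagging tree regression = random forest with no column restriction
bagging_model = MyRandomForestRegressor(n_trees=10, n_rows=200,
  n_split_cols=-1, max_depth=6, min_samples=2, seed=0)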



When I write code, I often sketch out the data structures involved using paper and pencil. I grew up just after World War II, in a time before ball point pens and copy machines, and so using paper and pencil puts me in my intellectual comfort zone.


If you have a decision tree implementation, then implementing random forest regression from scratch is relatively easy. You do have to modify the decision tree regressor code to select random columns during the split process, but that change is straightforward. If you don't have the source code for a decision tree regressor, I don't think implementing a random forest regressor from scratch is feasible.
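
Concretely, the only modification to an ordinary decision tree regressor is at the top of the best_split() method, where the candidate column indices are shuffled and then truncated to n_split_cols. This excerpt is from the demo code below:

    rnd_cols = np.arange(n_cols)   # all candidate column indices
    self.rnd.shuffle(rnd_cols)     # put them in random order
    if self.n_split_cols != -1:    # just use some cols
      rnd_cols = rnd_cols[0:self.n_split_cols]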

Here's my demo code method that trains a random forest regressor:

  def fit(self, X, y):
    for i in range(self.n_trees):
      # special tree that allows random columns during split
      curr_tree = \
        MyDecisionTreeRegressor(max_depth=self.max_depth,
          min_samples=self.min_samples,
          n_split_cols=self.n_split_cols)

      # create random rows-subset of training data
      # for each tree
      rnd_rows = self.rnd.choice(self.n_rows, 
        size=(self.n_rows), replace=True)
      subset_X = X[rnd_rows,:]
      subset_y = y[rnd_rows]

      # train tree on subset and add to list
      # random cols will be used during splits
      curr_tree.fit(subset_X, subset_y)
      self.trees.append(curr_tree)
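
Typical usage of the class, adapted from the main() function of the demo program below, looks like:

model = MyRandomForestRegressor(n_trees=10, n_rows=200,
  n_split_cols=4, max_depth=6, min_samples=2, seed=0)
model.fit(train_X, train_y)
y_preds = model.predict(test_X)  # mean of the 10 tree predictions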

For my demo, I used a set of synthetic data that I generated using a neural network with random weights and biases. The data looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065, 0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562, 0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192, 0.8054
. . .

The first five values on each line are the predictors. The sixth value is the target to predict. All predictor values are between -1.0 and 1.0. Normalizing the predictor values is not necessary but is helpful when using the data with other regression techniques that require normalization (such as k-nearest neighbors regression). There are 200 items in the training data and 40 items in the test data.
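
The exact generator program isn't shown here. A minimal sketch of the idea, where the 5-10-1 architecture, the tanh hidden activation, and the sigmoid output activation are my assumptions rather than the actual generator, is:

import numpy as np

# generate synthetic regression data using a neural network with
# random weights and biases (the architecture is an assumption)
rnd = np.random.RandomState(1)
W1 = rnd.uniform(-1.0, 1.0, (5, 10)); b1 = rnd.uniform(-1.0, 1.0, 10)
W2 = rnd.uniform(-1.0, 1.0, (10, 1)); b2 = rnd.uniform(-1.0, 1.0, 1)

X = rnd.uniform(-1.0, 1.0, (200, 5))        # predictors in [-1, 1]
h = np.tanh(X @ W1 + b1)                    # hidden layer
z = h @ W2 + b2
y = (1.0 / (1.0 + np.exp(-z))).reshape(-1)  # targets in (0, 1)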

The output of the from-scratch random forest regression demo program is:

Begin random forest regression scratch Python

Loading synthetic train (200), test (40) data
Done

First three X predictors:
[[-0.1660  0.4406 -0.9998 -0.3953 -0.7065]
 [ 0.0776 -0.1616  0.3704 -0.5911  0.7562]
 [-0.9452  0.3409 -0.1654  0.1174 -0.7192]]

First three y targets:
0.4840
0.1568
0.8054

Setting num_trees = 10
Setting n_rows = 200
Setting n_split_cols = 4
Setting max_depth = 6
Setting min_samples = 2

Creating and training random forest model
Done

Accuracy train (within 0.10): 0.7800
Accuracy test (within 0.10): 0.5750

MSE train: 0.0006
MSE test: 0.0020

End random forest regression scratch Python demo

====================

Using scikit RandomForestRegressor:

Creating and training scikit Random Forest model
Done

Accuracy train (within 0.10): 0.7850
Accuracy test (within 0.10): 0.6500

MSE train: 0.0005
MSE test: 0.0019

I validated my from-scratch demo by running the data through the scikit-learn library RandomForestRegressor class. The results were essentially the same (given the inherent randomness of column selection during tree splits).
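
For reference, the scikit parameters correspond directly to the from-scratch ones. This is the call used in the demo program below, with the parameter correspondence noted in comments:

from sklearn.ensemble import RandomForestRegressor

rfr = RandomForestRegressor(n_estimators=10,  # n_trees
  max_depth=6,                                # max_depth
  max_features=4,                             # n_split_cols
  random_state=0)
rfr.fit(train_X, train_y)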

Much fun.



Three nice images (to my eye anyway) from an Internet search for "alien forest".


Demo program. Replace "lt" (less than), "gt" (greater than), "lte", "gte" with the corresponding comparison operator symbols. (My blog editor chokes on symbols).

# random_forest_regression_scratch.py

# each tree is trained on a subset of the data, with some
# rows possibly duplicated, and some rows possibly not used.
# subset uses all columns, but during training, at each
# split, only n_split_cols randomly selected columns are used.

import numpy as np

# ===========================================================

class MyRandomForestRegressor:  # avoid scikit name collision
  def __init__(self, n_trees, n_rows, n_split_cols,
    max_depth=3, min_samples=2, seed=0):
    self.n_trees = n_trees
    self.n_rows = n_rows  # for main subset
    self.n_split_cols = n_split_cols  # each split calculation
    self.max_depth = max_depth
    self.min_samples = min_samples
    self.trees = []
    self.rnd = np.random.RandomState(seed)

  def fit(self, X, y):
    for i in range(self.n_trees):
      # special tree that allows random columns during split
      curr_tree = \
        MyDecisionTreeRegressor(max_depth=self.max_depth,
          min_samples=self.min_samples,
          n_split_cols=self.n_split_cols)

      # create random rows-subset of training data
      # for each tree
      rnd_rows = self.rnd.choice(self.n_rows, 
        size=(self.n_rows), replace=True)
      subset_X = X[rnd_rows,:]
      subset_y = y[rnd_rows]

      # train tree on subset and add to list
      # random cols will be used during splits
      curr_tree.fit(subset_X, subset_y)
      self.trees.append(curr_tree)

  def predict_one(self, x):
    sum = 0.0
    for i in range(self.n_trees):
      pred_y = self.trees[i].predict_one(x)
      sum += pred_y
    return sum / self.n_trees

  def predict(self, X):
    result = np.zeros(len(X), dtype=np.float64)
    for i in range(len(X)):
      result[i] = self.predict_one(X[i])
    return result    

# ===========================================================


# ===========================================================

class MyDecisionTreeRegressor:  # avoid scikit name collision
  # if max_depth = n, tree has at most 2^(n+1) - 1 nodes.

  def __init__(self, max_depth=3, min_samples=2,
    n_split_cols=-1, seed=0):
    self.max_depth = max_depth
    self.min_samples = min_samples # aka min_samples_split
    self.n_split_cols = n_split_cols  # mostly random forest
    self.root = None
    self.rnd = np.random.RandomState(seed) # split col order

  # ===============================================

  class Node:
    def __init__(self, id=0, col_idx=-1, thresh=0.0,
        left=None, right=None, value=0.0, is_leaf=False):
      self.id = id  # useful for debugging
      self.col_idx = col_idx
      self.thresh = thresh
      self.left = left
      self.right = right
      self.value = value
      self.is_leaf = is_leaf  # False for an internal node

  # ===============================================

  def best_split(self, X, y):
    best_col_idx = -1  # indicates a bad split
    best_thresh = 0.0
    best_mse = np.inf  # smaller is better
    n_rows, n_cols = X.shape

    rnd_cols = np.arange(n_cols)
    self.rnd.shuffle(rnd_cols)
    if self.n_split_cols != -1:  # just use some cols
      rnd_cols = rnd_cols[0:self.n_split_cols]

    for j in range(len(rnd_cols)):
      col_idx = rnd_cols[j]
      examined_threshs = set()
      for i in range(n_rows):
        thresh = X[i][col_idx]  # candidate threshold value

        if thresh in examined_threshs:  # already evaluated
          continue
        examined_threshs.add(thresh)

        # get rows where x is lte, gt thresh
        left_idxs = np.where(X[:,col_idx] "lte" thresh)[0]
        right_idxs = np.where(X[:,col_idx] "gt" thresh)[0]

        # check proposed split
        if len(left_idxs) == 0 or \
          len(right_idxs) == 0:
          continue

        # get left and right y values
        left_y_vals = y[left_idxs]  # not empty
        right_y_vals = y[right_idxs]  # not empty

        # compute proposed split MSE: the weighted average of
        # the child variances; a smaller value is a better split
        mse_left = self.vector_mse(left_y_vals)
        mse_right = self.vector_mse(right_y_vals)
        split_mse = (len(left_y_vals) * mse_left + \
          len(right_y_vals) * mse_right) / n_rows

        if split_mse "lt" best_mse:
          best_col_idx = col_idx
          best_thresh = thresh
          best_mse = split_mse          

    return best_col_idx, best_thresh  # -1 is bad/no split

  # ---------------------------------------------------------

  def vector_mse(self, y):  # variance but called MSE
    if len(y) == 0:
      return 0.0  # should never get here
    return np.var(y)

  # ---------------------------------------------------------

  def make_tree(self, X, y):
    root = self.Node()  # is_leaf is False
    stack = [(root, X, y, 0)]  # curr depth = 0

    while (len(stack) "gt" 0):
      curr_node, curr_X, curr_y, curr_depth = stack.pop()

      if curr_depth == self.max_depth or \
        len(curr_y) "lt" self.min_samples:
        curr_node.value = np.mean(curr_y)
        curr_node.is_leaf = True
        continue

      col_idx, thresh = self.best_split(curr_X, curr_y) 

      if col_idx == -1:  # cannot split
        curr_node.value = np.mean(curr_y)
        curr_node.is_leaf = True
        continue
      
      # got a good split so at an internal, non-leaf node
      curr_node.col_idx = col_idx
      curr_node.thresh = thresh

      # create and attach child nodes
      left_idxs = np.where(curr_X[:,col_idx] "lte" thresh)[0]
      right_idxs = np.where(curr_X[:,col_idx] "gt" thresh)[0]

      left_X = curr_X[left_idxs,:]
      left_y = curr_y[left_idxs]
      right_X = curr_X[right_idxs,:]
      right_y = curr_y[right_idxs]

      curr_node.left = self.Node(id=2*curr_node.id+1)
      stack.append((curr_node.left,
        left_X, left_y, curr_depth+1))

      curr_node.right = self.Node(id=2*curr_node.id+2)
      stack.append((curr_node.right,
        right_X, right_y, curr_depth+1))
      
    return root

  # ---------------------------------------------------------      

  def fit(self, X, y):
    self.root = self.make_tree(X, y)

  # ---------------------------------------------------------

  def predict_one(self, x):
    curr = self.root
    while curr.is_leaf == False:
      if x[curr.col_idx] "lte" curr.thresh:
        curr = curr.left
      else:
        curr = curr.right
    return curr.value

  def predict(self, X):  # scikit always uses a matrix input
    result = np.zeros(len(X), dtype=np.float64)
    for i in range(len(X)):
      result[i] = self.predict_one(X[i])
    return result

  # ---------------------------------------------------------

# ===========================================================
# ===========================================================

# -----------------------------------------------------------

def accuracy(model, data_X, data_y, pct_close):
  # assumes model has a predict(X)
  n = len(data_X)
  n_correct = 0; n_wrong = 0
  for i in range(n):
    x = data_X[i].reshape(1,-1)  # make it a matrix
    y = data_y[i]
    y_pred = model.predict(x)  # predict() expects 2D

    if np.abs(y - y_pred) "lt" np.abs(y * pct_close):
      n_correct += 1
    else: 
      n_wrong += 1
  # print("Correct = " + str(n_correct))
  # print("Wrong   = " + str(n_wrong))
  return n_correct / (n_correct + n_wrong)

# -----------------------------------------------------------

def MSE(model, data_X, data_y):
  n = len(data_X)
  sum = 0.0
  for i in range(n):
    x = data_X[i].reshape(1,-1)
    y = data_y[i]
    y_pred = model.predict(x)
    sum += (y - y_pred) * (y - y_pred)

  return sum / n

# -----------------------------------------------------------

def main():
  print("\nBegin random forest regression scratch Python ")

  np.set_printoptions(precision=4, suppress=True,
    floatmode='fixed')
  np.random.seed(0)  # not used this version

  # 1. load data
  print("\nLoading synthetic train (200), test (40) data ")
  train_file = ".\\Data\\synthetic_train_200.txt"
  # -0.1660,0.4406,-0.9998,-0.3953,-0.7065,0.4840
  #  0.0776,-0.1616,0.3704,-0.5911,0.7562,0.1568
  # -0.9452,0.3409,-0.1654,0.1174,-0.7192,0.8054
  # . . .

  train_X = np.loadtxt(train_file, comments="#",
    usecols=[0,1,2,3,4],
    delimiter=",",  dtype=np.float64)
  train_y = np.loadtxt(train_file, comments="#", usecols=5,
    delimiter=",",  dtype=np.float64)

  test_file = ".\\Data\\synthetic_test_40.txt"
  test_X = np.loadtxt(test_file, comments="#",
    usecols=[0,1,2,3,4],
    delimiter=",",  dtype=np.float64)
  test_y = np.loadtxt(test_file, comments="#", usecols=5,
    delimiter=",",  dtype=np.float64)
  print("Done ")

  print("\nFirst three X predictors: ")
  print(train_X[0:3,:])
  print("\nFirst three y targets: ")
  for i in range(3):
    print("%0.4f" % train_y[i])

  # 2. create and train model
  nt = 10   # number trees
  nr = 200  # number rows
  nc = 4    # number cols to use during splits
  md = 6    # max_depth
  ms = 2    # min_samples to consider a split

  print("\nSetting num_trees = " + str(nt))
  print("Setting n_rows = " + str(nr))
  print("Setting n_split_cols = " + str(nc))
  print("Setting max_depth = " + str(md))
  print("Setting min_samples = " + str(ms))

  print("\nCreating and training random forest model ")
  model = MyRandomForestRegressor(n_trees=nt, n_rows=nr,
    n_split_cols=nc, max_depth=md, min_samples=ms, seed=0)
  model.fit(train_X, train_y)
  print("Done ")

  # 3. evaluate model
  acc_train = accuracy(model, train_X, train_y, 0.10)
  print("\nAccuracy train (within 0.10): %0.4f " % acc_train)
  acc_test = accuracy(model, test_X, test_y, 0.10)
  print("Accuracy test (within 0.10): %0.4f " % acc_test)

  mse_train = MSE(model, train_X, train_y)
  print("\nMSE train: %0.4f " % mse_train)
  mse_test = MSE(model, test_X, test_y)
  print("MSE test: %0.4f " % mse_test)

  # 4. use model
  x = train_X[0].reshape(1,-1)
  print("\nPredicting for: ")
  print(x)
  y_pred = model.predict(x)
  print("Predicted y = %0.4f " % y_pred)

  print("\nEnd random forest regression scratch Python demo ")

  print("\n==================== ")

  print("\nUsing scikit RandomForestRegressor: ")
  from sklearn.ensemble import RandomForestRegressor

  print("\nCreating and training scikit Random Forest model ")
  rfr = RandomForestRegressor(n_estimators=10,
    max_depth=6, max_features=4, random_state=0)
  rfr.fit(train_X, train_y)
  print("Done ")

  acc_train = accuracy(rfr, train_X, train_y, 0.10)
  print("\nAccuracy train (within 0.10): %0.4f " % acc_train)
  acc_test = accuracy(rfr, test_X, test_y, 0.10)
  print("Accuracy test (within 0.10): %0.4f " % acc_test)

  mse_train = MSE(rfr, train_X, train_y)
  print("\nMSE train: %0.4f " % mse_train)
  mse_test = MSE(rfr, test_X, test_y)
  print("MSE test: %0.4f " % mse_test)

  x = train_X[0].reshape(1,-1)
  print("\nPredicting for: ")
  print(x)
  y_pred = rfr.predict(x)
  print("Predicted y = %0.4f " % y_pred)


if __name__ == "__main__":
  main()

Training data:

# synthetic_train_200.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399
-0.4858, -0.7782, -0.6141, -0.0008,  0.4572,  0.4197
 0.7033, -0.1683,  0.2334, -0.5327, -0.7961,  0.1776
 0.0317, -0.0457, -0.6947,  0.2436,  0.0880,  0.3345
 0.5031, -0.5559,  0.0387,  0.5706, -0.9553,  0.3107
-0.3513,  0.7458,  0.6894,  0.0769,  0.7332,  0.3170
 0.2205,  0.5992, -0.9309,  0.5405,  0.4635,  0.3532
-0.4806, -0.4859,  0.2646, -0.3094,  0.5932,  0.3202
 0.9809, -0.3995, -0.7140,  0.8026,  0.0831,  0.1600
 0.9495,  0.2732,  0.9878,  0.0921,  0.0529,  0.1289
-0.9476, -0.6792,  0.4913, -0.9392, -0.2669,  0.5966
 0.7247,  0.3854,  0.3819, -0.6227, -0.1162,  0.1550
-0.5922, -0.5045, -0.4757,  0.5003, -0.0860,  0.5863
-0.8861,  0.0170, -0.5761,  0.5972, -0.4053,  0.7301
 0.6877, -0.2380,  0.4997,  0.0223,  0.0819,  0.1404
 0.9189,  0.6079, -0.9354,  0.4188, -0.0700,  0.1907
-0.1428, -0.7820,  0.2676,  0.6059,  0.3936,  0.2790
 0.5324, -0.3151,  0.6917, -0.1425,  0.6480,  0.1071
-0.8432, -0.9633, -0.8666, -0.0828, -0.7733,  0.7784
-0.9444,  0.5097, -0.2103,  0.4939, -0.0952,  0.6787
-0.0520,  0.6063, -0.1952,  0.8094, -0.9259,  0.4836
 0.5477, -0.7487,  0.2370, -0.9793,  0.0773,  0.1241
 0.2450,  0.8116,  0.9799,  0.4222,  0.4636,  0.2355
 0.8186, -0.1983, -0.5003, -0.6531, -0.7611,  0.1511
-0.4714,  0.6382, -0.3788,  0.9648, -0.4667,  0.5950
 0.0673, -0.3711,  0.8215, -0.2669, -0.1328,  0.2677
-0.9381,  0.4338,  0.7820, -0.9454,  0.0441,  0.5518
-0.3480,  0.7190,  0.1170,  0.3805, -0.0943,  0.4724
-0.9813,  0.1535, -0.3771,  0.0345,  0.8328,  0.5438
-0.1471, -0.5052, -0.2574,  0.8637,  0.8737,  0.3042
-0.5454, -0.3712, -0.6505,  0.2142, -0.1728,  0.5783
 0.6327, -0.6297,  0.4038, -0.5193,  0.1484,  0.1153
-0.5424,  0.3282, -0.0055,  0.0380, -0.6506,  0.6613
 0.1414,  0.9935,  0.6337,  0.1887,  0.9520,  0.2540
-0.9351, -0.8128, -0.8693, -0.0965, -0.2491,  0.7353
 0.9507, -0.6640,  0.9456,  0.5349,  0.6485,  0.1059
-0.0462, -0.9737, -0.2940, -0.0159,  0.4602,  0.2606
-0.0627, -0.0852, -0.7247, -0.9782,  0.5166,  0.2977
 0.0478,  0.5098, -0.0723, -0.7504, -0.3750,  0.3335
 0.0090,  0.3477,  0.5403, -0.7393, -0.9542,  0.4415
-0.9748,  0.3449,  0.3736, -0.1015,  0.8296,  0.4358
 0.2887, -0.9895, -0.0311,  0.7186,  0.6608,  0.2057
 0.1570, -0.4518,  0.1211,  0.3435, -0.2951,  0.3244
 0.7117, -0.6099,  0.4946, -0.4208,  0.5476,  0.1096
-0.2929, -0.5726,  0.5346, -0.3827,  0.4665,  0.2465
 0.4889, -0.5572, -0.5718, -0.6021, -0.7150,  0.2163
-0.7782,  0.3491,  0.5996, -0.8389, -0.5366,  0.6516
-0.5847,  0.8347,  0.4226,  0.1078, -0.3910,  0.6134
 0.8469,  0.4121, -0.0439, -0.7476,  0.9521,  0.1571
-0.6803, -0.5948, -0.1376, -0.1916, -0.7065,  0.7156
 0.2878,  0.5086, -0.5785,  0.2019,  0.4979,  0.2980
 0.2764,  0.1943, -0.4090,  0.4632,  0.8906,  0.2960
-0.8877,  0.6705, -0.6155, -0.2098, -0.3998,  0.7107
-0.8398,  0.8093, -0.2597,  0.0614, -0.0118,  0.6502
-0.8476,  0.0158, -0.4769, -0.2859, -0.7839,  0.7715
 0.5751, -0.7868,  0.9714, -0.6457,  0.1448,  0.1175
 0.4802, -0.7001,  0.1022, -0.5668,  0.5184,  0.1090
 0.4458, -0.6469,  0.7239, -0.9604,  0.7205,  0.0779
 0.5175,  0.4339,  0.9747, -0.4438, -0.9924,  0.2879
 0.8678,  0.7158,  0.4577,  0.0334,  0.4139,  0.1678
 0.5406,  0.5012,  0.2264, -0.1963,  0.3946,  0.2088
-0.9938,  0.5498,  0.7928, -0.5214, -0.7585,  0.7687
 0.7661,  0.0863, -0.4266, -0.7233, -0.4197,  0.1466
 0.2277, -0.3517, -0.0853, -0.1118,  0.6563,  0.1767
 0.3499, -0.5570, -0.0655, -0.3705,  0.2537,  0.1632
 0.7547, -0.1046,  0.5689, -0.0861,  0.3125,  0.1257
 0.8186,  0.2110,  0.5335,  0.0094, -0.0039,  0.1391
 0.6858, -0.8644,  0.1465,  0.8855,  0.0357,  0.1845
-0.4967,  0.4015,  0.0805,  0.8977,  0.2487,  0.4663
 0.6760, -0.9841,  0.9787, -0.8446, -0.3557,  0.1509
-0.1203, -0.4885,  0.6054, -0.0443, -0.7313,  0.4854
 0.8557,  0.7919, -0.0169,  0.7134, -0.1628,  0.2002
 0.0115, -0.6209,  0.9300, -0.4116, -0.7931,  0.4052
-0.7114, -0.9718,  0.4319,  0.1290,  0.5892,  0.3661
 0.3915,  0.5557, -0.1870,  0.2955, -0.6404,  0.2954
-0.3564, -0.6548, -0.1827, -0.5172, -0.1862,  0.4622
 0.2392, -0.4959,  0.5857, -0.1341, -0.2850,  0.2470
-0.3394,  0.3947, -0.4627,  0.6166, -0.4094,  0.5325
 0.7107,  0.7768, -0.6312,  0.1707,  0.7964,  0.2757
-0.1078,  0.8437, -0.4420,  0.2177,  0.3649,  0.4028
-0.3139,  0.5595, -0.6505, -0.3161, -0.7108,  0.5546
 0.4335,  0.3986,  0.3770, -0.4932,  0.3847,  0.1810
-0.2562, -0.2894, -0.8847,  0.2633,  0.4146,  0.4036
 0.2272,  0.2966, -0.6601, -0.7011,  0.0284,  0.2778
-0.0743, -0.1421, -0.0054, -0.6770, -0.3151,  0.3597
-0.4762,  0.6891,  0.6007, -0.1467,  0.2140,  0.4266
-0.4061,  0.7193,  0.3432,  0.2669, -0.7505,  0.6147
-0.0588,  0.9731,  0.8966,  0.2902, -0.6966,  0.4955
-0.0627, -0.1439,  0.1985,  0.6999,  0.5022,  0.3077
 0.1587,  0.8494, -0.8705,  0.9827, -0.8940,  0.4263
-0.7850,  0.2473, -0.9040, -0.4308, -0.8779,  0.7199
 0.4070,  0.3369, -0.2428, -0.6236,  0.4940,  0.2215
-0.0242,  0.0513, -0.9430,  0.2885, -0.2987,  0.3947
-0.5416, -0.1322, -0.2351, -0.0604,  0.9590,  0.3683
 0.1055,  0.7783, -0.2901, -0.5090,  0.8220,  0.2984
-0.9129,  0.9015,  0.1128, -0.2473,  0.9901,  0.4776
-0.9378,  0.1424, -0.6391,  0.2619,  0.9618,  0.5368
 0.7498, -0.0963,  0.4169,  0.5549, -0.0103,  0.1614
-0.2612, -0.7156,  0.4538, -0.0460, -0.1022,  0.3717
 0.7720,  0.0552, -0.1818, -0.4622, -0.8560,  0.1685
-0.4177,  0.0070,  0.9319, -0.7812,  0.3461,  0.3052
-0.0001,  0.5542, -0.7128, -0.8336, -0.2016,  0.3803
 0.5356, -0.4194, -0.5662, -0.9666, -0.2027,  0.1776
-0.2378,  0.3187, -0.8582, -0.6948, -0.9668,  0.5474
-0.1947, -0.3579,  0.1158,  0.9869,  0.6690,  0.2992
 0.3992,  0.8365, -0.9205, -0.8593, -0.0520,  0.3154
-0.0209,  0.0793,  0.7905, -0.1067,  0.7541,  0.1864
-0.4928, -0.4524, -0.3433,  0.0951, -0.5597,  0.6261
-0.8118,  0.7404, -0.5263, -0.2280,  0.1431,  0.6349
 0.0516, -0.8480,  0.7483,  0.9023,  0.6250,  0.1959
-0.3212,  0.1093,  0.9488, -0.3766,  0.3376,  0.2735
-0.3481,  0.5490, -0.3484,  0.7797,  0.5034,  0.4379
-0.5785, -0.9170, -0.3563, -0.9258,  0.3877,  0.4121
 0.3407, -0.1391,  0.5356,  0.0720, -0.9203,  0.3458
-0.3287, -0.8954,  0.2102,  0.0241,  0.2349,  0.3247
-0.1353,  0.6954, -0.0919, -0.9692,  0.7461,  0.3338
 0.9036, -0.8982, -0.5299, -0.8733, -0.1567,  0.1187
 0.7277, -0.8368, -0.0538, -0.7489,  0.5458,  0.0830
 0.9049,  0.8878,  0.2279,  0.9470, -0.3103,  0.2194
 0.7957, -0.1308, -0.5284,  0.8817,  0.3684,  0.2172
 0.4647, -0.4931,  0.2010,  0.6292, -0.8918,  0.3371
-0.7390,  0.6849,  0.2367,  0.0626, -0.5034,  0.7039
-0.1567, -0.8711,  0.7940, -0.5932,  0.6525,  0.1710
 0.7635, -0.0265,  0.1969,  0.0545,  0.2496,  0.1445
 0.7675,  0.1354, -0.7698, -0.5460,  0.1920,  0.1728
-0.5211, -0.7372, -0.6763,  0.6897,  0.2044,  0.5217
 0.1913,  0.1980,  0.2314, -0.8816,  0.5006,  0.1998
 0.8964,  0.0694, -0.6149,  0.5059, -0.9854,  0.1825
 0.1767,  0.7104,  0.2093,  0.6452,  0.7590,  0.2832
-0.3580, -0.7541,  0.4426, -0.1193, -0.7465,  0.5657
-0.5996,  0.5766, -0.9758, -0.3933, -0.9572,  0.6800
 0.9950,  0.1641, -0.4132,  0.8579,  0.0142,  0.2003
-0.4717, -0.3894, -0.2567, -0.5111,  0.1691,  0.4266
 0.3917, -0.8561,  0.9422,  0.5061,  0.6123,  0.1212
-0.0366, -0.1087,  0.3449, -0.1025,  0.4086,  0.2475
 0.3633,  0.3943,  0.2372, -0.6980,  0.5216,  0.1925
-0.5325, -0.6466, -0.2178, -0.3589,  0.6310,  0.3568
 0.2271,  0.5200, -0.1447, -0.8011, -0.7699,  0.3128
 0.6415,  0.1993,  0.3777, -0.0178, -0.8237,  0.2181
-0.5298, -0.0768, -0.6028, -0.9490,  0.4588,  0.4356
 0.6870, -0.1431,  0.7294,  0.3141,  0.1621,  0.1632
-0.5985,  0.0591,  0.7889, -0.3900,  0.7419,  0.2945
 0.3661,  0.7984, -0.8486,  0.7572, -0.6183,  0.3449
 0.6995,  0.3342, -0.3113, -0.6972,  0.2707,  0.1712
 0.2565,  0.9126,  0.1798, -0.6043, -0.1413,  0.2893
-0.3265,  0.9839, -0.2395,  0.9854,  0.0376,  0.4770
 0.2690, -0.1722,  0.9818,  0.8599, -0.7015,  0.3954
-0.2102, -0.0768,  0.1219,  0.5607, -0.0256,  0.3949
 0.8216, -0.9555,  0.6422, -0.6231,  0.3715,  0.0801
-0.2896,  0.9484, -0.7545, -0.6249,  0.7789,  0.4370
-0.9985, -0.5448, -0.7092, -0.5931,  0.7926,  0.5402

Test data:

# synthetic_test_40.txt
#
 0.7462,  0.4006, -0.0590,  0.6543, -0.0083,  0.1935
 0.8495, -0.2260, -0.0142, -0.4911,  0.7699,  0.1078
-0.2335, -0.4049,  0.4352, -0.6183, -0.7636,  0.5088
 0.1810, -0.5142,  0.2465,  0.2767, -0.3449,  0.3136
-0.8650,  0.7611, -0.0801,  0.5277, -0.4922,  0.7140
-0.2358, -0.7466, -0.5115, -0.8413, -0.3943,  0.4533
 0.4834,  0.2300,  0.3448, -0.9832,  0.3568,  0.1360
-0.6502, -0.6300,  0.6885,  0.9652,  0.8275,  0.3046
-0.3053,  0.5604,  0.0929,  0.6329, -0.0325,  0.4756
-0.7995,  0.0740, -0.2680,  0.2086,  0.9176,  0.4565
-0.2144, -0.2141,  0.5813,  0.2902, -0.2122,  0.4119
-0.7278, -0.0987, -0.3312, -0.5641,  0.8515,  0.4438
 0.3793,  0.1976,  0.4933,  0.0839,  0.4011,  0.1905
-0.8568,  0.9573, -0.5272,  0.3212, -0.8207,  0.7415
-0.5785,  0.0056, -0.7901, -0.2223,  0.0760,  0.5551
 0.0735, -0.2188,  0.3925,  0.3570,  0.3746,  0.2191
 0.1230, -0.2838,  0.2262,  0.8715,  0.1938,  0.2878
 0.4792, -0.9248,  0.5295,  0.0366, -0.9894,  0.3149
-0.4456,  0.0697,  0.5359, -0.8938,  0.0981,  0.3879
 0.8629, -0.8505, -0.4464,  0.8385,  0.5300,  0.1769
 0.1995,  0.6659,  0.7921,  0.9454,  0.9970,  0.2330
-0.0249, -0.3066, -0.2927, -0.4923,  0.8220,  0.2437
 0.4513, -0.9481, -0.0770, -0.4374, -0.9421,  0.2879
-0.3405,  0.5931, -0.3507, -0.3842,  0.8562,  0.3987
 0.9538,  0.0471,  0.9039,  0.7760,  0.0361,  0.1706
-0.0887,  0.2104,  0.9808,  0.5478, -0.3314,  0.4128
-0.8220, -0.6302,  0.0537, -0.1658,  0.6013,  0.4306
-0.4123, -0.2880,  0.9074, -0.0461, -0.4435,  0.5144
 0.0060,  0.2867, -0.7775,  0.5161,  0.7039,  0.3599
-0.7968, -0.5484,  0.9426, -0.4308,  0.8148,  0.2979
 0.7811,  0.8450, -0.6877,  0.7594,  0.2640,  0.2362
-0.6802, -0.1113, -0.8325, -0.6694, -0.6056,  0.6544
 0.3821,  0.1476,  0.7466, -0.5107,  0.2592,  0.1648
 0.7265,  0.9683, -0.9803, -0.4943, -0.5523,  0.2454
-0.9049, -0.9797, -0.0196, -0.9090, -0.4433,  0.6447
-0.4607,  0.1811, -0.2389,  0.4050, -0.0078,  0.5229
 0.2664, -0.2932, -0.4259, -0.7336,  0.8742,  0.1834
-0.4507,  0.1029, -0.6294, -0.1158, -0.6294,  0.6081
 0.8948, -0.0124,  0.9278,  0.2899, -0.0314,  0.1534
-0.1323, -0.8813, -0.0146, -0.0697,  0.6135,  0.2386