Decision Tree Regression (Without Recursion) From Scratch Using Python

A few days ago, I refactored an old implementation of decision tree regression from scratch using Python. But I wasn’t completely happy with it because it used recursion when building the tree. This is the standard approach, but I don’t like to use recursion because it’s error-prone, difficult to debug, and difficult to modify. So I made new version of a from-scratch Python decision tree regression demo that doesn’t use recursion — instead, the tree building code uses a stack.

For my updated implementation, I used a nested Node class which is defined inside of an outer MyDecisionTreeRegressor class. I prepended the “My” in front of the name so there wouldn’t be a name clash with the scikit-learn DecisionTreeRegressor module:

class MyDecisionTreeRegressor:  # avoid scikit name collision
  # if max_depth = 0, tree has just a root node with avg y
  # if max_depth = 1, tree has at most 3 nodes (root, l, r)
  # if max_depth = n, tree has at most 2^(n+1) - 1 nodes.

  def __init__(self, max_depth=3, min_samples=2,
    n_split_cols=-1, seed=0):
    self.max_depth = max_depth
    self.min_samples = min_samples # aka min_samples_split
    self.n_split_cols = n_split_cols  # mostly random forest
    self.root = None
    self.rnd = np.random.RandomState(seed) # split col order

  # ===============================================

  class Node:
    def __init__(self, id=0, col_idx=-1,
        thresh=0.0, left=None, right=None,
        value=0.0, is_leaf=False):
      self.id = id  # useful for debugging
      self.col_idx = col_idx
      self.thresh = thresh
      self.left = left
      self.right = right
      self.value = value
      self.is_leaf = is_leaf  # False for in-node

    def show(self): . . .

  # ===============================================

  def best_split(self, X, y): . . . 
  def vector_mse(self, y): . . .
  def make_tree(self, X, y, depth=0): . . .
  def fit(self, X, y): . . .
  def predict_one(self, x): . . .
  def predict(self, X): . . .
  def explain(self, x): . . .
  def display(self): . . .

In addition to getting rid of recursion, I changed the split algorithm slightly to minimize weighted mean squared error defined on a single vector (the same thing as variance), instead of maximizing variance reduction, in order to sync with the scikit DecisionTreeRegressor module implementation. Weirdly, in machine learning, MSE compares predicted values with actual target values, but in tree terminology, MSE is equivalent to statistical variance reduction. This is actually a very deep topic, but one for another day.

For my demo, I used a set of synthetic data that I generated using a neural network with random weights and biases. The data looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065, 0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562, 0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192, 0.8054
. . .

The first five values on each line are the predictors. The sixth value is the target to predict. All predictor values are between -1.0 and 1.0. Normalizing the predictor values is not necessary but is helpful when using the data with other regression techniques that require normalization (such as k-nearest neighbors regression). There are 200 items in the training data and 40 items in the test data.

The output of the from-scratch no-recursion decision tree regression demo program is:

Begin decision tree regression scratch Python

Loading synthetic train (200), test (40) data
Done

First three X predictors:
[[-0.1660  0.4406 -0.9998 -0.3953 -0.7065]
 [ 0.0776 -0.1616  0.3704 -0.5911  0.7562]
 [-0.9452  0.3409 -0.1654  0.1174 -0.7192]]

First three y targets:
0.4840
0.1568
0.8054

Creating tree model max_depth=6 min_samples=2

Accuracy train (within 0.10): 0.8150
Accuracy test (within 0.10): 0.4500

MSE train: 0.0004
MSE test: 0.0054

Predicting train_X[0] using scratch tree:
[0.4909]

IF
column 0  gt   -0.2102 AND
column 0  lte   0.3915 AND
column 4  lte  -0.2987 AND
column 0  lte   0.0090 AND
column 4  lte  -0.6966 AND
column 3  gt   -0.7393 AND
THEN
predicted = 0.4909

End demo

=================================

Using scikit:

Accuracy train (within 0.10): 0.8100
Accuracy test (within 0.10): 0.4000

MSE train: 0.0004
MSE test: 0.0057

Predicting train_X[0] using scikit:
[0.4909]

The output of my from-scratch Python version and the scikit library version are very close but not exactly the same because both my decision tree regressor version and the scikit version shuffle the order in which columns are searched when finding the split values. This prevents the first few columns from always being used when there are multiple splits that have the same mean squared error. But this introduces minor randomness into the tree models.

Sidebar: Variance Reduction vs. Mean Squared Error

The variance of a set of numbers is the average of the sum of the squared deviations from the mean of the numbers. For example, if v = (2.0, 7.0, 9.0) the mean is 6.0 and the variance is ((2.0 – 6)^2 + (7.0 – 6)^2 + (9.0 – 6)^2) / 3 = (16.0 + 1.0 + 9.0) / 3 = 26.0 / 3 = 8.67.

The mean squared error (MSE) of a set of predicted and target values is the average of the squared deviations between predicted and targets. For example, if some system predicts (4.0, 8.0, 5.0) and the associated three true targets are (6, 9, 7) then the mean squared error is (4.0 – 6)^2 + (8.0 – 9)^2 + (5.0 – 7)^2 / 3 = (4.0 + 1.0 + 4.0) / 3 = 9.0 / 3 = 3.0.

For reasons which aren’t at all clear to me, decision tree terminology, notably scikit documentation, uses the terms MSE (which applies to a set of predicteds and true targets) and variance reduction(which applies to a single set of numbers) interchangeably. The idea here is that variance reduction is the same as MSE if you assume there is a set of hidden true target values, that are all the same, and are equal to the mean of the set of numbers.

I implemented an accuracy() function that scores a predicted y value as correct if it’s within 0.10 of the true value. The accuracy of the decision tree model on the training data is good (81.50% — 163 out of 200 correct) but poor on the test data (45.00% — just 18 out of 40 correct). This is typical for decision trees because they usually overfit on the training data. And this is the reason why decision trees are almost never used by themselves. It’s far more common to combine many decision trees into an ensemble technique such as Bagging Tree, Random Forest, AdaBoost, or Gradient Boost.

For my updated demo, I added an explain() method that shows how an output is computed. Sometimes an explanation helps interpretability, but sometimes it doesn’t really help.

I also added a display() method to show a tree. For example, I set max_depth = 3, and min_samples = 2 and:

Creating tree model max_depth=3 min_samples=2

Tree:
id =    0   0  -0.2102    not_None    not_None   0.0000   False
id =    1   4   0.1431    not_None    not_None   0.0000   False
id =    2   0   0.3915    not_None    not_None   0.0000   False
id =    3   0  -0.4061    not_None    not_None   0.0000   False
id =    4   2   0.1128    not_None    not_None   0.0000   False
id =    5   4  -0.2987    not_None    not_None   0.0000   False
id =    6   1  -0.6099    not_None    not_None   0.0000   False
id =    7  -1   0.0000    None        None       0.6777   True
id =    8  -1   0.0000    None        None       0.4865   True
id =    9  -1   0.0000    None        None       0.4586   True
id =   10  -1   0.0000    None        None       0.3463   True
id =   11  -1   0.0000    None        None       0.4101   True
id =   12  -1   0.0000    None        None       0.2613   True
id =   13  -1   0.0000    None        None       0.1171   True
id =   14  -1   0.0000    None        None       0.1859   True

Because max_depth was set to n = 3, and the tree is full, there are 2^(n+1)-1 = 2^4 – 1 = 16 – 1 = 15 nodes. The fields on each line are ID, split column, split value, left child None or not, right child None or not, node predicted y value, is-leaf-node or not.

Satisfying to get rid of recursion. I feel better now.

I love all kinds of mathematical models. I also love railroad train models, and watch videos of them on the Internet. Some of the models are astonishingly detailed and beautiful.

Top: A screenshot of a video of the Central Florida Railroad Modelers club in Orlando, Florida.

Bottom: A screenshot of a video of the Colorado Model Railroad Museum in Greeley, Colorado (about 50 miles north of Denver).

Demo program. Replace “lt” (less than), “gt”, “lte”, “gte” with Boolean operator symbols. (My blog editor chokes on symbols).

# decision_tree_regression_scratch.py

# uses pointers (for nodes) but no recursion (during build)
# uses MSE for splitting (not explicit variance reduction)

import numpy as np

# ===========================================================

class MyDecisionTreeRegressor:  # avoid scikit collision

  # if max_depth = 0, tree has just a root node with avg y
  # if max_depth = 1, tree has at most 3 nodes (root, l, r)
  # if max_depth = n, tree has at most 2^(n+1) - 1 nodes.

  def __init__(self, max_depth=3, min_samples=2,
    n_split_cols=-1, seed=0):
    self.max_depth = max_depth
    self.min_samples = min_samples # aka min_samples_split
    self.n_split_cols = n_split_cols  # for random forest
    self.root = None
    self.rnd = np.random.RandomState(seed)

  # ===============================================

  class Node:
    def __init__(self, id=0, col_idx=-1, thresh=0.0,
        left=None, right=None,
        value=0.0, is_leaf=False):
      self.id = id  # useful for debugging
      self.col_idx = col_idx
      self.thresh = thresh
      self.left = left
      self.right = right
      self.value = value
      self.is_leaf = is_leaf  # False for in-node

    def show(self):
      s1 = "id = %4d " % self.id
      s2 = "%3d " % self.col_idx
      s3 = "%8.4f " % self.thresh
      s4 = "   None     "
      if self.left is not None: s4 = "   not_None "
      s5 = "   None     "
      if self.right is not None: s5 = "   not_None "
      s6 = "%8.4f " % self.value
      s7 = "  " + str(self.is_leaf)
      print(s1+s2+s3+s4+s5+s6+s7)

  # ===============================================

  def best_split(self, X, y):
    # best split using semi MSE
    best_col_idx = -1  # indicates a bad split
    best_thresh = 0.0
    best_mse = np.inf  # smaller is better
    n_rows, n_cols = X.shape
    # n_rows should never be 0

    rnd_cols = np.arange(n_cols)
    self.rnd.shuffle(rnd_cols)
    if self.n_split_cols != -1:  # just use some cols
      rnd_cols = rnd_cols[0:self.n_split_cols]

    for j in range(len(rnd_cols)):
      col_idx = rnd_cols[j]
      examined_threshs = set()
      for i in range(n_rows):
        thresh = X[i][col_idx]  # candidate threshold value

        # if thresh value has already been checked, skip it
        if thresh in examined_threshs == True:
          continue

        examined_threshs.add(thresh)

        # get rows where x is lte, gt thresh

        # a. non-Python way:
        # left_idxs = []
        # right_idxs = []
        # for r in range(len(X)):
        #   if X[r][col_idx] "lte" thresh:
        #    left_idxs.append(r)
        #   else:
        #     right_idxs.append(r)

        # b. Python way WRONG (cannot check len):
        # left_idxs = X[:,col_idx] "lte" thresh
        # right_idxs = ~left_idxs

        # b. Python way
        left_idxs = np.where(X[:,col_idx] "lte" thresh)[0]
        right_idxs = np.where(X[:,col_idx] "gt" thresh)[0]

        # check if proposed split would lead to an empty
        # node, which would cause np.mean() to fail
        # when building the tree.
        # But allow a node with a single value.
        # Equivalent to scikit min_samples_leaf = 1.

        if len(left_idxs) == 0 or \
          len(right_idxs) == 0:
          continue  #  

        # get left and right y values

        # a. non-Python way:
        # left_y_vals = []
        # right_y_vals = []
        # for k in range(len(left_idxs)):
        #   left_y_vals.append(y[left_idxs[k]])
        # for k in range(len(right_idxs)):
        #   right_y_vals.append(y[right_idxs[k]])

        # b. Python way:
        left_y_vals = y[left_idxs]  # not empty
        right_y_vals = y[right_idxs]  # not empty

        # compute proposed split MSE

        # a. use a helper function for MSEs
        mse_left = self.vector_mse(left_y_vals)
        mse_right = self.vector_mse(right_y_vals)

        # b. compute MSEs directly
        # if len(left_y_vals) == 0:
        #   mse_left = 0.0
        # else:
        #   mse_left = \
        #   np.mean((left_y_vals - np.mean(left_y_vals))**2)
        # if len(right_y_vals) == 0:
        #   mse_right = 0.0
        # else:
        #   mse_right = \
        #   np.mean((right_y_vals - np.mean(right_y_vals))**2)

        # overall weighted MSE of proposed split
        split_mse = (len(left_y_vals) * mse_left + \
          len(right_y_vals) * mse_right) / n_rows

        if split_mse "lt" best_mse:
          best_col_idx = col_idx
          best_thresh = thresh
          best_mse = split_mse          

    return best_col_idx, best_thresh  # -1 is bad/no split

  # ---------------------------------------------------------

  def vector_mse(self, y):  # variance but called MSE
    if len(y) == 0: return 0.0  # should never get here
    return np.var(y)

  # ---------------------------------------------------------

  def make_tree(self, X, y):
    root = self.Node()  # is_leaf is False
    stack = [(root, X, y, 0)]  # curr depth = 0

    while (len(stack) "gt" 0):
      curr_node, curr_X, curr_y, curr_depth = stack.pop()

      if curr_depth == self.max_depth or \
        len(curr_y) "lt" self.min_samples:
        curr_node.value = np.mean(curr_y)
        curr_node.is_leaf = True
        continue

      col_idx, thresh = self.best_split(curr_X, curr_y) 

      if col_idx == -1:  # cannot split
        curr_node.value = np.mean(curr_y)
        curr_node.is_leaf = True
        continue
      
      # got a good split so at an internal, non-leaf node
      curr_node.col_idx = col_idx
      curr_node.thresh = thresh

      # create and attach child nodes
      # Python fancy syntax
      # left_idxs = curr_X[:, col_idx] "lte" thresh
      # right_idxs = ~left_idxs  # OK because no need len
      left_idxs = np.where(curr_X[:,col_idx] "lte" thresh)[0]
      right_idxs = np.where(curr_X[:,col_idx] "gt" thresh)[0]

      left_X = curr_X[left_idxs,:]
      left_y = curr_y[left_idxs]
      right_X = curr_X[right_idxs,:]
      right_y = curr_y[right_idxs]

      curr_node.left = self.Node(id=2*curr_node.id+1)
      stack.append((curr_node.left,
        left_X, left_y, curr_depth+1))

      curr_node.right = self.Node(id=2*curr_node.id+2)
      stack.append((curr_node.right,
        right_X, right_y, curr_depth+1))
      
    return root

  # ---------------------------------------------------------

  def fit(self, X, y):
    self.root = self.make_tree(X, y)

  # ---------------------------------------------------------

  def predict_one(self, x):
    curr = self.root
    while curr.is_leaf == False:
      if x[curr.col_idx] "lte" curr.thresh:
        curr = curr.left
      else:
        curr = curr.right
    return curr.value

  def predict(self, X):  # scikit always uses a matrix input
    result = np.zeros(len(X), dtype=np.float64)
    for i in range(len(X)):
      result[i] = self.predict_one(X[i])
    return result

  # ---------------------------------------------------------

  def explain(self, x):
    print("\nIF ");
    curr = self.root;
    while curr.is_leaf == False:
      print("column ", end="")
      print(str(curr.col_idx) + " ", end="")
      if x[curr.col_idx] "lte" curr.thresh:
        print(" "lte" ", end="");
        print("%8.4f AND " %  curr.thresh)
        curr = curr.left
      else:
        print(" "gt"  ", end="")
        print("%8.4f AND " %  curr.thresh)
        curr = curr.right;

    print("THEN \npredicted = %0.4f " % curr.value)

  # ---------------------------------------------------------

  def display(self):
    queue = []
    queue.append(self.root)
    while len(queue) "gt" 0:
      node = queue.pop(0)
      node.show()
      if node.left is not None:
        queue.append(node.left)
      if node.right is not None:
        queue.append(node.right)

# ===========================================================

# -----------------------------------------------------------

def accuracy(model, data_X, data_y, pct_close):
  n = len(data_X)
  n_correct = 0; n_wrong = 0
  for i in range(n):
    x = data_X[i].reshape(1,-1)
    y = data_y[i]
    y_pred = model.predict(x)

    if np.abs(y - y_pred) "lt" np.abs(y * pct_close):
      n_correct += 1
    else: 
      n_wrong += 1
  # print("Correct = " + str(n_correct))
  # print("Wrong   = " + str(n_wrong))
  return n_correct / (n_correct + n_wrong)

# -----------------------------------------------------------

def MSE(model, data_X, data_y):
  n = len(data_X)
  sum = 0.0
  for i in range(n):
    x = data_X[i].reshape(1,-1)
    y = data_y[i]
    y_pred = model.predict(x)
    sum += (y - y_pred) * (y - y_pred)

  return sum / n

# -----------------------------------------------------------

def main():
  print("\nBegin decision tree regression " + \
    "(no recursion) scratch Python ")

  np.set_printoptions(precision=4, suppress=True,
    floatmode='fixed')
  np.random.seed(0)  # not used this version

  # 1. load data
  print("\nLoading synthetic train (200), test (40) data ")
  train_file = ".\\Data\\synthetic_train_200.txt"
  # -0.1660,  0.4406, -0.9998, -0.3953, -0.7065, 0.4840
  #  0.0776, -0.1616,  0.3704, -0.5911,  0.7562, 0.1568
  # -0.9452,  0.3409, -0.1654,  0.1174  -0.7192, 0.8054
  # . . .

  train_X = np.loadtxt(train_file, comments="#",
    usecols=[0,1,2,3,4],
    delimiter=",",  dtype=np.float64)
  train_y = np.loadtxt(train_file, comments="#", usecols=5,
    delimiter=",",  dtype=np.float64)

  test_file = ".\\Data\\synthetic_test_40.txt"
  test_X = np.loadtxt(test_file, comments="#",
    usecols=[0,1,2,3,4],
    delimiter=",",  dtype=np.float64)
  test_y = np.loadtxt(test_file, comments="#", usecols=5,
    delimiter=",",  dtype=np.float64)
  print("Done ")

  print("\nFirst three X predictors: ")
  print(train_X[0:3,:])
  print("\nFirst three y targets: ")
  for i in range(3):
    print("%0.4f" % train_y[i])

  md = 6  # max_depth
  ms = 2  # min_samples to consider a split

  print("\nCreating tree model max_depth=" + str(md) + \
    " min_samples=" + str(ms))
  tree = MyDecisionTreeRegressor(max_depth=md, 
    min_samples=ms, seed=0)
  tree.fit(train_X, train_y)

  # print("\nTree: ")
  # tree.display()

  acc_train = accuracy(tree, train_X, train_y, 0.10)
  print("\nAccuracy train (within 0.10): %0.4f " % acc_train)
  acc_test = accuracy(tree, test_X, test_y, 0.10)
  print("Accuracy test (within 0.10): %0.4f " % acc_test)

  mse_train = MSE(tree, train_X, train_y)
  print("\nMSE train: %0.4f " % mse_train)
  mse_test = MSE(tree, test_X, test_y)
  print("MSE test: %0.4f " % mse_test)

  print("\nPredicting train_X[0] using scratch tree: ")
  x = train_X[0].reshape(1,-1)
  y_pred = tree.predict(x)  # predict() wants a matrix
  print(y_pred)

  tree.explain(train_X[0])  # explain() wants a vector

  print("\nEnd demo ")

  print("\n================================= ")

  print("\nUsing scikit: ")
  from sklearn.tree import DecisionTreeRegressor
  sci_tree = DecisionTreeRegressor(max_depth=md,
    min_samples_split=ms, random_state=0)
  sci_tree.fit(train_X, train_y)

  acc_train = accuracy(sci_tree, train_X, train_y, 0.10)
  print("\nAccuracy train (within 0.10): %0.4f " % acc_train)
  acc_test = accuracy(sci_tree, test_X, test_y, 0.10)
  print("Accuracy test (within 0.10): %0.4f " % acc_test)

  mse_train = MSE(sci_tree, train_X, train_y)
  print("\nMSE train: %0.4f " % mse_train)
  mse_test = MSE(sci_tree, test_X, test_y)
  print("MSE test: %0.4f " % mse_test)

  print("\nPredicting train_X[0] using scikit: ")
  y_pred = sci_tree.predict(x)
  print(y_pred)

if __name__ == "__main__":
  main()

Training data:

# synthetic_train_200.txt
#
-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
-0.8299, -0.9219, -0.6603,  0.7563, -0.8033,  0.7955
 0.0663,  0.3838, -0.3690,  0.3730,  0.6693,  0.3206
-0.9634,  0.5003,  0.9777,  0.4963, -0.4391,  0.7377
-0.1042,  0.8172, -0.4128, -0.4244, -0.7399,  0.4801
-0.9613,  0.3577, -0.5767, -0.4689, -0.0169,  0.6861
-0.7065,  0.1786,  0.3995, -0.7953, -0.1719,  0.5569
 0.3888, -0.1716, -0.9001,  0.0718,  0.3276,  0.2500
 0.1731,  0.8068, -0.7251, -0.7214,  0.6148,  0.3297
-0.2046, -0.6693,  0.8550, -0.3045,  0.5016,  0.2129
 0.2473,  0.5019, -0.3022, -0.4601,  0.7918,  0.2613
-0.1438,  0.9297,  0.3269,  0.2434, -0.7705,  0.5171
 0.1568, -0.1837, -0.5259,  0.8068,  0.1474,  0.3307
-0.9943,  0.2343, -0.3467,  0.0541,  0.7719,  0.5581
 0.2467, -0.9684,  0.8589,  0.3818,  0.9946,  0.1092
-0.6553, -0.7257,  0.8652,  0.3936, -0.8680,  0.7018
 0.8460,  0.4230, -0.7515, -0.9602, -0.9476,  0.1996
-0.9434, -0.5076,  0.7201,  0.0777,  0.1056,  0.5664
 0.9392,  0.1221, -0.9627,  0.6013, -0.5341,  0.1533
 0.6142, -0.2243,  0.7271,  0.4942,  0.1125,  0.1661
 0.4260,  0.1194, -0.9749, -0.8561,  0.9346,  0.2230
 0.1362, -0.5934, -0.4953,  0.4877, -0.6091,  0.3810
 0.6937, -0.5203, -0.0125,  0.2399,  0.6580,  0.1460
-0.6864, -0.9628, -0.8600, -0.0273,  0.2127,  0.5387
 0.9772,  0.1595, -0.2397,  0.1019,  0.4907,  0.1611
 0.3385, -0.4702, -0.8673, -0.2598,  0.2594,  0.2270
-0.8669, -0.4794,  0.6095, -0.6131,  0.2789,  0.4700
 0.0493,  0.8496, -0.4734, -0.8681,  0.4701,  0.3516
 0.8639, -0.9721, -0.5313,  0.2336,  0.8980,  0.1412
 0.9004,  0.1133,  0.8312,  0.2831, -0.2200,  0.1782
 0.0991,  0.8524,  0.8375, -0.2102,  0.9265,  0.2150
-0.6521, -0.7473, -0.7298,  0.0113, -0.9570,  0.7422
 0.6190, -0.3105,  0.8802,  0.1640,  0.7577,  0.1056
 0.6895,  0.8108, -0.0802,  0.0927,  0.5972,  0.2214
 0.1982, -0.9689,  0.1870, -0.1326,  0.6147,  0.1310
-0.3695,  0.7858,  0.1557, -0.6320,  0.5759,  0.3773
-0.1596,  0.3581,  0.8372, -0.9992,  0.9535,  0.2071
-0.2468,  0.9476,  0.2094,  0.6577,  0.1494,  0.4132
 0.1737,  0.5000,  0.7166,  0.5102,  0.3961,  0.2611
 0.7290, -0.3546,  0.3416, -0.0983, -0.2358,  0.1332
-0.3652,  0.2438, -0.1395,  0.9476,  0.3556,  0.4170
-0.6029, -0.1466, -0.3133,  0.5953,  0.7600,  0.4334
-0.4596, -0.4953,  0.7098,  0.0554,  0.6043,  0.2775
 0.1450,  0.4663,  0.0380,  0.5418,  0.1377,  0.2931
-0.8636, -0.2442, -0.8407,  0.9656, -0.6368,  0.7429
 0.6237,  0.7499,  0.3768,  0.1390, -0.6781,  0.2185
-0.5499,  0.1850, -0.3755,  0.8326,  0.8193,  0.4399
-0.4858, -0.7782, -0.6141, -0.0008,  0.4572,  0.4197
 0.7033, -0.1683,  0.2334, -0.5327, -0.7961,  0.1776
 0.0317, -0.0457, -0.6947,  0.2436,  0.0880,  0.3345
 0.5031, -0.5559,  0.0387,  0.5706, -0.9553,  0.3107
-0.3513,  0.7458,  0.6894,  0.0769,  0.7332,  0.3170
 0.2205,  0.5992, -0.9309,  0.5405,  0.4635,  0.3532
-0.4806, -0.4859,  0.2646, -0.3094,  0.5932,  0.3202
 0.9809, -0.3995, -0.7140,  0.8026,  0.0831,  0.1600
 0.9495,  0.2732,  0.9878,  0.0921,  0.0529,  0.1289
-0.9476, -0.6792,  0.4913, -0.9392, -0.2669,  0.5966
 0.7247,  0.3854,  0.3819, -0.6227, -0.1162,  0.1550
-0.5922, -0.5045, -0.4757,  0.5003, -0.0860,  0.5863
-0.8861,  0.0170, -0.5761,  0.5972, -0.4053,  0.7301
 0.6877, -0.2380,  0.4997,  0.0223,  0.0819,  0.1404
 0.9189,  0.6079, -0.9354,  0.4188, -0.0700,  0.1907
-0.1428, -0.7820,  0.2676,  0.6059,  0.3936,  0.2790
 0.5324, -0.3151,  0.6917, -0.1425,  0.6480,  0.1071
-0.8432, -0.9633, -0.8666, -0.0828, -0.7733,  0.7784
-0.9444,  0.5097, -0.2103,  0.4939, -0.0952,  0.6787
-0.0520,  0.6063, -0.1952,  0.8094, -0.9259,  0.4836
 0.5477, -0.7487,  0.2370, -0.9793,  0.0773,  0.1241
 0.2450,  0.8116,  0.9799,  0.4222,  0.4636,  0.2355
 0.8186, -0.1983, -0.5003, -0.6531, -0.7611,  0.1511
-0.4714,  0.6382, -0.3788,  0.9648, -0.4667,  0.5950
 0.0673, -0.3711,  0.8215, -0.2669, -0.1328,  0.2677
-0.9381,  0.4338,  0.7820, -0.9454,  0.0441,  0.5518
-0.3480,  0.7190,  0.1170,  0.3805, -0.0943,  0.4724
-0.9813,  0.1535, -0.3771,  0.0345,  0.8328,  0.5438
-0.1471, -0.5052, -0.2574,  0.8637,  0.8737,  0.3042
-0.5454, -0.3712, -0.6505,  0.2142, -0.1728,  0.5783
 0.6327, -0.6297,  0.4038, -0.5193,  0.1484,  0.1153
-0.5424,  0.3282, -0.0055,  0.0380, -0.6506,  0.6613
 0.1414,  0.9935,  0.6337,  0.1887,  0.9520,  0.2540
-0.9351, -0.8128, -0.8693, -0.0965, -0.2491,  0.7353
 0.9507, -0.6640,  0.9456,  0.5349,  0.6485,  0.1059
-0.0462, -0.9737, -0.2940, -0.0159,  0.4602,  0.2606
-0.0627, -0.0852, -0.7247, -0.9782,  0.5166,  0.2977
 0.0478,  0.5098, -0.0723, -0.7504, -0.3750,  0.3335
 0.0090,  0.3477,  0.5403, -0.7393, -0.9542,  0.4415
-0.9748,  0.3449,  0.3736, -0.1015,  0.8296,  0.4358
 0.2887, -0.9895, -0.0311,  0.7186,  0.6608,  0.2057
 0.1570, -0.4518,  0.1211,  0.3435, -0.2951,  0.3244
 0.7117, -0.6099,  0.4946, -0.4208,  0.5476,  0.1096
-0.2929, -0.5726,  0.5346, -0.3827,  0.4665,  0.2465
 0.4889, -0.5572, -0.5718, -0.6021, -0.7150,  0.2163
-0.7782,  0.3491,  0.5996, -0.8389, -0.5366,  0.6516
-0.5847,  0.8347,  0.4226,  0.1078, -0.3910,  0.6134
 0.8469,  0.4121, -0.0439, -0.7476,  0.9521,  0.1571
-0.6803, -0.5948, -0.1376, -0.1916, -0.7065,  0.7156
 0.2878,  0.5086, -0.5785,  0.2019,  0.4979,  0.2980
 0.2764,  0.1943, -0.4090,  0.4632,  0.8906,  0.2960
-0.8877,  0.6705, -0.6155, -0.2098, -0.3998,  0.7107
-0.8398,  0.8093, -0.2597,  0.0614, -0.0118,  0.6502
-0.8476,  0.0158, -0.4769, -0.2859, -0.7839,  0.7715
 0.5751, -0.7868,  0.9714, -0.6457,  0.1448,  0.1175
 0.4802, -0.7001,  0.1022, -0.5668,  0.5184,  0.1090
 0.4458, -0.6469,  0.7239, -0.9604,  0.7205,  0.0779
 0.5175,  0.4339,  0.9747, -0.4438, -0.9924,  0.2879
 0.8678,  0.7158,  0.4577,  0.0334,  0.4139,  0.1678
 0.5406,  0.5012,  0.2264, -0.1963,  0.3946,  0.2088
-0.9938,  0.5498,  0.7928, -0.5214, -0.7585,  0.7687
 0.7661,  0.0863, -0.4266, -0.7233, -0.4197,  0.1466
 0.2277, -0.3517, -0.0853, -0.1118,  0.6563,  0.1767
 0.3499, -0.5570, -0.0655, -0.3705,  0.2537,  0.1632
 0.7547, -0.1046,  0.5689, -0.0861,  0.3125,  0.1257
 0.8186,  0.2110,  0.5335,  0.0094, -0.0039,  0.1391
 0.6858, -0.8644,  0.1465,  0.8855,  0.0357,  0.1845
-0.4967,  0.4015,  0.0805,  0.8977,  0.2487,  0.4663
 0.6760, -0.9841,  0.9787, -0.8446, -0.3557,  0.1509
-0.1203, -0.4885,  0.6054, -0.0443, -0.7313,  0.4854
 0.8557,  0.7919, -0.0169,  0.7134, -0.1628,  0.2002
 0.0115, -0.6209,  0.9300, -0.4116, -0.7931,  0.4052
-0.7114, -0.9718,  0.4319,  0.1290,  0.5892,  0.3661
 0.3915,  0.5557, -0.1870,  0.2955, -0.6404,  0.2954
-0.3564, -0.6548, -0.1827, -0.5172, -0.1862,  0.4622
 0.2392, -0.4959,  0.5857, -0.1341, -0.2850,  0.2470
-0.3394,  0.3947, -0.4627,  0.6166, -0.4094,  0.5325
 0.7107,  0.7768, -0.6312,  0.1707,  0.7964,  0.2757
-0.1078,  0.8437, -0.4420,  0.2177,  0.3649,  0.4028
-0.3139,  0.5595, -0.6505, -0.3161, -0.7108,  0.5546
 0.4335,  0.3986,  0.3770, -0.4932,  0.3847,  0.1810
-0.2562, -0.2894, -0.8847,  0.2633,  0.4146,  0.4036
 0.2272,  0.2966, -0.6601, -0.7011,  0.0284,  0.2778
-0.0743, -0.1421, -0.0054, -0.6770, -0.3151,  0.3597
-0.4762,  0.6891,  0.6007, -0.1467,  0.2140,  0.4266
-0.4061,  0.7193,  0.3432,  0.2669, -0.7505,  0.6147
-0.0588,  0.9731,  0.8966,  0.2902, -0.6966,  0.4955
-0.0627, -0.1439,  0.1985,  0.6999,  0.5022,  0.3077
 0.1587,  0.8494, -0.8705,  0.9827, -0.8940,  0.4263
-0.7850,  0.2473, -0.9040, -0.4308, -0.8779,  0.7199
 0.4070,  0.3369, -0.2428, -0.6236,  0.4940,  0.2215
-0.0242,  0.0513, -0.9430,  0.2885, -0.2987,  0.3947
-0.5416, -0.1322, -0.2351, -0.0604,  0.9590,  0.3683
 0.1055,  0.7783, -0.2901, -0.5090,  0.8220,  0.2984
-0.9129,  0.9015,  0.1128, -0.2473,  0.9901,  0.4776
-0.9378,  0.1424, -0.6391,  0.2619,  0.9618,  0.5368
 0.7498, -0.0963,  0.4169,  0.5549, -0.0103,  0.1614
-0.2612, -0.7156,  0.4538, -0.0460, -0.1022,  0.3717
 0.7720,  0.0552, -0.1818, -0.4622, -0.8560,  0.1685
-0.4177,  0.0070,  0.9319, -0.7812,  0.3461,  0.3052
-0.0001,  0.5542, -0.7128, -0.8336, -0.2016,  0.3803
 0.5356, -0.4194, -0.5662, -0.9666, -0.2027,  0.1776
-0.2378,  0.3187, -0.8582, -0.6948, -0.9668,  0.5474
-0.1947, -0.3579,  0.1158,  0.9869,  0.6690,  0.2992
 0.3992,  0.8365, -0.9205, -0.8593, -0.0520,  0.3154
-0.0209,  0.0793,  0.7905, -0.1067,  0.7541,  0.1864
-0.4928, -0.4524, -0.3433,  0.0951, -0.5597,  0.6261
-0.8118,  0.7404, -0.5263, -0.2280,  0.1431,  0.6349
 0.0516, -0.8480,  0.7483,  0.9023,  0.6250,  0.1959
-0.3212,  0.1093,  0.9488, -0.3766,  0.3376,  0.2735
-0.3481,  0.5490, -0.3484,  0.7797,  0.5034,  0.4379
-0.5785, -0.9170, -0.3563, -0.9258,  0.3877,  0.4121
 0.3407, -0.1391,  0.5356,  0.0720, -0.9203,  0.3458
-0.3287, -0.8954,  0.2102,  0.0241,  0.2349,  0.3247
-0.1353,  0.6954, -0.0919, -0.9692,  0.7461,  0.3338
 0.9036, -0.8982, -0.5299, -0.8733, -0.1567,  0.1187
 0.7277, -0.8368, -0.0538, -0.7489,  0.5458,  0.0830
 0.9049,  0.8878,  0.2279,  0.9470, -0.3103,  0.2194
 0.7957, -0.1308, -0.5284,  0.8817,  0.3684,  0.2172
 0.4647, -0.4931,  0.2010,  0.6292, -0.8918,  0.3371
-0.7390,  0.6849,  0.2367,  0.0626, -0.5034,  0.7039
-0.1567, -0.8711,  0.7940, -0.5932,  0.6525,  0.1710
 0.7635, -0.0265,  0.1969,  0.0545,  0.2496,  0.1445
 0.7675,  0.1354, -0.7698, -0.5460,  0.1920,  0.1728
-0.5211, -0.7372, -0.6763,  0.6897,  0.2044,  0.5217
 0.1913,  0.1980,  0.2314, -0.8816,  0.5006,  0.1998
 0.8964,  0.0694, -0.6149,  0.5059, -0.9854,  0.1825
 0.1767,  0.7104,  0.2093,  0.6452,  0.7590,  0.2832
-0.3580, -0.7541,  0.4426, -0.1193, -0.7465,  0.5657
-0.5996,  0.5766, -0.9758, -0.3933, -0.9572,  0.6800
 0.9950,  0.1641, -0.4132,  0.8579,  0.0142,  0.2003
-0.4717, -0.3894, -0.2567, -0.5111,  0.1691,  0.4266
 0.3917, -0.8561,  0.9422,  0.5061,  0.6123,  0.1212
-0.0366, -0.1087,  0.3449, -0.1025,  0.4086,  0.2475
 0.3633,  0.3943,  0.2372, -0.6980,  0.5216,  0.1925
-0.5325, -0.6466, -0.2178, -0.3589,  0.6310,  0.3568
 0.2271,  0.5200, -0.1447, -0.8011, -0.7699,  0.3128
 0.6415,  0.1993,  0.3777, -0.0178, -0.8237,  0.2181
-0.5298, -0.0768, -0.6028, -0.9490,  0.4588,  0.4356
 0.6870, -0.1431,  0.7294,  0.3141,  0.1621,  0.1632
-0.5985,  0.0591,  0.7889, -0.3900,  0.7419,  0.2945
 0.3661,  0.7984, -0.8486,  0.7572, -0.6183,  0.3449
 0.6995,  0.3342, -0.3113, -0.6972,  0.2707,  0.1712
 0.2565,  0.9126,  0.1798, -0.6043, -0.1413,  0.2893
-0.3265,  0.9839, -0.2395,  0.9854,  0.0376,  0.4770
 0.2690, -0.1722,  0.9818,  0.8599, -0.7015,  0.3954
-0.2102, -0.0768,  0.1219,  0.5607, -0.0256,  0.3949
 0.8216, -0.9555,  0.6422, -0.6231,  0.3715,  0.0801
-0.2896,  0.9484, -0.7545, -0.6249,  0.7789,  0.4370
-0.9985, -0.5448, -0.7092, -0.5931,  0.7926,  0.5402

Test data:

# synthetic_test_40.txt
#
 0.7462,  0.4006, -0.0590,  0.6543, -0.0083,  0.1935
 0.8495, -0.2260, -0.0142, -0.4911,  0.7699,  0.1078
-0.2335, -0.4049,  0.4352, -0.6183, -0.7636,  0.5088
 0.1810, -0.5142,  0.2465,  0.2767, -0.3449,  0.3136
-0.8650,  0.7611, -0.0801,  0.5277, -0.4922,  0.7140
-0.2358, -0.7466, -0.5115, -0.8413, -0.3943,  0.4533
 0.4834,  0.2300,  0.3448, -0.9832,  0.3568,  0.1360
-0.6502, -0.6300,  0.6885,  0.9652,  0.8275,  0.3046
-0.3053,  0.5604,  0.0929,  0.6329, -0.0325,  0.4756
-0.7995,  0.0740, -0.2680,  0.2086,  0.9176,  0.4565
-0.2144, -0.2141,  0.5813,  0.2902, -0.2122,  0.4119
-0.7278, -0.0987, -0.3312, -0.5641,  0.8515,  0.4438
 0.3793,  0.1976,  0.4933,  0.0839,  0.4011,  0.1905
-0.8568,  0.9573, -0.5272,  0.3212, -0.8207,  0.7415
-0.5785,  0.0056, -0.7901, -0.2223,  0.0760,  0.5551
 0.0735, -0.2188,  0.3925,  0.3570,  0.3746,  0.2191
 0.1230, -0.2838,  0.2262,  0.8715,  0.1938,  0.2878
 0.4792, -0.9248,  0.5295,  0.0366, -0.9894,  0.3149
-0.4456,  0.0697,  0.5359, -0.8938,  0.0981,  0.3879
 0.8629, -0.8505, -0.4464,  0.8385,  0.5300,  0.1769
 0.1995,  0.6659,  0.7921,  0.9454,  0.9970,  0.2330
-0.0249, -0.3066, -0.2927, -0.4923,  0.8220,  0.2437
 0.4513, -0.9481, -0.0770, -0.4374, -0.9421,  0.2879
-0.3405,  0.5931, -0.3507, -0.3842,  0.8562,  0.3987
 0.9538,  0.0471,  0.9039,  0.7760,  0.0361,  0.1706
-0.0887,  0.2104,  0.9808,  0.5478, -0.3314,  0.4128
-0.8220, -0.6302,  0.0537, -0.1658,  0.6013,  0.4306
-0.4123, -0.2880,  0.9074, -0.0461, -0.4435,  0.5144
 0.0060,  0.2867, -0.7775,  0.5161,  0.7039,  0.3599
-0.7968, -0.5484,  0.9426, -0.4308,  0.8148,  0.2979
 0.7811,  0.8450, -0.6877,  0.7594,  0.2640,  0.2362
-0.6802, -0.1113, -0.8325, -0.6694, -0.6056,  0.6544
 0.3821,  0.1476,  0.7466, -0.5107,  0.2592,  0.1648
 0.7265,  0.9683, -0.9803, -0.4943, -0.5523,  0.2454
-0.9049, -0.9797, -0.0196, -0.9090, -0.4433,  0.6447
-0.4607,  0.1811, -0.2389,  0.4050, -0.0078,  0.5229
 0.2664, -0.2932, -0.4259, -0.7336,  0.8742,  0.1834
-0.4507,  0.1029, -0.6294, -0.1158, -0.6294,  0.6081
 0.8948, -0.0124,  0.9278,  0.2899, -0.0314,  0.1534
-0.1323, -0.8813, -0.0146, -0.0697,  0.6135,  0.2386