“Decision Tree Regression from Scratch with Pointers Using C#” in Visual Studio Magazine

I wrote an article titled “Decision Tree Regression from Scratch with Pointers Using C#” in the November 2025 issue of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2025/11/20/decision-tree-regression-from-scratch-with-pointers-using-csharp.aspx.

Decision tree regression is a machine learning technique that encapsulates a set of if-then rules in a tree data structure to predict a single numeric value. For example, a decision tree regression model prediction might be, “If employee age is greater than 43.0 and age is less than or equal to 51.5 and years-experience is less than or equal to 20.0 and height is greater than 58.0 then bank account balance is $845.41.”

The article presents decision tree regression, implemented from scratch with C#, using pointers/references for tree nodes (instead of a list), no recursion when building the tree (instead of using recursion), and using mean squared error minimaztion for the node split function (instead of variance reduction maximization).

The output of the demo program is:

Begin decision tree regression

Loading synthetic train (200) and test (40) data
Done

First three train X:
 -0.1660  0.4406 -0.9998 -0.3953 -0.7065
  0.0776 -0.1616  0.3704 -0.5911  0.7562
 -0.9452  0.3409 -0.1654  0.1174 -0.7192

First three train y:
  0.4840
  0.1568
  0.8054

Setting maxDepth = 3
Setting minSamples = 2
Setting minLeaf = 18
Creating and training tree
Done

Tree:
ID 0      0  -0.2102   not_null   not_null   0.0000  False
ID 1      4   0.1431   not_null   not_null   0.0000  False
ID 2      0   0.3915   not_null   not_null   0.0000  False
ID 3      0  -0.6553   not_null   not_null   0.0000  False
ID 4     -1   0.0000   null       null       0.4123   True
ID 5      4  -0.2987   not_null   not_null   0.0000  False
ID 6      2   0.3777   not_null   not_null   0.0000  False
ID 7     -1   0.0000   null       null       0.6952   True
ID 8     -1   0.0000   null       null       0.5598   True
ID 11    -1   0.0000   null       null       0.4101   True
ID 12    -1   0.0000   null       null       0.2613   True
ID 13    -1   0.0000   null       null       0.1882   True
ID 14    -1   0.0000   null       null       0.1381   True

Evaluating model
Accuracy train (within 0.10) = 0.3750
Accuracy test (within 0.10) = 0.4750

MSE train = 0.0048
MSE test = 0.0054

Predicting for trainX[0] =
  -0.1660   0.4406  -0.9998  -0.3953  -0.7065
Predicted y = 0.4101

IF
column 0  greater-than   -0.2102 AND
column 0  less-or-equal   0.3915 AND
column 4  less-or-equal  -0.2987 AND
THEN
predicted = 0.4101

End demo

Implementing decision tree regression from scratch requires a bit of effort, but it allows you to easily integrate a prediction model with other systems, and easily modify the system.

The biggest disadvantage of a simple decision tree regression system is that a single tree usually overfits the training data. Because of the overfitting problem, multiple tress are usually combined into an ensemble model such as bagging tree regression, random forest regression, adaptive boosting regression, or gradient boosting regression.

The demo implementation uses pointers/references to connect the nodes in the decision tree. But because references are memory addresses, they vanish after the program finishes, and so you can’t directly save a trained model. In situations where a tree must be saved, you can serialize the tree nodes into a list data structure where the memory pointers/references are replaced by integer index values into the list.



Machine learning decision trees are not plants and they do not have any inherent goodness or badness. But they’re fun to experiment with.

“Terra” is a German series of over 1,500 science fiction novels published that were published one-per-week between 1957 and 1986. Some of the covers of Terra novels featured alien trees and plants where it’s pretty clear that you shouldn’t mess around with them.


This entry was posted in Machine Learning. Bookmark the permalink.

Leave a Reply