“Decision Tree Regression from Scratch Using JavaScript” in Visual Studio Magazine

I wrote an article titled “Decision Tree Regression from Scratch Using JavaScript” in the April 2026 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2026/04/01/decision-tree-regression-from-scratch-using-javascript.aspx.

Decision tree regression is a machine learning technique that incorporates a set of if-then rules in a tree data structure to predict a single numeric value. For example, a decision tree regression model prediction might be, “If employee age is greater than 28.0 and age is less than or equal to 32.5 and years-experience is less than or equal to 5.0 and height is greater than 66.0 then bank account balance is $833.127.”
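Such a rule can be expressed directly as nested if-then logic. The sketch below is purely illustrative (the feature names and predicted value come from the example sentence above, not from the article's actual code):

```javascript
// Hypothetical example: one decision-tree rule written as plain
// if-then logic. The conditions mirror the example rule above.
function predictBalance(age, yearsExperience, height) {
  if (age > 28.0 && age <= 32.5 &&
      yearsExperience <= 5.0 &&
      height > 66.0) {
    return 833.127; // predicted bank account balance
  }
  return NaN; // other branches of the tree would handle other inputs
}
```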

There are many ways to implement a decision tree regression system. Four significant design decisions are:

1. use pointers/references for tree nodes, or use list storage;
2. use recursion to construct the tree, use a non-recursive stack, or use list iteration;
3. use weighted variance minimization for the node split function, or use variance reduction maximization;
4. explicitly store the row indices associated with each node, or discard that information after use during training.

There are pros and cons to each option.
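To make the split-function choice concrete, here is a minimal sketch of weighted variance minimization. The function names are my own, not the article's: a candidate split is scored by the variance of the target values in each child group, weighted by group size, and lower is better.

```javascript
// Population variance of an array of target values.
function variance(ys) {
  const mean = ys.reduce((a, b) => a + b, 0) / ys.length;
  return ys.reduce((a, b) => a + (b - mean) * (b - mean), 0) / ys.length;
}

// Weighted variance of a candidate split: each child group's
// variance is weighted by its share of the rows. Lower = better split.
function weightedVariance(leftYs, rightYs) {
  const n = leftYs.length + rightYs.length;
  return (leftYs.length / n) * variance(leftYs) +
         (rightYs.length / n) * variance(rightYs);
}
```

A split that cleanly separates [1, 1, 5, 5] into [1, 1] and [5, 5] scores 0, the best possible value.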

The article presents decision tree regression, implemented from scratch in JavaScript, using list storage for tree nodes (no pointers/references), list iteration to build the tree (no recursion or explicit stack), weighted variance minimization for the node split function, and storage of the associated row indices in each node. This design offers maximum flexibility.
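A minimal sketch of the list-storage idea, assuming a node layout similar to what the demo output below suggests (field names are my guesses, not the article's): nodes live in a flat array, left/right children are array indices, and col = -1 marks a leaf. Prediction is then a simple index-following loop, with no recursion and no pointers.

```javascript
// Tiny three-node tree in flat-array (list storage) form.
// left/right hold array indices; -1 means "no child".
const nodes = [
  { col: 0, val: -0.21, left: 1, right: 2, y: 0.35, isLeaf: false },
  { col: -1, val: 0.0, left: -1, right: -1, y: 0.64, isLeaf: true },
  { col: -1, val: 0.0, left: -1, right: -1, y: 0.24, isLeaf: true },
];

// Walk the array by index until a leaf is reached:
// x[col] <= val goes left, otherwise right.
function predict(nodes, x) {
  let i = 0; // the root is stored at index 0
  while (!nodes[i].isLeaf) {
    i = (x[nodes[i].col] <= nodes[i].val) ? nodes[i].left : nodes[i].right;
  }
  return nodes[i].y;
}
```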

The output of the demo program is:

JavaScript decision tree regression

Loading synthetic train (200), test (40)
Done

First three train x:
-0.1660   0.4406  -0.9998  -0.3953  -0.7065
 0.0776  -0.1616   0.3704  -0.5911   0.7562
-0.9452   0.3409  -0.1654   0.1174  -0.7192

First three train y:
0.484 0.1568 0.8054

Setting maxDepth = 3
Setting minSamples = 2
Setting minLeaf = 18
Using default numSplitCols = -1

Creating and training tree
Done

ID   0 | col  0 | v -0.2102 | L  1 | R  2 | y 0.3493 | leaf F
ID   1 | col  4 | v  0.1431 | L  3 | R  4 | y 0.5345 | leaf F
ID   2 | col  0 | v  0.3915 | L  5 | R  6 | y 0.2382 | leaf F
ID   3 | col  0 | v  0.6553 | L  7 | R  8 | y 0.6358 | leaf F
ID   4 | col -1 | v  0.0000 | L  9 | R 10 | y 0.4123 | leaf T
ID   5 | col  4 | v -0.2987 | L 11 | R 12 | y 0.3032 | leaf F
ID   6 | col  2 | v  0.3777 | L 13 | R 14 | y 0.1701 | leaf F
ID   7 | col -1 | v  0.0000 | L -1 | R -1 | y 0.6952 | leaf T
ID   8 | col -1 | v  0.0000 | L -1 | R -1 | y 0.5598 | leaf T
ID  11 | col -1 | v  0.0000 | L -1 | R -1 | y 0.4101 | leaf T
ID  12 | col -1 | v  0.0000 | L -1 | R -1 | y 0.2613 | leaf T
ID  13 | col -1 | v  0.0000 | L -1 | R -1 | y 0.1882 | leaf T
ID  14 | col -1 | v  0.0000 | L -1 | R -1 | y 0.1381 | leaf T

Evaluating tree model

Train acc (within 0.10) = 0.3750
Test acc (within 0.10) = 0.4750

Train MSE = 0.0048
Test MSE = 0.0054

Predicting for x =
-0.1660   0.4406  -0.9998  -0.3953  -0.7065
y = 0.4101

IF
column 0  gt   -0.2102 AND
column 0  lte   0.3915 AND
column 4  lte  -0.2987 AND
THEN
predicted = 0.4101

End decision tree regression demo

The following diagram shows a visual representation of the demo tree and how a prediction is made. The accuracy and MSE (mean squared error) values are weak because maxDepth is too small and minLeaf is too large (those settings keep the tree small enough for easy visualization).
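The two evaluation metrics in the demo output can be sketched as follows (helper names are my own): accuracy counts predictions that land within a tolerance of the true value, and MSE averages the squared errors.

```javascript
// Fraction of predictions within tol of the actual value,
// matching the demo's "acc (within 0.10)" metric.
function accuracyWithin(preds, actuals, tol) {
  let hits = 0;
  for (let i = 0; i < preds.length; ++i) {
    if (Math.abs(preds[i] - actuals[i]) <= tol) ++hits;
  }
  return hits / preds.length;
}

// Mean squared error over a set of predictions.
function mse(preds, actuals) {
  let sum = 0;
  for (let i = 0; i < preds.length; ++i) {
    const d = preds[i] - actuals[i];
    sum += d * d;
  }
  return sum / preds.length;
}
```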

The biggest disadvantage of a simple decision tree regression system is that a single tree usually overfits the training data. In overfitting, the trained model predicts well on the training data, but predicts poorly on new, previously unseen data. Because of the overfitting problem, decision trees are rarely used by themselves. Instead, multiple trees are combined into an ensemble model such as bagging tree regression, random forest regression, adaptive boosting regression, or gradient boosting regression. The decision tree design presented in the article is well-suited for use in ensemble models.
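The ensemble idea is mechanically simple for regression: each tree is trained on a different bootstrap sample of the data (training not shown here), and the ensemble prediction is the average of the per-tree predictions. The sketch below is generic; the trees array and treePredict function are illustrative placeholders, not the article's API.

```javascript
// Average the predictions of several trees (bagging-style ensemble).
// treePredict(tree, x) is whatever single-tree predictor you have.
function ensemblePredict(trees, x, treePredict) {
  let sum = 0;
  for (const tree of trees) sum += treePredict(tree, x);
  return sum / trees.length;
}
```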



One of the best things about the Internet is that you can find information about things that happened years ago, of which you have only vague recollections. Common examples are old childhood toys or old movies seen on TV.

Years ago, I saw a TV show about dinosaurs. I only had the vaguest memory of a shark-like dinosaur trapping some other dinosaur on a tiny atoll. I always wondered about that show and it stuck in my memory for over 20 years. I searched once or twice a year without luck. But recently, a Google search finally turned up the source of the memory.

A TV show called “Walking with Beasts” had an episode titled “Whale Killer” (2001). In it, a Basilosaurus (an ancient whale) hunts a Moeritherium (an early elephant relative). Now my brain can rest, at least as far as that memory goes.

