"AdaBoost.R2 Regression Using C#" in Visual Studio Magazine

I wrote an article titled “AdaBoost.R2 Regression Using C#” in the June 2026 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2026/06/01/adaboost-r2-regression-using-csharp.aspx.

The goal of a machine learning regression problem is to predict a single numeric value. For example, a bank might want to predict the maximum safe loan amount for a customer based on age, account balance, current debt, and so on. Regression techniques fall into two categories: those based on decision tree structures, and those that use non-tree techniques.

In a nutshell:

* AdaBoost.R2 regression is a machine learning technique used to predict a single numeric value.
* AdaBoost.R2 builds a sequence of decision tree regressors where each accepted tree improves prediction compared to earlier trees.
* Final AdaBoost.R2 predictions are produced as a weighted median across the decision tree predictions.

Examples of non-tree techniques include linear regression, quadratic regression, kernel ridge regression, nearest neighbors regression, and neural network regression. The three most common tree-based regression techniques are random forest regression (which includes a variant called bagging tree regression), AdaBoost.R2 regression, and gradient boost regression. Each regression technique has pros and cons. If there were one regression technique that worked best for all types of datasets, there would be only one technique, not many.

My article presents a complete demo of AdaBoost.R2 implemented using the C# language. AdaBoost.R2 regression sequentially creates an ensemble (collection) of simple decision trees, where each tree is a bit better at prediction than the previous tree. For a given input x, the predicted y value is the weighted median of the predictions of the collection of decision trees.
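The weighted median idea can be sketched in a few lines. This is a minimal Python sketch, not the C# code from the article. The function name weighted_median and the sample numbers are hypothetical, and the learner weight of log(1 / beta) is an assumption taken from the Drucker paper rather than from the demo program.

```python
import math

def weighted_median(preds, betas):
    # Assumed learner weight, following Drucker (1997): log(1 / beta).
    # Each beta must lie strictly between 0 and 1 for the weight to be positive.
    weights = [math.log(1.0 / b) for b in betas]

    # Visit the predictions from smallest to largest.
    order = sorted(range(len(preds)), key=lambda i: preds[i])

    # Return the first prediction at which the running weight
    # reaches half of the total weight.
    half = 0.5 * sum(weights)
    running = 0.0
    for i in order:
        running += weights[i]
        if running >= half:
            return preds[i]
    return preds[order[-1]]  # fallback for floating point round-off

# Three hypothetical tree predictions for one input x, with their beta values.
# The tree with beta = 0.05 is trusted most, so its prediction wins even
# though the ordinary (unweighted) median of the predictions is 0.50.
print(weighted_median([0.90, 0.40, 0.50], [0.05, 0.50, 0.80]))  # prints 0.9
```

A weighted median is less sensitive to a single wildly wrong tree than a weighted average would be, which is one likely reason the technique uses it.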

The output of the article demo program is:

Begin AdaBoost.R2 (tree) regression from scratch C#

Loading synthetic train (200) and test (40) data

First three train X:
 -0.1660  0.4406 -0.9998 -0.3953 -0.7065
  0.0776 -0.1616  0.3704 -0.5911  0.7562
 -0.9452  0.3409 -0.1654  0.1174 -0.7192

First three train y:
  0.4840
  0.1568
  0.8054

Setting maxLearners = 100
Setting tree maxDepth = 5
Setting tree minSamples = 2
Setting tree minLeaf = 1

Training AdaBoost.R2 model
Done
Created 100 learners

Accuracy train (within 0.10): 0.8250
Accuracy test (within 0.10): 0.5250

MSE train: 0.0004
MSE test: 0.0022

Predicting for x =
 -0.1660  0.4406 -0.9998 -0.3953 -0.7065
Predicted y = 0.4890

End demo

I based my demo program directly on the source research paper, “Improving Regressors Using Boosting Techniques” (1997), by H. Drucker.

A weight value is assigned to each training data item, where the weight values sum to 1. Initially all training item weights are the same. For the demo data, because there are 200 items, each initial data item weight is 1/200 = 0.005.

Each training data item is fed to the current decision tree and an average loss/error for the current tree is computed. The average loss for the current tree is used to compute a beta value for the tree. A small value for beta means high confidence in the current tree, and vice versa.
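The loss and beta computation can be sketched as follows. This is a minimal Python sketch, not the C# code from the article. It assumes the linear loss option from the Drucker paper (absolute error divided by the largest absolute error) and the paper's formula beta = avgLoss / (1 - avgLoss). The function name tree_beta and the sample numbers are hypothetical.

```python
def tree_beta(preds, targets, weights):
    # Absolute error of the current tree on each training item.
    abs_errs = [abs(p - t) for p, t in zip(preds, targets)]
    max_err = max(abs_errs)

    # Assumed linear loss: scale each error into [0, 1] by the largest error.
    if max_err == 0.0:
        losses = [0.0] * len(abs_errs)
    else:
        losses = [e / max_err for e in abs_errs]

    # Weighted average loss; the item weights are assumed to sum to 1.
    avg_loss = sum(w * l for w, l in zip(weights, losses))

    # Small beta means high confidence in the tree.
    # Guard the degenerate case where every item has loss 1.
    if avg_loss >= 1.0:
        beta = float("inf")
    else:
        beta = avg_loss / (1.0 - avg_loss)
    return losses, avg_loss, beta

# Four hypothetical items with equal weights of 1/4.
preds = [0.5, 0.2, 0.9, 0.4]
targets = [0.4, 0.2, 0.5, 0.4]
weights = [0.25, 0.25, 0.25, 0.25]
losses, avg_loss, beta = tree_beta(preds, targets, weights)
print(f"{avg_loss:.4f}")  # prints 0.3125
print(f"{beta:.4f}")      # prints 0.4545
```

In the Drucker paper, an average loss of 0.5 or more (beta of 1 or more) signals that the tree is too weak to be useful.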

The weights associated with each training data item are updated using the beta value for the current tree and the loss/error for each data item. At this point, a new set of training data items is probabilistically created from the source data items, where data items with high error have a greater probability of being selected than data items with small error.
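The weight update and resampling step can be sketched as follows. This is a minimal Python sketch, not the C# code from the article. It assumes the update rule from the Drucker paper, where each weight is multiplied by beta raised to the power (1 - loss) and the weights are then renormalized to sum to 1. The function names update_weights and resample_indices, and the sample numbers, are hypothetical.

```python
import random

def update_weights(weights, losses, beta):
    # Assumed Drucker update: w_i * beta^(1 - loss_i).
    # With beta < 1, a low-loss item is shrunk a lot and a high-loss
    # item is shrunk very little, so hard items gain relative weight.
    raw = [w * beta ** (1.0 - l) for w, l in zip(weights, losses)]
    total = sum(raw)
    return [r / total for r in raw]  # renormalize so the weights sum to 1

def resample_indices(weights, rng):
    # Draw n item indices with replacement, with probability
    # proportional to each item's weight.
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)

# Four hypothetical items; item [2] has the largest loss.
weights = [0.25, 0.25, 0.25, 0.25]
losses = [0.25, 0.0, 1.0, 0.0]
beta = 0.4545

new_weights = update_weights(weights, losses, beta)
print([round(w, 4) for w in new_weights])

rng = random.Random(0)  # seeded only to make the sketch reproducible
print(resample_indices(new_weights, rng))  # indices of the next training set
```

Because selection is with replacement, a high-error item can appear several times in the next training set, and some low-error items may not appear at all.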

The new training data items are fed to the next decision tree. Because the new training dataset over-represents high-error items, the new tree will concentrate on learning to predict those difficult-to-predict items.

Each new training dataset is constructed probabilistically, and so there is a chance that a newly created decision tree is not better than the previous decision tree. In this case, the newly created decision tree is not added to the collection of trees. This is why the actual number of trees created may be less than the maxLearners parameter.

AdaBoost.R2 regression is not used nearly as often as random forest regression or gradient boost regression. My AdaBoost.R2 demo is therefore most likely to be used with legacy systems that already use AdaBoost.R2 regression.

Like most of my friends, I get obsessed with all kinds of strange things. I have been obsessed with machine learning for many years.

For reasons that are unknown to me, I am moderately obsessed with movies that feature a creature that makes noises that are translated to “chittering” in the closed caption text. There are an amazing number of chittering instances in movies, but you’ll usually only notice them if you’re specifically looking for them.

Left: In the fantasy movie “The Dark Crystal” (1982), Gelfling Jen and his girlfriend Kira must overthrow the evil, ruling Skeksis by restoring a powerful broken Crystal. Early in the movie, Jen travels through a creepy swamp with a chittering creature.

Right: In the fantasy movie “The Legend of Ochi” (2025), farm girl Yuri discovers a wounded baby primate-like creature and works to return it to its family. She is menaced by a hunting party led by her father and her stepbrother. When Yuri goes through a cave system, some insects are chittering while a bat watches.