“Gradient Boosting Regression Using C#” in Visual Studio Magazine

I wrote an article titled “Gradient Boosting Regression Using C#” in the January 2025 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/Articles/2025/01/15/Gradient-Boosting-Regression-Using-CSharp.aspx.

A machine learning gradient boosting regression system, also called a gradient boosting machine (GBM), predicts a single numeric value. A GBM is an ensemble (collection) of simple decision tree regressors that are constructed sequentially, where each tree predicts the differences (residuals) between predicted y values and actual y values. To make a prediction for an input vector x, the system starts from an initial estimate and then refines it by accumulating a fraction of the predicted residual from each tree in the ensemble. The running predicted y value slowly gets closer and closer to the true target y value.
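The sequential-training idea can be sketched in a few dozen lines of C#. This is not the article's implementation: the training data is made up, the learning rate (0.05) and number of rounds (200) are assumed values, and each "tree" is mocked as a one-split stump on a single feature so the sketch stays self-contained. A real GBM searches for the best split and grows deeper trees.

```csharp
using System;
using System.Collections.Generic;

class GbmTrainSketch
{
    // A "stump": one split on a single feature, mean residual per side.
    // A real decision tree regressor searches for the best split; this
    // sketch just splits at the midpoint of the feature's range.
    static (double thresh, double left, double right) FitStump(
        double[] x, double[] res)
    {
        double min = double.MaxValue, max = double.MinValue;
        foreach (double v in x) { min = Math.Min(min, v); max = Math.Max(max, v); }
        double thresh = (min + max) / 2.0;

        double sumL = 0, sumR = 0; int nL = 0, nR = 0;
        for (int i = 0; i < x.Length; ++i)
            if (x[i] < thresh) { sumL += res[i]; ++nL; }
            else { sumR += res[i]; ++nR; }
        return (thresh, nL > 0 ? sumL / nL : 0.0, nR > 0 ? sumR / nR : 0.0);
    }

    static void Main()
    {
        // Made-up one-feature training data
        double[] trainX = { 0.1, 0.3, 0.6, 0.9 };
        double[] trainY = { 0.30, 0.35, 0.65, 0.70 };
        int n = trainY.Length;
        double lr = 0.05;     // learning rate (assumed)
        int numTrees = 200;   // number of boosting rounds (assumed)

        // Initial prediction: the mean of the training targets
        double initial = 0.0;
        foreach (double y in trainY) initial += y;
        initial /= n;

        double[] preds = new double[n];
        for (int i = 0; i < n; ++i) preds[i] = initial;

        var stumps = new List<(double thresh, double left, double right)>();
        for (int t = 0; t < numTrees; ++t)
        {
            // Residuals are predicted y minus actual y
            double[] res = new double[n];
            for (int i = 0; i < n; ++i) res[i] = preds[i] - trainY[i];

            var stump = FitStump(trainX, res);  // fit next "tree" to residuals
            stumps.Add(stump);

            // Nudge each prediction toward its target by a fraction
            // of the predicted residual
            for (int i = 0; i < n; ++i)
            {
                double predRes = trainX[i] < stump.thresh
                    ? stump.left : stump.right;
                preds[i] -= lr * predRes;
            }
        }

        for (int i = 0; i < n; ++i)
            Console.WriteLine($"y = {trainY[i]:F2}  pred = {preds[i]:F4}");
    }
}
```

After 200 rounds the per-group residual shrinks by roughly (1 - 0.05)^200, so predictions converge toward the mean target on each side of the split, which is the essence of the boosting mechanism.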

My article presents a complete demo of gradient boosting regression using the C# language. Although there are several code libraries that contain implementations of gradient boosting regression, such as XGBoost, LightGBM, and CatBoost, implementing a system from scratch allows you to easily modify the system, integrate it with other systems implemented with .NET, and generate reasonably interpretable results.

Gradient boosting regression is quite complicated. The demo program shows a bit of what goes on behind the scenes:

Predicting for x =
  -0.1660   0.4406  -0.9998  -0.3953  -0.7065
Initial prediction: 0.3493
t =   0  pred_res =  0.0462  delta = -0.0023  pred =  0.3470
t =   1  pred_res =  0.0427  delta = -0.0021  pred =  0.3449
t =   2  pred_res =  0.0021  delta = -0.0001  pred =  0.3448
t =   3  pred_res =  0.0390  delta = -0.0020  pred =  0.3428
t =   4  pred_res = -0.0124  delta =  0.0006  pred =  0.3434
. . .
t = 198  pred_res =  0.0018  delta = -0.0001  pred =  0.4746
t = 199  pred_res =  0.0000  delta = -0.0000  pred =  0.4746
Predicted y = 0.4746
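The prediction phase of the trace above can be mimicked with a short loop. The learning rate of 0.05 and the relation delta = -(learning rate) * pred_res are inferences from the trace values, not stated in this post; the per-tree residuals below are hard-coded from the first five lines of the demo output rather than produced by actual trees.

```csharp
using System;

class GbmPredictSketch
{
    static void Main()
    {
        double pred = 0.3493;  // initial prediction, from the demo trace
        double lr = 0.05;      // learning rate (inferred from the trace)

        // Residuals the first five trees predict for this x (from the demo)
        double[] predRes = { 0.0462, 0.0427, 0.0021, 0.0390, -0.0124 };

        for (int t = 0; t < predRes.Length; ++t)
        {
            // Residuals are predicted minus actual, so subtract a
            // fraction of each predicted residual
            double delta = -lr * predRes[t];
            pred += delta;
            Console.WriteLine($"t = {t,3}  pred_res = {predRes[t],7:F4}" +
                $"  delta = {delta,7:F4}  pred = {pred:F4}");
        }
    }
}
```

With all 200 trees applied, the accumulated prediction would settle at the final predicted y value; the point of the sketch is that prediction is nothing more than this running accumulation.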

There are many different variations of gradient boosting regression. The version presented in my article is the simplest possible version and doesn't have a specific name. An open-source version called XGBoost ("extreme gradient boosting") is implemented in C++ but has interfaces to most programming languages and to the scikit-learn machine learning library. XGBoost is extremely complicated and has over 50 parameters, which makes it difficult to tune, and its predictions are not easily interpretable.

Another open-source version of gradient boosting regression is called LightGBM. LightGBM is loosely based on XGBoost. In spite of the name, LightGBM is no less complex than XGBoost.

A third open-source version is called CatBoost. Unlike XGBoost, LightGBM, and the from-scratch version of gradient boosting regression presented in my article, CatBoost has built-in support for categorical predictor variables.



Gradient boosting regression converges in small steps towards a predicted value. Most of my favorite science fiction movies feature large, menacing creatures, but a few feature small creatures.

Left: In “Fiend Without a Face” (1958), a scientist on a U.S. air base in Canada experiments with the materialization of thought waves. At first, the thought waves take the form of evil invisible killer brains, and then later take the form of visible brains with attached spinal columns and eyestalks. They are not nice brains.

Center: In “The Tingler” (1959), a scientist discovers that all humans have a small parasite called a “tingler”, which feeds on fear. It grows bigger and stronger when its host is afraid, but weakens if the host screams.

Right: The best known small sci-fi creature is certainly the alien facehugger from “Alien” (1979). The facehugger hatches from a pod, then attaches itself to, and lays an egg in, a host. Things do not end well for the host when the egg hatches.


This entry was posted in Machine Learning.