“Kernel Ridge Regression with Stochastic Gradient Descent Training Using JavaScript” in Visual Studio Magazine

I wrote an article titled “Kernel Ridge Regression with Stochastic Gradient Descent Training Using JavaScript” in the September 2025 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2025/09/02/kernel-ridge-regression-with-stochastic-gradient-descent-training-using-javascript.aspx.

The goal of a machine learning regression problem is to predict a single numeric value. For example, you might want to predict a person’s bank savings account balance based on their age, years of work experience, and so on.

There are approximately a dozen common regression techniques. These include linear regression (several variations), k-nearest neighbors regression, decision tree regression (several types, such as adaptive boosting and random forest), and neural network regression. Each technique has pros and cons. A technique called kernel ridge regression often produces accurate predictions for complex data. Note: “kernel ridge regression” is very different from the similarly named “ridge regression.”

Kernel ridge regression (KRR) uses a kernel function that computes a measure of similarity between two data items, and a ridge regularization technique to limit model overfitting. Model overfitting occurs when a model predicts well on the training data, but predicts poorly on new, previously unseen data. Ridge regularization is also known as L2 regularization.
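The two ideas above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the article's demo code: the names rbfKernel and predict, and the assumption of a trained weight array wts (one weight per training item), are mine.

```javascript
// RBF kernel (gamma form): k(v1, v2) = exp(-gamma * ||v1 - v2||^2)
function rbfKernel(v1, v2, gamma) {
  let sum = 0.0;
  for (let i = 0; i < v1.length; ++i) {
    const d = v1[i] - v2[i];
    sum += d * d;  // squared Euclidean distance
  }
  return Math.exp(-gamma * sum);
}

// KRR prediction: a weighted sum of the similarities between the
// input x and every training item. wts is the trained weight array.
function predict(x, trainX, wts, gamma) {
  let result = 0.0;
  for (let i = 0; i < trainX.length; ++i)
    result += wts[i] * rbfKernel(x, trainX[i], gamma);
  return result;
}
```

Notice that a KRR model stores one weight per training item, so the entire training set must be retained to make a prediction.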

My article presents a demo of kernel ridge regression, implemented from scratch, using the JavaScript language. The output of the demo is:

C:\JavaScript\KernelRidgeRegressionSGD> node krr_sgd.js

Begin Kernel Ridge Regression with SGD training

Loading train (200) and test (40) from file

First three train X:
 -0.1660   0.4406  -0.9998  -0.3953  -0.7065
  0.0776  -0.1616   0.3704  -0.5911   0.7562
 -0.9452   0.3409  -0.1654   0.1174  -0.7192

First three train y:
   0.4840
   0.1568
   0.8054

Setting RBF gamma = 0.3
Setting alpha decay = 1.0e-5

Setting SGD lrnRate = 0.050
Setting SGD maxEpochs = 2000

Creating and training KRR model using SGD
epoch =      0  MSE = 0.0256  acc = 0.0950
epoch =    400  MSE = 0.0001  acc = 0.9850
epoch =    800  MSE = 0.0001  acc = 0.9850
epoch =   1200  MSE = 0.0000  acc = 0.9950
epoch =   1600  MSE = 0.0000  acc = 0.9900
Done

Model weights:
 -1.3382  -0.6968  -0.1734  -0.6945  . . .
  . . .   -0.7922  -0.1941   0.0498  1.1272

Computing model accuracy

Train acc (within 0.10) = 0.9950
Test acc (within 0.10) = 0.9750

Train MSE = 0.0000
Test MSE = 0.0002

Predicting for x =
  -0.1660    0.4406   -0.9998   -0.3953   -0.7065
Predicted y = 0.4948

End demo

C:\JavaScript\KernelRidgeRegressionSGD>

The end of the article gives a summary:

* Kernel ridge regression (KRR) is a machine learning
  technique to predict a numeric value.
* Kernel ridge regression requires a kernel function
  that measures the similarity of two vectors.
* The most common kernel function is the radial basis
  function (RBF).
* There are two forms of the RBF function: the gamma
  form and the sigma form.
* There are two ways to train a KRR model, kernel matrix
  inverse and stochastic gradient descent (SGD).
* Both training techniques require an alpha constant for
  ridge (aka L2) regularization to discourage model
  overfitting.
* The matrix inverse training technique often works well
  for small and medium size datasets, but it is complex
  and can fail.
* The SGD training technique can be used with a dataset
  of any size, but it requires a learning rate and a
  maximum number of epochs, both of which must be
  determined by trial and error.
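The SGD training approach summarized above can be sketched as follows. This is a minimal sketch, not the article's implementation: it assumes a precomputed kernel matrix K where K[i][j] is the similarity between training items i and j, and the names trainKRR and shuffle are mine.

```javascript
// SGD training for KRR weights with ridge (L2) regularization.
// K is the precomputed n-by-n kernel matrix over the training data,
// y is the array of n target values.
function trainKRR(K, y, lrnRate, maxEpochs, alpha) {
  const n = y.length;
  const wts = new Array(n).fill(0.0);  // one weight per training item
  const indices = Array.from({ length: n }, (_, i) => i);

  for (let epoch = 0; epoch < maxEpochs; ++epoch) {
    shuffle(indices);  // visit training items in random order
    for (const i of indices) {
      // model output for training item i
      let output = 0.0;
      for (let j = 0; j < n; ++j)
        output += wts[j] * K[i][j];
      const err = output - y[i];
      // gradient of squared error, plus the ridge penalty term
      for (let j = 0; j < n; ++j)
        wts[j] -= lrnRate * (err * K[i][j] + alpha * wts[j]);
    }
  }
  return wts;
}

// Fisher-Yates shuffle, in place
function shuffle(arr) {
  for (let i = arr.length - 1; i > 0; --i) {
    const j = Math.floor(Math.random() * (i + 1));
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
}
```

The alpha term nudges each weight toward zero on every update, which is what discourages overfitting; setting alpha too large underfits, so it too must be tuned by trial and error.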

Every machine learning regression technique has pros and cons. But when kernel ridge regression works, it is often highly effective.



The development of kernel ridge regression is an important part of the development and evolution of machine learning. I like to think about all types of historical evolution, even art.

When I was a young teen, I loved the Tom Swift Jr. adventures. Here are three of my favorite titles that show how the style of the cover art evolved.

Left: “Tom Swift and His Diving Seacopter” (#7, 1956), art by Graham Kaye. My absolute favorite book in the series.

Center: “Tom Swift and the Electronic Hydrolung” (#18, 1961), art by Charles Brey. One of the very best covers.

Right: “Tom Swift and His Dyna-4 Capsule” (#31, 1969), art by Ray Johnson. The art begins to look less juvenile and more like adult science fiction novel covers.


This entry was posted in JavaScript, Machine Learning.