“Linear Regression with Two-Way Interactions Using C#” in Visual Studio Magazine

I wrote an article titled “Linear Regression with Two-Way Interactions Using C#” in the May 2025 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2025/05/02/linear-regression-with-two-way-interactions-using-c.aspx.

The most basic machine learning regression technique (one that predicts a single numeric value) is linear regression, sometimes called multiple linear regression, where the “multiple” indicates two or more predictor variables. The form of a basic linear regression prediction model is y’ = (w0 * x0) + (w1 * x1) + . . . + (wn * xn) + b, where y’ is the predicted value, the xi are predictor values, the wi are weights (also called coefficients), and b is the bias (also called the intercept).
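The prediction equation can be sketched in a few lines. This is a minimal illustration in Python rather than the article's C#, and the input values, weights, and bias are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
def predict(x, weights, bias):
    """Basic linear regression: y' = (w0 * x0) + ... + (wn * xn) + b."""
    if len(x) != len(weights):
        raise ValueError("x and weights must have the same length")
    return sum(w * xi for w, xi in zip(weights, x)) + bias

# Hypothetical values, not taken from the article's demo.
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
b = 0.1

y_pred = predict(x, w, b)  # (0.5 * 1.0) + (-1.0 * 2.0) + (0.25 * 3.0) + 0.1
print(f"{y_pred:.4f}")
```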

The form of a linear regression with two-way interactions model is y’ = (w0 * x0) + . . . + (wn * xn) + (w01 * x0 * x1) + (w02 * x0 * x2) + . . . + b. The interaction terms are the products of all combinations of pairs of the predictor variables.
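One way to think about the interaction terms is as extra input columns: the model stays linear in its weights, but it is applied to an expanded vector that holds the original predictors followed by every pairwise product. The sketch below is in Python rather than the article's C#, and the helper name expand_with_interactions is made up for illustration.

```python
from itertools import combinations

def expand_with_interactions(x):
    """Return x followed by the product x[i] * x[j] for every pair with i < j."""
    pairs = [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    return list(x) + pairs

# Hypothetical input with three predictors.
x = [2.0, 3.0, 5.0]
expanded = expand_with_interactions(x)
print(expanded)  # [2.0, 3.0, 5.0, 6.0, 10.0, 15.0]
```

With n predictors the expanded vector has n + n(n-1)/2 values, so five predictors give 5 + 10 = 15 inputs, plus the bias.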

Compared to basic linear regression, linear regression with interactions can handle more complex data. Compared to more powerful regression techniques, such as neural network regression, linear regression with interactions often has slightly worse prediction accuracy but better model interpretability.

My article presents a demo of linear regression with two-way interactions, implemented from scratch, using the C# language. The demo data is synthetic and looks like:

-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840
 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568
-0.9452,  0.3409, -0.1654,  0.1174, -0.7192,  0.8054
 0.9365, -0.3732,  0.3846,  0.7528,  0.7892,  0.1345
. . .

The first five values on each line are the x predictors. The last value on each line is the target y variable to predict. There are 200 training items and 40 test items.
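Splitting each line into predictors and a target might look like the following. This is a Python sketch rather than the article's C#. The parse_line helper is hypothetical, and the two sample lines are copied from the data shown above.

```python
def parse_line(line, n_predictors=5):
    """Split one comma-delimited line into (predictor list, target value)."""
    values = [float(tok) for tok in line.split(",")]
    if len(values) != n_predictors + 1:
        raise ValueError(f"expected {n_predictors + 1} values, got {len(values)}")
    return values[:n_predictors], values[n_predictors]

lines = [
    "-0.1660,  0.4406, -0.9998, -0.3953, -0.7065,  0.4840",
    " 0.0776, -0.1616,  0.3704, -0.5911,  0.7562,  0.1568",
]

X, y = [], []
for line in lines:
    predictors, target = parse_line(line)
    X.append(predictors)
    y.append(target)

print(X[0], y[0])
```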

The output of the demo program is:

Begin C# linear regression with two-way interactions demo

Loading synthetic train (200) and test (40) data
Done

First three train X:
 -0.1660  0.4406 -0.9998 -0.3953 -0.7065
  0.0776 -0.1616  0.3704 -0.5911  0.7562
 -0.9452  0.3409 -0.1654  0.1174 -0.7192

First three train y:
  0.4840
  0.1568
  0.8054

Creating Linear Regression with two-way interactions model

Setting lrnRate = 0.0010
Setting maxEpochs = 100

Starting training
epoch =     0  MSE =   0.3430
epoch =    20  MSE =   0.0570
epoch =    40  MSE =   0.0344
epoch =    60  MSE =   0.0298
epoch =    80  MSE =   0.0281
Done

Model base weights:
 -0.2624  0.0340 -0.0462  0.0322 -0.1152
Model bias/intercept:   0.3620

Model interaction weights:
  0.0000  0.0000  0.0000  0.0000  0.0000
 -0.0014  0.0000  0.0000  0.0000  0.0000
  0.0321  0.0101  0.0000  0.0000  0.0000
  0.0183 -0.0110  0.0008  0.0000  0.0000
  0.0947  0.0328 -0.0452  0.0024  0.0000

Evaluating model
Accuracy train (within 0.15) = 0.8300
Accuracy test (within 0.15) = 0.8000

Predicting for x =
  -0.1660   0.4406  -0.9998  -0.3953  -0.7065
Predicted y = 0.5094

End demo
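The accuracy lines in the output count a prediction as correct when it is close enough to the true target. The output does not spell out whether "within 0.15" is a relative or an absolute tolerance. The Python sketch below (not the article's C#) assumes the relative reading, meaning within 15 percent of the true value. The predictions and targets are hypothetical.

```python
def accuracy(preds, targets, pct_close=0.15):
    """Fraction of predictions within pct_close of the true target.

    Assumption: closeness is relative, |pred - y| < pct_close * |y|.
    An absolute tolerance is the other plausible reading of the output.
    """
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    n_correct = sum(
        1 for p, y in zip(preds, targets)
        if abs(p - y) < abs(pct_close * y)
    )
    return n_correct / len(targets)

# Hypothetical values: the first and third predictions are close enough.
preds = [0.50, 0.20, 0.70, 0.10]
targets = [0.48, 0.16, 0.80, 0.13]
acc = accuracy(preds, targets)
print(acc)  # 0.5
```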

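There are 10 interaction terms (x0 and x1, x0 and x2, . . . x3 and x4), so the entire model has 5 weights for the regular predictors, plus 10 weights for the interaction terms, plus 1 bias, for 16 trainable values in total. In the output, the 10 interaction weights occupy the lower triangle of a 5-by-5 matrix.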
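As a sanity check, the prediction shown in the output can be recomputed by hand from the printed weights. The Python sketch below (not the article's C#) assumes that the value in row i, column j of the interaction matrix multiplies x[i] * x[j]. Because the printed weights are rounded to four decimals, the result only approximately matches the demo's 0.5094.

```python
# Weights copied from the demo output (rounded to four decimals).
base_w = [-0.2624, 0.0340, -0.0462, 0.0322, -0.1152]
bias = 0.3620
inter_w = [
    [0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
    [-0.0014, 0.0000, 0.0000, 0.0000, 0.0000],
    [0.0321, 0.0101, 0.0000, 0.0000, 0.0000],
    [0.0183, -0.0110, 0.0008, 0.0000, 0.0000],
    [0.0947, 0.0328, -0.0452, 0.0024, 0.0000],
]

def predict(x, base_w, inter_w, bias):
    """Return (prediction, number of interaction terms used)."""
    y = bias + sum(w * xi for w, xi in zip(base_w, x))
    n_terms = 0
    for i in range(len(x)):
        for j in range(i):  # lower triangle only: j < i
            y += inter_w[i][j] * x[i] * x[j]
            n_terms += 1
    return y, n_terms

x = [-0.1660, 0.4406, -0.9998, -0.3953, -0.7065]
y_pred, n_terms = predict(x, base_w, inter_w, bias)
print(n_terms)           # 10
print(f"{y_pred:.4f}")   # close to the demo's 0.5094
```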

The demo program uses plain vanilla stochastic gradient descent training, which requires a learning rate and a maximum number of epochs (both values determined by trial and error).
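A bare-bones version of stochastic gradient descent for this model is sketched below in Python rather than the article's C#. It is not the article's implementation. The zero initialization, the per-epoch shuffling, and the synthetic data generator are all assumptions made for illustration; only the learning rate of 0.001 and the 100 epochs come from the demo output.

```python
import random

def predict(x, w, iw, b):
    """Bias + base terms + lower-triangle interaction terms."""
    y = b + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(len(x)):
        for j in range(i):
            y += iw[i][j] * x[i] * x[j]
    return y

def mse(X, Y, w, iw, b):
    return sum((predict(x, w, iw, b) - y) ** 2 for x, y in zip(X, Y)) / len(X)

def train(X, Y, lrn_rate=0.001, max_epochs=100, seed=0):
    rnd = random.Random(seed)
    n = len(X[0])
    w = [0.0] * n                       # base weights
    iw = [[0.0] * n for _ in range(n)]  # interaction weights, lower triangle used
    b = 0.0
    indices = list(range(len(X)))
    for _ in range(max_epochs):
        rnd.shuffle(indices)            # visit items in a new random order each epoch
        for idx in indices:
            x, y = X[idx], Y[idx]
            err = predict(x, w, iw, b) - y  # gradient of 0.5 * err^2 w.r.t. the prediction
            for i in range(n):
                w[i] -= lrn_rate * err * x[i]
                for j in range(i):
                    iw[i][j] -= lrn_rate * err * x[i] * x[j]
            b -= lrn_rate * err
    return w, iw, b

# Hypothetical synthetic data: 200 items, 5 predictors, one true interaction.
gen = random.Random(1)
X = [[gen.uniform(-1, 1) for _ in range(5)] for _ in range(200)]
Y = [0.3 + 0.5 * x[0] - 0.4 * x[1] + 0.6 * x[0] * x[1] + gen.gauss(0, 0.05)
     for x in X]

n = 5
mse_before = mse(X, Y, [0.0] * n, [[0.0] * n for _ in range(n)], 0.0)
w, iw, b = train(X, Y)
mse_after = mse(X, Y, w, iw, b)
print(f"MSE before = {mse_before:.4f}, after = {mse_after:.4f}")
```

A larger learning rate converges in fewer epochs but can overshoot, which is why both values are usually tuned by trial and error.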

Linear regression with two-way interactions is not always effective; if it were, it would have replaced basic linear regression. Even so, it can sometimes provide a big improvement in model quality for a relatively small investment in effort, so it’s usually worth exploring.



One personality characteristic of most of the guys I work with is attention to detail that borders on obsessiveness. My article on linear regression was instigated by a lecture I had as a college student over 50 years ago. The professor mentioned linear regression with interactions but didn’t have time to explain. The idea of figuring out LR with interactions gnawed away at my brain for the next five decades until I finally resolved it by implementing a system from scratch.

I love old science fiction movies. One that I saw many times on TV as a young man, “The Deadly Mantis” (1957), tells a story about a giant praying mantis. About halfway through the movie, a flight of six U.S. Navy F9F Panther jet fighters races to intercept the beast. I vividly remember the scene above, left, that showed what appeared to be ventral fins on the jets. But I had built scale models of dozens of 1950s and 1960s fighter jets and knew that the Panther did not have ventral fins. What was going on? Are the jets not F9F Panthers? Are the jets some experimental design?

The mystery gnawed away at my brain for over 50 years until, with the help of the Internet, I solved it. It turns out that the F9F has two sets of speed brakes, used when dropping bombs: a pair at the rear of the fuselage, which I knew about, and a rarely used pair at the front. The jets in the movie scene had their front brakes deployed so they wouldn’t zoom past the much slower camera plane. The angle at which the scene was shot made the air brakes appear as ventral fins.

OK, now I can attend to another of the many detail obsessions lurking in my brain.

