I wrote an article titled “Neural Network Regression from Scratch Using C#” in the October 2023 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2023/10/18/neural-network-regression.aspx.
The goal of a machine learning regression problem is to predict a single numeric value. For example, you might want to predict the annual income of a person based on their sex (male or female), age, State of residence and political leaning (conservative, moderate, liberal).
There are roughly a dozen major regression techniques, and each technique has several variations. Among the most common techniques are linear regression, linear ridge regression, k-nearest neighbors regression, kernel ridge regression, Gaussian process regression, decision tree regression and neural network regression. My article gives a complete end-to-end demo of neural network regression from scratch, using the C# language.
The demo uses one of my standard synthetic datasets that looks like:
0, 0.24, 1,0,0, 0.2950, 0,0,1
1, 0.39, 0,0,1, 0.5120, 0,1,0
0, 0.63, 0,1,0, 0.7580, 1,0,0
. . .
The fields are sex (0 = male, 1 = female), age (divided by 100), State (100 = Michigan, 010 = Nebraska, 001 = Oklahoma), income (divided by $100,000) and political leaning (100 = conservative, 010 = moderate, 001 = liberal). The goal is to predict income from the other four variables.
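To make the field layout concrete, here is a minimal sketch of how one data line could be split into the eight predictor values and the income target. It is written in Python for brevity and is not the article's C# code; the function name parse_line and the returned (predictors, target) layout are my own assumptions for illustration.

```python
# Sketch only: parse_line and its return layout are hypothetical,
# not taken from the article's C# demo.

def parse_line(line):
    """Split one comma-delimited data line into (predictors, target)."""
    vals = [float(tok) for tok in line.split(",")]
    if len(vals) != 9:
        raise ValueError("expected 9 values per line")
    # Column layout described in the post:
    # sex [0], age [1], State [2:5], income [5], political leaning [6:9]
    x = vals[0:5] + vals[6:9]   # the 8 predictor values
    y = vals[5]                 # income, already divided by 100,000
    return x, y

lines = [
    "0, 0.24, 1,0,0, 0.2950, 0,0,1",
    "1, 0.39, 0,0,1, 0.5120, 0,1,0",
    "0, 0.63, 0,1,0, 0.7580, 1,0,0",
]
data = [parse_line(s) for s in lines]
for x, y in data:
    # Multiply by 100,000 to get back to dollars for display.
    print(x, "->", y, "(about $%.0f)" % (y * 100000))
```

Removing the income column leaves exactly eight inputs, which is where the 8 in an 8-100-1 architecture comes from.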
The demo neural network has architecture 8-100-1 with tanh hidden activation, identity output activation, and uniform [-0.01, +0.01) weight initialization. The network is trained using basic SGD optimization for 2,000 epochs with a batch size of 10, and a constant learning rate of 0.01.
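The following is a rough Python sketch of a network with those settings, not the article's C# implementation. Several details are assumptions on my part: the class and method names are hypothetical, biases start at zero, the error function is half squared error, and gradients are summed over each batch before a single weight update. The short run uses only the three data items shown above and 200 epochs so that it finishes quickly.

```python
import math
import random

class NeuralNet:
    """Single-hidden-layer regression network: tanh hidden, identity output."""

    def __init__(self, n_in, n_hid, seed=0):
        rnd = random.Random(seed)
        lo, hi = -0.01, 0.01
        self.n_in, self.n_hid = n_in, n_hid
        # rnd.random() is in [0.0, 1.0), so weights fall in [-0.01, +0.01).
        self.ih = [[lo + (hi - lo) * rnd.random() for _ in range(n_hid)]
                   for _ in range(n_in)]                 # input-to-hidden weights
        self.ho = [lo + (hi - lo) * rnd.random() for _ in range(n_hid)]
        self.hb = [0.0] * n_hid                          # assumption: zero biases
        self.ob = 0.0

    def forward(self, x):
        h = [math.tanh(self.hb[j] +
                       sum(x[i] * self.ih[i][j] for i in range(self.n_in)))
             for j in range(self.n_hid)]
        y = self.ob + sum(h[j] * self.ho[j] for j in range(self.n_hid))
        return h, y                                      # identity output activation

    def predict(self, x):
        return self.forward(x)[1]

    def train(self, data, lr=0.01, batch_size=10, epochs=2000, seed=1):
        rnd = random.Random(seed)
        order = list(range(len(data)))
        for _ in range(epochs):
            rnd.shuffle(order)                           # visit items in random order
            for start in range(0, len(order), batch_size):
                g_ih = [[0.0] * self.n_hid for _ in range(self.n_in)]
                g_hb = [0.0] * self.n_hid
                g_ho = [0.0] * self.n_hid
                g_ob = 0.0
                for k in order[start:start + batch_size]:
                    x, t = data[k]
                    h, y = self.forward(x)
                    o_sig = y - t                        # dE/dy for E = 0.5 * (y - t)^2
                    g_ob += o_sig
                    for j in range(self.n_hid):
                        g_ho[j] += o_sig * h[j]
                        # (1 - h^2) is the derivative of tanh
                        h_sig = o_sig * self.ho[j] * (1.0 - h[j] * h[j])
                        g_hb[j] += h_sig
                        for i in range(self.n_in):
                            g_ih[i][j] += h_sig * x[i]
                # One SGD update per batch, using the summed gradients.
                self.ob -= lr * g_ob
                for j in range(self.n_hid):
                    self.ho[j] -= lr * g_ho[j]
                    self.hb[j] -= lr * g_hb[j]
                    for i in range(self.n_in):
                        self.ih[i][j] -= lr * g_ih[i][j]

def mse(net, data):
    """Mean squared error of the network over (predictors, target) pairs."""
    return sum((net.predict(x) - t) ** 2 for x, t in data) / len(data)

# The three sample items from the post: 8 predictors, then income / 100,000.
train_data = [
    ([0, 0.24, 1, 0, 0, 0, 0, 1], 0.2950),
    ([1, 0.39, 0, 0, 1, 0, 1, 0], 0.5120),
    ([0, 0.63, 0, 1, 0, 1, 0, 0], 0.7580),
]
net = NeuralNet(8, 100)
before = mse(net, train_data)
net.train(train_data, lr=0.01, batch_size=10, epochs=200)
after = mse(net, train_data)
print("MSE before = %.4f, after = %.4f" % (before, after))
```

With only three items, each epoch is a single batch, so this run is really plain batch gradient descent. On a full dataset, a batch size of 10 gives many updates per epoch.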

The article explains neural network input-output (IO) using a small 3-4-1 network.
The demo neural network uses a single hidden layer. It is possible to extend the demo network architecture to multiple hidden layers, but this would require a huge effort. Theoretically, a neural network with a single hidden layer and enough hidden nodes can approximate any continuous function to any desired accuracy, so in principle it can match what a neural network with multiple hidden layers can compute. This fact comes from what is called the Universal Approximation Theorem.

“Haute couture” is French for “high dressmaking”: exclusive, from-scratch fashion. A lot of haute couture is absurd, ugly, and impractical. But some designs are nice, like the three shown here. I like to think that I sometimes create “haute systems” (not really). Left: By designer Iris van Herpen. Center: By Ashi Studio. Right: By Julien Fournie.

