How To Code Modern Neural Networks using Python and NumPy

I wrote an article titled “How To Code Modern Neural Networks using Python and NumPy” in the July 2019 issue of Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2019/07/01/modern-neural-networks.aspx.

The idea of the article is to explain several relatively small but significant changes to the default techniques used in neural networks. Many of these changes have occurred since about 2014. For example, in the early days of neural networks, logistic sigmoid activation was by far the most common activation function for hidden layer neurons. Then, starting around 2011, tanh activation became the standard hidden layer activation function. And starting around 2015, ReLU (rectified linear unit) activation became the standard because it’s less susceptible to the vanishing gradient problem.
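A minimal NumPy sketch of the three activation functions mentioned above (these are standard definitions, not code from the article):

```python
import numpy as np

def log_sigmoid(x):
    # squashes to (0, 1); saturates for large |x|, so gradients can vanish
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # squashes to (-1, 1); zero-centered, but still saturates at the extremes
    return np.tanh(x)

def relu(x):
    # max(0, x); gradient is exactly 1 for x > 0, which resists vanishing gradients
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
print(log_sigmoid(x))  # values in (0, 1)
print(tanh(x))         # values in (-1, 1)
print(relu(x))         # negatives clipped to 0
```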

Other changes in standard techniques include:

* initializing weights using the Glorot technique instead of uniform random.

* using mini-batch training instead of batch or online.

* using Adam optimization instead of stochastic gradient descent (SGD).

* using inverted dropout instead of the original technique.

* using cross entropy loss instead of squared error loss.
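As one concrete example from the list above, inverted dropout scales the surviving activations up at training time, so the test-time forward pass needs no adjustment (the function name and keep-probability value below are illustrative, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

def inverted_dropout(activations, keep_prob=0.8):
    # zero out each node with probability (1 - keep_prob) ...
    mask = rng.random(activations.shape) < keep_prob
    # ... and scale the survivors by 1/keep_prob now, so the expected
    # activation is unchanged and test-time code can skip dropout entirely
    return (activations * mask) / keep_prob

h = np.ones((2, 4))
print(inverted_dropout(h, keep_prob=0.8))  # entries are 0.0 or 1.25
```

The original dropout technique instead scaled weights down by keep_prob at test time, which meant training and inference code had to differ.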

In my article I explain exactly how Glorot initialization and ReLU activation work. I do so by presenting raw Python and NumPy code (meaning no external neural network libraries, such as Keras or PyTorch).
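For instance, Glorot (also called Xavier) uniform initialization draws weights from a range determined by the layer's fan-in and fan-out; a quick sketch, feeding a Glorot-initialized layer through ReLU (illustrative sizes, not the article's demo network):

```python
import numpy as np

def glorot_uniform(n_in, n_out, rng):
    # Glorot uniform: sample from [-limit, +limit] where
    # limit = sqrt(6 / (fan_in + fan_out))
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

rng = np.random.default_rng(1)
W = glorot_uniform(4, 3, rng)          # weights for a 4-input, 3-node hidden layer
b = np.zeros(3)                        # biases typically start at zero
x = rng.standard_normal(4)             # one dummy input vector
hidden = np.maximum(0.0, x @ W + b)    # ReLU applied to the layer pre-activations
print(hidden)                          # all values are >= 0
```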



Simple, single-hidden-layer neural networks are relatively easy to work with. But deep neural networks (classification, regression, convolutional, and recurrent) require quite a bit of study to learn nuances such as the difference between ReLU, leaky ReLU, and tanh activation. When using a neural network code library like Keras, you don’t need to know how to implement a function like ReLU, but you do need to know when to use it and what its pros and cons are compared to alternatives.
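The ReLU vs. leaky ReLU distinction, for example, comes down to a single parameter: leaky ReLU lets a small fraction of each negative input through instead of zeroing it, so negative-side gradients never die completely (the 0.01 slope below is a common default, not a value from the article):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # like ReLU, but negative inputs keep a small slope alpha,
    # so their gradient is alpha instead of exactly zero
    return np.where(x > 0.0, x, alpha * x)

print(leaky_relu(np.array([-3.0, 0.5, 2.0])))  # negative input becomes -0.03
```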


This entry was posted in Machine Learning.