Self-Stabilized Deep Neural Networks

The field of machine learning is full of techniques and algorithms that work well but, for non-technical reasons, rarely get used. One example is the self-stabilized neural network. The goal of self-stabilization is to speed up training.

The idea of a self-stabilized neural network is simple in principle but tricky to explain concisely. I’ll try, and I’ll “abuse notation,” as the saying goes, to keep things simple.

A regular neural network computes node values sort of like this:

y = tanh(Wx + b)

where W is a set of weights, x is a set of inputs, b is a set of bias values, and tanh is the activation function. With this scheme, when you train a standard neural network, you repeatedly update each weight and bias value using gradients computed with calculus. For the weights, the update equation sort of looks like:

w = w + (-1 * learn_rate * grad_w * x)

There’s a similar equation to update each bias value.
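If the notation seems too loose, here’s a minimal NumPy sketch that makes it concrete. This is my own illustration, not code from any paper: it assumes a single tanh layer trained against a squared-error target, and the variable delta plays the role of grad_w (and grad_b) in the update equations.

```python
import numpy as np

# Minimal sketch of one standard tanh layer and its update (my
# illustration, assuming a squared-error loss on the layer output).

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4)) * 0.1   # weights: 3 outputs, 4 inputs
b = np.zeros(3)                          # biases
x = rng.standard_normal(4)               # one input vector
target = rng.standard_normal(3)          # dummy target
learn_rate = 0.01

# forward pass: y = tanh(Wx + b)
z = W @ x + b
y = np.tanh(z)

# delta = dLoss/dz; it plays the role of grad_w / grad_b in the text
delta = (y - target) * (1.0 - y * y)     # tanh'(z) = 1 - y^2

# gradient descent updates
W += -learn_rate * np.outer(delta, x)    # grad wrt each w is delta * x
b += -learn_rate * delta                 # grad wrt each bias is delta
```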

A self-stabilized NN adds one extra parameter, B, to each hidden layer of a deep neural network. The node computation now looks like:

y = tanh(exp(B) * Wx + b)

This leads to a new update equation for each w, each bias, and an additional update equation for the B parameter:

w = w + (-1 * learn_rate * exp(B) * grad_w * x)

B = B + (-1 * learn_rate * exp(B) * grad_B * Wx)
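Here is the same sketch with the self-stabilizer added. Again, this is my own illustration under the same squared-error assumption; B starts at 0 so that exp(B) = 1 and the layer initially behaves exactly like a standard one.

```python
import numpy as np

# Minimal sketch of a self-stabilized tanh layer (my illustration,
# same squared-error assumption as above; not the paper's code).

rng = np.random.default_rng(1)
W = rng.standard_normal((3, 4)) * 0.1   # weights: 3 outputs, 4 inputs
b = np.zeros(3)                          # biases
B = 0.0                                  # self-stabilizer; exp(0) = 1 at the start
x = rng.standard_normal(4)               # one input vector
target = rng.standard_normal(3)          # dummy target
learn_rate = 0.01

# forward pass: y = tanh(exp(B) * Wx + b)
s = np.exp(B)
Wx = W @ x                               # save Wx -- it appears in the B gradient
z = s * Wx + b
y = np.tanh(z)

# delta = dLoss/dz, as in the standard layer
delta = (y - target) * (1.0 - y * y)

# updates: exp(B) now appears in the weight gradient, and B gets
# its own update driven by the pre-activation Wx
W += -learn_rate * s * np.outer(delta, x)   # grad wrt each w is exp(B) * delta * x
b += -learn_rate * delta                     # grad wrt each bias is delta
B += -learn_rate * s * float(delta @ Wx)     # grad wrt B is exp(B) * (delta . Wx)
```

Because exp(B) is always positive, the layer’s effective scale can grow or shrink during training but never change sign, which is presumably the point of the exponential parameterization.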

Research results suggest that this self-stabilizing technique does in fact speed up training.

But I almost never see this technique used. I suspect that there are so many techniques like this that people just can’t keep up with the new developments. My hunch is that for something new to grab the attention of the deep learning community at large and be adopted, it has to have a big, big effect, rather than being an incremental improvement.

See the excellent research paper at: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/11/SelfLR.pdf

See the nice poster at: https://sigport.org/sites/default/files/selflr_icassp_poster.pdf
