Suppose you are using a neural network to make a prediction where the thing-to-predict can be one of three possible values. For example, you might want to predict the political party affiliation of a person (democrat, republican, other) based on things like age, annual income, sex, and years of education.
A neural network classifier would accept four numeric inputs corresponding to age, income, sex, and education, and generate a preliminary output of three values like (1.55, 2.30, 0.90). The classifier then normalizes the preliminary outputs so that they sum to 1.0 and can be interpreted as probabilities.
By far the most common normalizing function is called Softmax:
exp(1.55) = 4.71
exp(2.30) = 9.97
exp(0.90) = 2.46
sum = 17.15
softmax(1.55) = 4.71 / 17.15 = 0.27
softmax(2.30) = 9.97 / 17.15 = 0.58
softmax(0.90) = 2.46 / 17.15 = 0.14
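The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name softmax and the max-subtraction step (a common trick to avoid overflow that does not change the result) are my additions, not something from the example itself.

```python
import math

def softmax(values):
    # Subtracting the largest value before exponentiating avoids overflow
    # for large inputs and leaves the normalized result unchanged.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.55, 2.30, 0.90])
print([round(p, 2) for p in probs])  # three values that sum to 1.0
print(sum(probs))
```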
If you are using the back-propagation algorithm for training, then you need to use the Calculus derivative of the Softmax function, which is softmax'(x) = y * (1 - y), where y = softmax(x) is the output value (not the input x).
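Written in terms of the output values y_i, the derivative of output i with respect to its own input is y_i * (1 - y_i), and with respect to a different input j it is -y_i * y_j. Here is a rough sketch that builds the full matrix of derivatives and sanity-checks it against finite differences. The function names softmax_jacobian and numeric_jacobian are hypothetical names chosen for illustration.

```python
import math

def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(values):
    # J[i][j] = d softmax_i / d input_j
    #         = y_i * (1 - y_i) when i == j, and -y_i * y_j otherwise.
    y = softmax(values)
    n = len(y)
    return [[y[i] * ((1.0 if i == j else 0.0) - y[j]) for j in range(n)]
            for i in range(n)]

def numeric_jacobian(f, values, eps=1e-6):
    # Central finite differences, used only as a sanity check.
    n = len(values)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        hi = list(values)
        lo = list(values)
        hi[j] += eps
        lo[j] -= eps
        f_hi, f_lo = f(hi), f(lo)
        for i in range(n):
            jac[i][j] = (f_hi[i] - f_lo[i]) / (2 * eps)
    return jac

vals = [1.55, 2.30, 0.90]
analytic = softmax_jacobian(vals)
numeric = numeric_jacobian(softmax, vals)
max_diff = max(abs(analytic[i][j] - numeric[i][j])
               for i in range(3) for j in range(3))
print(max_diff)  # should be very close to zero
```

In practice, when Softmax is paired with cross entropy error, the combined gradient simplifies a lot, so the full matrix is rarely computed explicitly.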
I’d always wondered if there were alternatives to the Softmax function. I tracked down a rather obscure research paper published in 2016 that explored something called the Taylor Softmax function. The Taylor Softmax for the example values above is:
taylor(1.55) = 1.0 + 1.55 + 0.5 * (1.55)^2
= 3.75
taylor(2.30) = 1.0 + 2.30 + 0.5 * (2.30)^2
= 5.95
taylor(0.90) = 1.0 + 0.90 + 0.5 * (0.90)^2
= 2.31
sum = 12.00
taylor-soft(1.55) = 3.75 / 12.00 = 0.31
taylor-soft(2.30) = 5.95 / 12.00 = 0.50
taylor-soft(0.90) = 2.31 / 12.00 = 0.19
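The same calculation as a minimal Python sketch, assuming the second-order form 1 + x + 0.5 * x^2 used in the worked example. The function name taylor_softmax is my own choice for illustration.

```python
def taylor_softmax(values):
    # Second-order Taylor approximation of exp(x): 1 + x + 0.5 * x^2.
    # This quadratic is always positive (its minimum is 0.5, at x = -1),
    # so the normalized outputs are valid probabilities.
    terms = [1.0 + v + 0.5 * v * v for v in values]
    total = sum(terms)
    return [t / total for t in terms]

probs = taylor_softmax([1.55, 2.30, 0.90])
print([round(p, 2) for p in probs])  # three values that sum to 1.0
print(sum(probs))
```

Notice that the Taylor version spreads the probabilities out a bit more evenly than regular Softmax does, because the quadratic grows more slowly than exp().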
The Calculus derivative of the Taylor Softmax is rather ugly.
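One way to work the derivative out by hand is the quotient rule, assuming the second-order polynomial from the worked example. The symbols f, S, y_i, z_j and the Kronecker delta are my own notation and may not match the paper's:

```latex
f(z) = 1 + z + \tfrac{1}{2} z^2, \qquad f'(z) = 1 + z, \qquad
S = \sum_k f(z_k), \qquad y_i = \frac{f(z_i)}{S}

\frac{\partial y_i}{\partial z_j}
  = \frac{\delta_{ij}\,(1 + z_i)}{S} - \frac{f(z_i)\,(1 + z_j)}{S^2}
  = \frac{1 + z_j}{S}\,\bigl(\delta_{ij} - y_i\bigr)
```

Unlike regular Softmax, the result cannot be written in terms of the output values alone; the input z_j and the sum S both appear.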
I coded up a demo program to compare regular Softmax with the Taylor Softmax. My non-definitive mini-exploration showed the regular Softmax worked much better.
My conclusion: Almost everything related to neural networks is a bit tricky. The Taylor Softmax activation function may be worth additional investigation, but my micro-research example leaves me a bit skeptical about the usefulness of Taylor Softmax.

