For people who are new to neural network libraries such as Keras, CNTK, PyTorch, and TensorFlow, selecting a training algorithm can be a bit confusing. All the libraries support the five main algorithms: stochastic gradient descent (SGD), Adagrad, Adadelta, Adam, and RMSprop. But there are many other algorithms that are variations of the basic five.
Before I go any further, let me summarize rules of thumb for the main five. Use SGD for networks with a single hidden layer. Don't use Adagrad except to replicate somebody else's result. Use Adadelta or RMSprop for recurrent neural networks. Use Adam for general deep neural networks. These rules of thumb are only a starting point; every problem requires a lot of experimentation.
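To make the differences concrete, here is a minimal pure-Python sketch of the update rules for plain SGD and for Adam (using the standard Adam formulation with bias-corrected moment estimates). This is an illustration of the math, not any library's actual implementation; the function names and the toy objective are my own.

```python
import math

def sgd_step(w, grad, lr=0.01):
    """Plain SGD: move the weight against the gradient."""
    return w - lr * grad

def adam_step(w, grad, state, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; state holds the running moment estimates and step count."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad          # first moment (running mean of gradients)
    state["v"] = b2 * state["v"] + (1 - b2) * grad * grad   # second moment (running mean of squared gradients)
    m_hat = state["m"] / (1 - b1 ** state["t"])             # bias correction for startup
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)        # per-parameter adaptive step

# Toy demo: minimize f(w) = w^2, whose gradient is 2w.
w_sgd, w_adam = 5.0, 5.0
state = {"m": 0.0, "v": 0.0, "t": 0}
for _ in range(200):
    w_sgd = sgd_step(w_sgd, 2 * w_sgd, lr=0.1)
    w_adam = adam_step(w_adam, 2 * w_adam, state, lr=0.1)
print(w_sgd, w_adam)  # both end up near the minimum at w = 0
```

The key difference the sketch shows: SGD's step size is the learning rate times the raw gradient, while Adam rescales each step by an estimate of the gradient's recent magnitude, which is part of why it is a reasonable default for general deep networks.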
Every algorithm has many parameters, which makes the total number of possibilities for a trainer astronomically large, far too large to try them all. So training a neural network really is part art and part science, where the art aspect mostly means educated guesses based on experience.
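To get a feel for the combinatorics, here is a toy count of trainer configurations. The grid sizes below are illustrative assumptions of mine, not figures from any library's documentation; the point is only how quickly the product grows.

```python
# Hypothetical grid sizes for each trainer choice -- illustrative values only.
choices = {
    "algorithm": 5,          # SGD, Adagrad, Adadelta, Adam, RMSprop
    "learning rate": 10,
    "batch size": 6,
    "momentum / beta values": 8,
    "epsilon": 4,
    "learning rate schedule": 5,
}

total = 1
for name, n in choices.items():
    total *= n

print(total)  # tens of thousands of combinations, before any architecture choices
```

Even this modest grid yields 48,000 combinations, and that is before considering network architecture, weight initialization, or regularization, so exhaustive search is off the table and educated guessing takes over.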
I scanned through the libraries’ documentation and put together a chart. One of the problems with libraries is that they often have too much functionality. This is mostly a matter of psychology. If the developers of one library implement a feature, developers of the other libraries feel obliged to do the same.

Four paintings by German artist Carl Spitzweg (1808 – 1885). His work strikes me as a combination of (mostly) art and (some) science.
