Multi-Class Classification Using PyTorch: Training

I wrote an article titled “Multi-Class Classification Using PyTorch: Training” in the January 2021 edition of the online Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2021/01/04/pytorch-training.aspx.

The article is the third in a four-part series that presents a complete end-to-end demo of a multi-class classification problem. In the series I cover data preparation, creating Dataset and DataLoader objects to serve up the data, neural network design and code implementation, training, evaluating model accuracy, checkpoints and saving models, and using models.

The recurring example in the series is predicting a college student’s major (finance, geology or history) from sex, units completed, home state (maryland, nebraska or oklahoma), and score on an admissions test. My January article covers the training part of the system.

In high-level pseudo-code, training a neural network looks like:

loop max_epochs times
  loop until all batches processed
    read a batch of training data (inputs, targets)
    compute outputs using the inputs
    compute error between outputs and targets
    use error to update weights and biases
  end-loop (all batches)
end-loop (all epochs)
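The pseudo-code maps onto PyTorch roughly as follows. This is a minimal sketch, not the article's demo program: the random stand-in data, the 6-10-3 network shape, the batch size, and the learning rate are all assumptions made for illustration.

```python
import torch as T
from torch.utils.data import TensorDataset, DataLoader

T.manual_seed(1)

# Hypothetical stand-in data: 40 rows, 6 numeric predictors, 3 class labels.
train_x = T.rand(40, 6)
train_y = T.randint(0, 3, (40,))
train_ldr = DataLoader(TensorDataset(train_x, train_y),
                       batch_size=10, shuffle=True)

# Assumed small network; the output layer emits raw logits because
# CrossEntropyLoss applies log-softmax internally.
net = T.nn.Sequential(
    T.nn.Linear(6, 10), T.nn.Tanh(),
    T.nn.Linear(10, 3))

loss_func = T.nn.CrossEntropyLoss()
optimizer = T.optim.SGD(net.parameters(), lr=0.01)

max_epochs = 5
epoch_losses = []
net.train()                                # set training mode
for epoch in range(max_epochs):            # loop max_epochs times
    epoch_loss = 0.0
    for (X, Y) in train_ldr:               # loop until all batches processed
        optimizer.zero_grad()              # clear gradients from the previous batch
        oupt = net(X)                      # compute outputs using the inputs
        loss_val = loss_func(oupt, Y)      # compute error between outputs and targets
        loss_val.backward()                # compute gradients of the error
        optimizer.step()                   # use gradients to update weights and biases
        epoch_loss += loss_val.item()
    epoch_losses.append(epoch_loss)
    print("epoch = %d   loss = %0.4f" % (epoch, epoch_loss))
```

Each statement inside the inner loop corresponds to one line of the pseudo-code, except that "use error to update weights and biases" is split into the backward() call that computes gradients and the step() call that applies them.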

Training with PyTorch is paradoxically complex and simple. Because PyTorch works at a relatively high level of abstraction, the actual code is relatively short. But each code statement is conceptually very deep.

One of the things that stalls newcomers to PyTorch is the choice of training optimizer. PyTorch 1.7 has 11 optimizers: SGD, Adam, and so on. Each optimizer has many parameters (up to about 8, not including inherited parameters), and the parameters are complicated. My advice in the article is to not try to immediately learn every detail of every training optimizer. In my opinion the best approach is to focus on SGD and Adam, and slowly learn about the more specialized optimizers as your experience increases. An analogy is learning woodworking: you want to learn how to use handsaws and a drill press before looking at specialized tools like a dado machine.
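To make the SGD and Adam suggestion concrete, here is a small sketch that creates both optimizers and inspects their default parameters. The one-layer stand-in model and the specific lr and momentum values are assumptions for illustration, not settings taken from the article.

```python
import torch as T

net = T.nn.Linear(6, 3)  # hypothetical stand-in model

# The two optimizers to learn first. Only lr is set for Adam, so its
# other parameters keep the library defaults.
sgd = T.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
adam = T.optim.Adam(net.parameters(), lr=0.01)

# Each optimizer stores its parameter settings in a defaults dictionary,
# which is a quick way to see what can be tuned.
for name, opt in (("SGD", sgd), ("Adam", adam)):
    print(name, sorted(opt.defaults.keys()))
```

Swapping one optimizer for the other in a training loop is a one-line change, which makes it practical to try both on the same problem.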


My cousin, Peter Nunes, is a famous restorer of antique clocks, especially those with wooden movements. Left: Peter restored the clock in the Republican-American tower in Waterbury, Connecticut. The tower was built in 1909 and the clock is 16 feet in diameter. Center: Peter standing next to an antique clock that he restored. Right: An Eli Terry clock with wooden movements. Eli Terry (1772-1852) was a clockmaker who introduced mass production to clockmaking, making clocks affordable for average citizens.
