I contributed to an article titled “Researchers Explore Techniques to Reduce the Size of Deep Neural Networks” in the June 2021 edition of the Pure AI website. See https://pureai.com/articles/2021/06/02/reduce-networks.aspx.
The motivation for reducing the size of deep neural networks is simple: even on the most powerful hardware available, huge models can take weeks or months to train. This consumes a lot of electrical energy, which costs a lot of money and generates a lot of CO2 emissions. Reducing the size of a deep neural network is often called compression (in general) or pruning (for specific techniques).
The article describes three recent research papers. “The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks” (2019) by J. Frankle and M. Carbin showed strong evidence that huge neural networks can be pruned to a fraction of their original size, and the pruned model can achieve nearly the same accuracy as the huge source model. The result is mostly theoretical because the paper started with a huge trained model and then pruned it.
“SNIP: Single-Shot Network Pruning Based on Connection Sensitivity” (2019) by N. Lee, T. Ajanthan and P. Torr demonstrated a relatively simple technique to prune a huge neural network before training it.
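The core idea of SNIP can be sketched in just a few lines. The tiny two-layer network, the single mini-batch, and the 25 percent keep-ratio below are all hypothetical placeholders, not the paper's actual setup; the point is that each connection is scored by the magnitude of (gradient times weight) from a single backward pass, and only the top-scoring connections are kept before any training happens:

```python
# Minimal NumPy sketch of SNIP-style connection-sensitivity pruning.
# The network, data, and keep-ratio are hypothetical illustrations.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-layer network: 4 inputs -> 8 hidden -> 3 outputs.
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 3))

# One hypothetical mini-batch (8 items) with integer class labels.
X = rng.normal(size=(8, 4))
y = rng.integers(0, 3, size=8)

# Forward pass: tanh hidden layer, softmax output.
H = np.tanh(X @ W1)
logits = H @ W2
Z = np.exp(logits - logits.max(axis=1, keepdims=True))
P = Z / Z.sum(axis=1, keepdims=True)

# One backward pass (cross-entropy loss) -- all that SNIP needs.
onehot = np.eye(3)[y]
d_logits = (P - onehot) / len(X)
gW2 = H.T @ d_logits
d_H = (d_logits @ W2.T) * (1.0 - H * H)
gW1 = X.T @ d_H

# SNIP saliency: |gradient * weight| per connection, normalized.
saliency = np.abs(np.concatenate([(gW1 * W1).ravel(), (gW2 * W2).ravel()]))
saliency /= saliency.sum()

# Keep the top 25% of connections; zero out the rest with a mask.
k = int(0.25 * saliency.size)
threshold = np.sort(saliency)[-k]
mask = (saliency >= threshold).astype(float)
mask1 = mask[: W1.size].reshape(W1.shape)
mask2 = mask[W1.size :].reshape(W2.shape)

W1 *= mask1  # training would then proceed on this sparse network
W2 *= mask2
print(int(mask.sum()), "of", mask.size, "connections kept")
```

In a real implementation the mask would be held fixed during subsequent training, so the pruned connections never come back.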
“Picking Winning Tickets Before Training by Preserving Gradient Flow” (2020) by C. Wang, G. Zhang and R. Grosse is essentially an extension of the SNIP technique that addresses what the authors identified as some minor weaknesses.
I was quoted in the article. I commented, “There’s a general feeling among my colleagues that huge deep neural models have become so difficult to train that only organizations with large budgets can successfully explore them.” I added, “Deep neural compression techniques that are effective and simple to implement will allow companies and academic institutions that don’t have huge budgets and resources to work with state of the art neural architectures.”
And I said, “Over the past three decades, there has been an ebb and flow as advances in hardware technologies, such as GPU processing, allow deep neural networks with larger sizes. I speculate that at some point in the future, quantum computing will become commonplace, and when that happens, the need for compressing huge neural networks will go away. But until quantum computing arrives, research on neural network compression will continue to be important.”
Three relatively obscure reduced-size comic book superheroes. Left: Tinyman first appeared in “Captain Marvel” #2 (1966). Tinyman was the sidekick for Captain Marvel who could . . . get ready . . . separate his body parts. Center: The Wasp first appeared in “Tales to Astonish” #44 (1963). She was the sidekick to Ant-Man. Right: Elasti-Girl first appeared in “My Greatest Adventure” #80 (1963). She was an Olympic swimming gold medalist who gained the power to shrink or grow.



A very exciting topic. “Before training” sounds just a bit too good to be true, though, because the gradients still have to be calculated for the entire data set.
Is there a technique you would favor? Unfortunately, the effort required to reduce the network size is relatively large.
Yes, “before training” is slightly misleading because you do have to do one backward pass through the network to compute gradients. But this is still much better than having to train for several thousand epochs.
I don’t know enough about all the different approaches to have a strong opinion about which is the best technique in practice, but the SNIP technique seems to have a lot of promise. JM