A Simple Example of PyTorch Gradients

When you define a neural network in PyTorch, each weight and bias gets a gradient. The gradient values are computed automatically (“autograd”) and then used to adjust the values of the weights and biases during training.

In the early days of PyTorch, you had to manipulate gradients yourself. High-level abstractions like the Module() class and the Linear() class take care of all the manipulation now (in most cases), but it's still instructive to see an example of what is happening behind the scenes.

Suppose you have some math function f(x) = y = x^2 + 3x + 1. The value of f(4) is 4^2 + 3*4 + 1 = 29. The calculus derivative of f(x) is f'(x) = dy/dx = 2x + 3, and so the value of the derivative at 4 is f'(4) = 2*4 + 3 = 11. For a function of a single variable, the calculus derivative is (almost) the same thing as the gradient.

I wrote a tiny demo to illustrate this example.

# gradient_demo.py

import torch as T
device = T.device("cpu")

def some_func(x):
  result = (x * x) + (3 * x) + 1
  return result

def main():
  print("\nBegin demo \n")

  # requires_grad=True tells autograd to track operations on x
  x = T.tensor([4.0], dtype=T.float32,
    requires_grad=True).to(device)
  y = some_func(x)

  print("x = " + str(x))
  print("y = " + str(y))
  print("")

  df = y.grad_fn  # the autograd node that produced y
  print("df = " + str(df))
  print("")

  y.backward()  # compute grad of some_func(4)
  print("gradient of func(x) = ")
  print(x.grad) # 2(4) + 3 = 11

if __name__ == "__main__":
  main()
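As a quick sanity check (not part of the original demo), you can compare autograd's answer with the derivative computed by hand:

```python
# hand-coded derivative of f(x) = x^2 + 3x + 1,
# derived with ordinary calculus: f'(x) = 2x + 3
def manual_grad(x):
  return (2 * x) + 3

print(manual_grad(4.0))  # 11.0, matching the x.grad value from the demo
```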

The code is short but dense with ideas. If you walk through it line by line, the demo should (eventually) make sense. The demo works with a single-value tensor, [4.0], but in non-demo scenarios you'd be working with tensors that hold several values, such as [2.1, 5.4, 3.2]. The principles are the same.
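Here is a minimal sketch of the multi-value case. The values [2.1, 5.4, 3.2] are just illustrative; note that backward() needs a single-element result, so the sketch sums the outputs first:

```python
import torch as T

def some_func(x):
  return (x * x) + (3 * x) + 1

x = T.tensor([2.1, 5.4, 3.2], dtype=T.float32,
  requires_grad=True)
y = some_func(x).sum()  # backward() needs a scalar
y.backward()
print(x.grad)  # elementwise 2x + 3: approximately [7.2, 13.8, 9.4]
```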

Let me emphasize that understanding how to directly manipulate PyTorch gradients isn't necessary if you're building standard neural networks. You only need to work with gradients at a low level if you're creating some sort of custom system. I've noticed this causes confusion for beginners: many introductory PyTorch tutorials explain gradients in a fair amount of detail, but then never use that information when creating a neural network.
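One example of such low-level work is a hand-rolled gradient descent loop. Here is a minimal sketch that minimizes the demo's f(x) directly; the learning rate and iteration count are arbitrary choices for illustration, not values from the original demo:

```python
import torch as T

def some_func(x):
  return (x * x) + (3 * x) + 1

x = T.tensor([4.0], requires_grad=True)
lrn_rate = 0.1
for _ in range(100):
  y = some_func(x)
  y.backward()               # populate x.grad
  with T.no_grad():          # update x without tracking the update
    x -= lrn_rate * x.grad
  x.grad.zero_()             # clear the gradient for the next pass

print(x.item())  # approaches -1.5, the minimum of f
```

This zero-the-gradient / backward / update pattern is exactly what optimizers like torch.optim.SGD do for you behind the scenes.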



Three photographs with color gradients.

This entry was posted in PyTorch.

1 Response to A Simple Example of PyTorch Gradients

  1. Thorsten Kleppe says:

    Excellent explanation, really great background with the pictures. I don’t know exactly how you can do it, but a great opening in 2021. 

    Meanwhile, my understanding of gradients feels like an approximation that goes one step closer to the target with each explanation.

    When climbing, or more precisely bouldering, it helps immensely to be with good climbers. Every single route represents a problem, from very easy to impossible. But even if a route seems impossible, if you sharpen your skills enough and watch the good climbers closely as they try, you will be able to surprise yourself.

    Compared to climbing, your efforts are like those of an Alex Megos. At the beginning of climbing, there were routes that were impossible to climb, but with people like Alex, this space expanded in all directions.
