Displaying CIFAR-10 Images Using PyTorch

I was reviewing a machine learning research paper recently. The paper described a technique for automatically creating machine learning models. The experiments in the paper performed binary classification on images in the CIFAR-10 benchmark dataset.

I hadn’t looked at CIFAR-10 for a few months, so I decided to refresh my memory by writing code to display CIFAR-10 images. The dataset has a total of 60,000 images, 6,000 for each of 10 classes (hence the “10” in “CIFAR-10”): ‘plane’, ‘car’, ‘bird’, ‘cat’, ‘deer’, ‘dog’, ‘frog’, ‘horse’, ‘ship’, ‘truck’. There is also a CIFAR-100 dataset that has 100 classes.

Each CIFAR-10 image is a relatively small 32 x 32 pixels. The images are in color, so each pixel has three values, one each for the red, green, and blue channels. Therefore, each image has a total of 32 * 32 * 3 = 3,072 values.

It is possible to read the raw CIFAR-10 values into memory, rearrange them into a 3-D array, and display them using the Python Matplotlib library, but the process is a bit tricky. However, while reviewing the PyTorch documentation I discovered that PyTorch has a companion package called torchvision with functions that make displaying CIFAR-10 images very easy.
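To give a feel for the rearrangement, here is a minimal sketch in plain Python. It assumes the layout of the Python version of CIFAR-10, where each image is a flat row of 3,072 values: the first 1,024 are the red channel, the next 1,024 green, and the last 1,024 blue, each in row-major order. The row below is synthetic rather than real image data, and the name row_to_hwc is made up for illustration.

```python
SIZE = 32            # images are 32 x 32 pixels
PLANE = SIZE * SIZE  # 1024 values per color channel

def row_to_hwc(row):
  """Rearrange a flat channel-first row into rows x cols x (R, G, B)."""
  assert len(row) == 3 * PLANE
  return [[[row[c * PLANE + r * SIZE + col] for c in range(3)]
           for col in range(SIZE)]
          for r in range(SIZE)]

# synthetic stand-in for one raw image: the values 0..3071,
# so each output value shows which input position it came from
row = list(range(3 * PLANE))
img = row_to_hwc(row)

print(len(img), len(img[0]), len(img[0][0]))  # 32 32 3
print(img[0][0])  # [0, 1024, 2048]  top-left pixel
print(img[1][0])  # [32, 1056, 2080]  first pixel of second row
```

With NumPy, the same rearrangement would be a reshape to (3, 32, 32) followed by a transpose with axes (1, 2, 0), the same transpose the demo's imshow() function applies.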

I used the documentation examples to write a short Python program that loads the first 100 training images into memory, then iterates through those 100 images and displays just the frog images, one at a time.
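Picking out the frogs amounts to comparing each label against the index of ‘frog’ in the class list. A minimal sketch of that idea, using the class order listed above and a made-up list of labels (these are not the actual labels in the training set):

```python
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog',
  'frog', 'horse', 'ship', 'truck')
frog = classes.index('frog')  # position of 'frog' in the class tuple

lbls = [3, 6, 0, 6, 9, 1, 6]  # hypothetical labels, for illustration only
frog_positions = [i for i, lbl in enumerate(lbls) if lbl == frog]

print(frog)            # 6
print(frog_positions)  # [1, 3, 6]
```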

Because 32 x 32 pixels is such low resolution, each CIFAR-10 image is very crude. Interestingly, if you squint your eyes, each image gets more recognizable.

The demo code does some unnecessary work in the sense that the image values are converted to PyTorch tensors and normalized. This is useful when creating a neural network classification model, but isn’t needed to display images. I left the normalize and unnormalize code in anyway.
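The arithmetic behind the normalize and unnormalize steps is simple. ToTensor scales pixel values to the range [0, 1], and Normalize with a mean of 0.5 and a standard deviation of 0.5 computes (x - 0.5) / 0.5 for each channel, which maps [0, 1] to [-1, 1]. Dividing by 2 and adding 0.5 reverses that. A plain-Python sketch of the round trip, where the function names are illustrative and not part of torchvision:

```python
def normalize(x, mean=0.5, std=0.5):
  # the per-channel formula that Normalize applies
  return (x - mean) / std

def unnormalize(y):
  # the inverse used in the demo: img / 2 + 0.5
  return y / 2 + 0.5

for x in (0.0, 0.25, 0.5, 1.0):
  y = normalize(x)
  print(x, y, unnormalize(y))
```

Each line of output shows the original value, the normalized value, and the original value recovered again.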

Good fun.

Left: Mr. Toad from the “Wind in the Willows” segment of Disney’s “The Adventures of Ichabod and Mr. Toad” (1949). Center: Neville Longbottom (played by actor Matthew Lewis) and his often-misplaced toad Trevor (played by an unknown toad) from the Harry Potter series of movies. Right: Buford, a desert toad who is the bartender of the Gas Can Saloon in “Rango” (2011).

# show_cifar.py
# Python 3.7.6  PyTorch 1.6.0  TorchVision 0.7.0

import torch as T
import torchvision as tv
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
  img = img / 2 + 0.5   # unnormalize
  npimg = img.numpy()   # convert from tensor
  plt.imshow(np.transpose(npimg, (1, 2, 0))) 
  plt.show()

def main():
  transform = transforms.Compose( [transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5),
      (0.5, 0.5, 0.5))])

  trainset = tv.datasets.CIFAR10(root='.\\data', train=True,
    download=False, transform=transform)
  trainloader = T.utils.data.DataLoader(trainset,
    batch_size=100, shuffle=False, num_workers=1)

  # classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog',
  #   'frog', 'horse', 'ship', 'truck')

  # get first 100 training images
  dataiter = iter(trainloader)
  imgs, lbls = next(dataiter)  # built-in next(); newer PyTorch versions have no .next() method

  for i in range(100):  # show just the frogs
    if lbls[i] == 6:  # 6 = frog
      imshow(tv.utils.make_grid(imgs[i]))

if __name__ == "__main__":
  main()