Preparing CIFAR Image Data for PyTorch in Visual Studio Magazine

I wrote an article titled “Preparing CIFAR Image Data for PyTorch” in the April 2022 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2022/04/01/preparing-cifar-data.aspx.

A common dataset for image classification experiments is CIFAR-10. The goal of a CIFAR-10 problem is to analyze a crude 32 x 32 color image and predict which of 10 classes the image belongs to. The 10 classes are plane, car, bird, cat, deer, dog, frog, horse, ship and truck.

The CIFAR-10 (Canadian Institute for Advanced Research, 10 classes) data has 50,000 images intended for training and 10,000 images for testing. The article explains how to get the raw source CIFAR-10 data, convert the data from binary to text and save the data as a text file that can be used to train a PyTorch neural network classifier.

Most popular neural network libraries, including PyTorch and Keras, have some form of built-in CIFAR-10 dataset designed to work with the library. But there are two problems with using a built-in dataset. First, data access becomes a magic black box and important information is hidden. Second, the built-in datasets use all 50,000 training and 10,000 test images, and datasets that large are difficult to work with.

The cifar-10-batches-py source directory contains six binary files that have names with no file extension: data_batch_1, data_batch_2, data_batch_3, data_batch_4, data_batch_5 and test_batch. Each of these files contains 10,000 images in Python “pickle” binary format.
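To make the unpickling step concrete, here is a minimal sketch of reading one batch file. The function name load_cifar_batch is my own for illustration, not from the article. The sketch assumes the documented layout of the Python version of CIFAR-10, where each file holds a dictionary whose b"data" and b"labels" entries contain the pixels and the class labels. In the real files the pixel data is a NumPy array, so NumPy must be installed for the unpickling to work.

```python
import pickle

def load_cifar_batch(path):
    """Unpickle one CIFAR-10 batch file and return (pixels, labels)."""
    with open(path, "rb") as f:
        # The files were pickled by Python 2, so encoding="bytes" is needed
        # and the dictionary keys come back as bytes objects.
        batch = pickle.load(f, encoding="bytes")
    pixels = batch[b"data"]    # one row of 3,072 values per image
    labels = batch[b"labels"]  # one integer class label (0 to 9) per image
    return pixels, labels

# Example call (the path is an assumption about where the data was unzipped):
# pixels, labels = load_cifar_batch(".\\cifar-10-batches-py\\data_batch_1")
```

Calling the function once per file and concatenating the results would give all 50,000 training images, but for experiments it is often enough to use only the first few thousand images from one batch.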

Each image is 32 x 32 pixels. Because the images are in color, there are three channels (red, green, blue). Each channel-pixel value is an integer between 0 and 255. Therefore, each image is represented by 32 * 32 * 3 = 3,072 values between 0 and 255.
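The order of the 3,072 values matters when you want to display or reshape an image. As I understand the CIFAR-10 documentation, the first 1,024 values are the red channel, the next 1,024 are green and the last 1,024 are blue, with each channel stored row by row. The sketch below shows that indexing in plain Python. The helper names channel_pixel and to_channels are hypothetical and are not part of the article's program.

```python
def channel_pixel(image, channel, row, col):
    """Return one value from a flat 3,072-value image.

    channel is 0 (red), 1 (green) or 2 (blue); row and col are 0 to 31.
    """
    return image[channel * 1024 + row * 32 + col]

def to_channels(image):
    """Split a flat 3,072-value image into three 32 x 32 grids (R, G, B)."""
    grids = []
    for c in range(3):
        start = c * 1024  # offset of this channel in the flat list
        rows = [list(image[start + r * 32 : start + (r + 1) * 32])
                for r in range(32)]
        grids.append(rows)
    return grids
```

The result of to_channels has shape 3 x 32 x 32 (channels first), which is the channel ordering that PyTorch image code usually expects.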

To convert the CIFAR-10 images from binary pickle format to text, you need to write a short Python program. My article presents such a program and explains how to modify it to suit any scenario. After unpickling the source data into pixels and labels, the key lines of code are:

fn = ".\\cifar10_train_5000.txt"  # file to save to
with open(fn, 'w', encoding='utf-8') as fout:
  for i in range(n_images):  # one line per image
    for j in range(3072):  # write the pixels
      fout.write(str(pixels[i][j]) + ",")
    fout.write(str(labels[i]) + "\n")  # write the label
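Each line of the resulting text file holds 3,072 comma-delimited pixel values followed by the class label. As a rough sketch of the other direction, the code below reads such a file back into Python lists and scales the pixels to the range 0.0 to 1.0. The function name read_cifar_text and the choice to divide by 255 are my assumptions for illustration, not necessarily what the article does.

```python
def read_cifar_text(path, n_pixels=3072):
    """Read lines of comma-delimited pixel values followed by a class label."""
    images, labels = [], []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            tokens = line.split(",")
            # Scale the 0-255 integers to floats in [0.0, 1.0].
            images.append([int(t) / 255.0 for t in tokens[:n_pixels]])
            # The label is the single value after the pixels.
            labels.append(int(tokens[n_pixels]))
    return images, labels
```

The two lists can then be converted to tensors, for example with torch.tensor(), inside a PyTorch Dataset class.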

I don’t always enjoy working with raw data — I have more fun with algorithms. But the CIFAR-10 data is kind of interesting.

CIFAR images are for research but I enjoy looking at pulp science fiction novel cover images just for fun. Left: By artist Jack Gaughan. Center: By artist Gene Szafran. Right: By artist Richard Powers.