Fetching CIFAR-10 Data and Saving as a Text File

The CIFAR-10 image dataset is often used for machine learning image recognition experiments. The dataset has 60,000 crude color images (50,000 for training and 10,000 for testing). Each image is 32×32 pixels, and each pixel has three RGB channel values, each between 0 and 255. Each image belongs to one of 10 classes, labeled 0 to 9: plane, car, bird, cat, deer, dog, frog, horse, ship, truck.
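To make the label numbering concrete, here is a minimal sketch of a label-to-name lookup. The names CIFAR10_CLASSES, class_name, and VALUES_PER_IMAGE are my own hypothetical helpers, not part of the dataset; the short class names simply follow the list above.

```python
# Class names in label order (0 to 9), following the list in the text.
# CIFAR10_CLASSES and class_name are hypothetical helper names.
CIFAR10_CLASSES = ["plane", "car", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]

def class_name(label):
    """Return the class name for an integer label 0 to 9."""
    if not 0 <= label < len(CIFAR10_CLASSES):
        raise ValueError("label must be between 0 and 9")
    return CIFAR10_CLASSES[label]

# Values per image: 32 x 32 pixels, 3 color channel values per pixel.
VALUES_PER_IMAGE = 32 * 32 * 3

print(class_name(6))      # frog
print(VALUES_PER_IMAGE)   # 3072
```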

The PyTorch (through torchvision) and Keras neural network libraries have built-in CIFAR-10 datasets. But I don’t like to use built-in datasets because data access becomes a magic black box, which hides important information that’s needed if you ever want to work with real data.

The first CIFAR-10 image is a creepy jungle frog with red eyes.

I wrote a demo that fetches source CIFAR-10 image data in binary format, extracts it into memory, then writes the data to a text file.

The source data is at cs.toronto.edu/~kriz/cifar.html. There are three data format versions: Python, Matlab, raw binary. The Python version is easiest to deal with. When you click on the link, you’ll download a gzipped tar file named cifar-10-python.tar.gz. If you unzip the file (using the 7-Zip utility program), you get a directory named cifar-10-batches-py (a horrible name, because it’s a data directory, not a Python program). Inside that directory are six binary data files named data_batch_1, data_batch_2, data_batch_3, data_batch_4, data_batch_5, test_batch. Each file holds 10,000 images in Python “pickle” format.
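If you'd rather not use 7-Zip, the archive can also be unpacked from Python with the standard tarfile module. This is a minimal sketch, not what my demo does: the function name extract_cifar10 is hypothetical, and it assumes cifar-10-python.tar.gz has already been downloaded to the current directory.

```python
import tarfile
from pathlib import Path

def extract_cifar10(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir.

    Returns the sorted names of the regular files in the archive.
    extract_cifar10 is a hypothetical helper name.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Newer Python versions support extraction filters; use the safe
    # "data" filter when it is available, otherwise the old default.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:gz") as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
        tar.extractall(dest, **kwargs)
    return sorted(names)

# Assumption: the archive sits in the current directory.
archive = Path("cifar-10-python.tar.gz")
if archive.exists():
    for name in extract_cifar10(archive):
        print(name)
```

The existence check just keeps the sketch from failing when the archive hasn't been downloaded yet.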

My demo program reads the 10,000 images in data_batch_1 into a Python dictionary object, then pulls out the first 3 images and saves them to a text file named cifar10_3.txt. After the text file is saved, to verify everything worked as expected, the demo reads the three images back from the saved file and displays the first image (class = 6, a frog).

There are a lot of details in the demo program. The images are stored one image per line in the text file. Each image has 3,072 values (3 RGB values for each of the 1,024 pixels): the first 1,024 values are the red channel, the next 1,024 are green, and the last 1,024 are blue. The class label, 0 to 9, is the last value on each line.
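To illustrate the one-image-per-line layout, here is a minimal sketch that parses a single line back into an image array and a label. The function name parse_line is hypothetical, and the line is synthetic (not real CIFAR-10 data). It assumes the 3,072 pixel values are ordered as red, green, then blue planes, which is the order the demo code uses.

```python
import numpy as np

def parse_line(line):
    """Split one text line into a (32, 32, 3) image array and an int label.

    parse_line is a hypothetical helper name.
    """
    values = [int(v) for v in line.strip().split(",")]
    if len(values) != 3073:
        raise ValueError("expected 3072 pixel values plus 1 label")
    pixels = np.array(values[:3072], dtype=np.int64)
    label = values[3072]  # class label is the last value on the line
    # Assumed order: 1024 red, 1024 green, 1024 blue, each 32 x 32.
    img = pixels.reshape(3, 32, 32).transpose(1, 2, 0)
    return img, label

# Synthetic line: every pixel is (R=10, G=20, B=30), label is 6.
demo_line = ",".join(["10"] * 1024 + ["20"] * 1024 + ["30"] * 1024 + ["6"])
img, label = parse_line(demo_line)
print(img.shape)   # (32, 32, 3)
print(img[0, 0])   # [10 20 30]
print(label)       # 6
```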

Three of my favorite science fiction movies that feature frogs. Left: “Frogs” (1972) tells a story about Nature striking back against pollution. There are no giant frogs as the movie poster suggests there are. Center: “Love and Monsters” (2020) tells a story about what happens when an asteroid releases chemical fallout — giant mutated creatures including a nasty frog. Right: “The Maze” (1953) is a rather obscure movie that takes place in a gothic mansion that has a large hedge maze. It’s not a good idea to go wandering around in that maze at night.

Demo program code:

# unpickle_cifar10.py

import numpy as np
import pickle
import matplotlib.pyplot as plt

print("\nBegin demo ")

print("\nLoading first 10,000 CIFAR-10 images into dict ")
file = ".\\cifar-10-batches-py\\data_batch_1"
with open(file, 'rb') as fin:
  # encoding='bytes' makes the dictionary keys byte strings
  batch = pickle.load(fin, encoding='bytes')  # don't shadow built-in dict

n_images = 3
print("\nWriting first " + str(n_images) + \
  " images to text file ")
# keys: b'batch_label' b'labels' b'data' b'filenames'

labels = batch[b'labels']  # 10,000 labels
pixels = batch[b'data']    # 10,000 rows 3,072 pixels

fn = ".\\cifar10_3.txt"
with open(fn, 'w', encoding='utf-8') as fout:
  for i in range(n_images):  # one image per line
    for j in range(3072):    # 1024 R, 1024 G, 1024 B values
      val = pixels[i][j]
      fout.write(str(val) + ",")
    fout.write(str(labels[i]) + "\n")  # label is last value
print("Done ")

print("\nDisplaying first image in saved file: ")
data = np.loadtxt(fn, delimiter=",",
  usecols=range(0,3072), dtype=np.int64)

# quick but rather tricky approach
# img = data[0].reshape(3,32,32).transpose([1, 2, 0])

# more clear I think
pxls_R = data[0][0:1024].reshape(32,32)  # not last val
pxls_G = data[0][1024:2048].reshape(32,32)
pxls_B = data[0][2048:3072].reshape(32,32)

img = np.dstack((pxls_R, pxls_G, pxls_B))
plt.imshow(img)
plt.show()

print("\nEnd demo ")
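The demo shows two ways to build the 32×32×3 image array: the one-line reshape-and-transpose, and the three explicit channel slices. Here is a short sketch that checks the two approaches give the same result. It uses a synthetic row of random values as a stand-in for one image, so it runs without the CIFAR-10 files.

```python
import numpy as np

# Synthetic stand-in for one image row: 3072 values in 0..255.
rng = np.random.default_rng(0)
row = rng.integers(0, 256, size=3072)

# Approach 1: reshape into 3 channel planes, then move channels last.
img_a = row.reshape(3, 32, 32).transpose([1, 2, 0])

# Approach 2: slice out each channel plane and stack along depth.
pxls_R = row[0:1024].reshape(32, 32)
pxls_G = row[1024:2048].reshape(32, 32)
pxls_B = row[2048:3072].reshape(32, 32)
img_b = np.dstack((pxls_R, pxls_G, pxls_B))

same = np.array_equal(img_a, img_b)
print(img_a.shape, img_b.shape)  # (32, 32, 3) (32, 32, 3)
print(same)                      # True
```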