Converting Fashion-MNIST Binary Files to Text Files

The MNIST (Modified National Institute of Standards and Technology) dataset contains images of handwritten digits from ‘0’ to ‘9’. Each image is 28 by 28 pixels and each pixel is a grayscale value between 0 and 255.

The MNIST data is stored in a custom binary format that’s not directly usable. Furthermore, the pixel values and class label values are stored in two different files. There are 60,000 MNIST training images and 10,000 test images.
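To make the binary layout a bit more concrete, here is a rough sketch that reads the two file headers. It assumes the commonly documented IDX layout: big-endian 32-bit integers, with magic number 2051 for the pixels file and 2049 for the labels file. The function names read_image_header and read_label_header are hypothetical, chosen only for illustration.

```python
import struct

def read_image_header(f):
  # pixels file header: magic, count, rows, cols (16 bytes total)
  magic, n, rows, cols = struct.unpack(">IIII", f.read(16))
  if magic != 2051:  # assumed magic number for image files
    raise ValueError("unexpected magic number %d" % magic)
  return n, rows, cols

def read_label_header(f):
  # labels file header: magic, count (8 bytes total)
  magic, n = struct.unpack(">II", f.read(8))
  if magic != 2049:  # assumed magic number for label files
    raise ValueError("unexpected magic number %d" % magic)
  return n
```

Everything after the header is raw unsigned bytes, one per pixel or one per label, which is why a converter can skip the first 16 and 8 bytes and then read one byte at a time.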

I always convert the MNIST data from its raw binary format into train and test text files. I usually put each image on one line, where the first 784 values are the pixels and the last value is the digit/label. See jamesmccaffrey.wordpress.com/2022/01/21/working-with-mnist-data/.
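As a small illustration of that layout, the sketch below splits one text line back into its pixels and label. It assumes the values are tab-delimited; parse_line is a hypothetical helper, not part of any library.

```python
def parse_line(line):
  # assumed format: 784 pixel values, then one label, tab-delimited
  vals = [int(v) for v in line.strip().split("\t")]
  if len(vals) != 785:
    raise ValueError("expected 785 values, got %d" % len(vals))
  return vals[:784], vals[784]  # (pixels, label)
```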

The main problem with the MNIST dataset is that it’s too easy to create a good classifier. You can easily get a model with well over 99% accuracy.

The Fashion-MNIST dataset was designed to be a drop-in replacement for MNIST. Fashion-MNIST has the same file format, image size, and train/test split as MNIST, but each image is one of ten kinds of clothing item:

0   T-shirt/top
1   Trouser
2   Pullover
3   Dress
4   Coat
5   Sandal
6   Shirt
7   Sneaker
8   Bag
9   Ankle boot

Fashion-MNIST is at github.com/zalandoresearch/fashion-mnist. I decided to see if my utility program that converts MNIST binary to text could be adapted to do the same for Fashion-MNIST. Bottom line: yes, creating Fashion-MNIST text files is almost exactly the same as creating MNIST text files.

Briefly:

1. manually download four gzipped-binary files from
   github.com/zalandoresearch/fashion-mnist/tree
   /master/data/fashion 
2. use 7-Zip to unzip files, add ".bin" extension
3. determine format you want and modify script
4. run the script
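Step 2 can also be done without 7-Zip. This is a minimal sketch using Python's standard gzip module; the function name gunzip_to_bin and the example file locations are assumptions for illustration.

```python
import gzip
import shutil

def gunzip_to_bin(gz_path, bin_path):
  # decompress gz_path and write the raw bytes to bin_path
  with gzip.open(gz_path, "rb") as src, open(bin_path, "wb") as dst:
    shutil.copyfileobj(src, dst)

# example call (assumed file locations):
# gunzip_to_bin("train-images-idx3-ubyte.gz",
#   ".\\UnzippedBinary\\train-images-idx3-ubyte.bin")
```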

The script works like this:

open pixels binary file for reading
open labels binary file for reading
open destination file for writing
read and discard file header info

set number images wanted
loop number images times
  get label value from label file
  convert label from binary to text
  loop 784 times
    get pixel value from pixels file
    convert from binary to text
    write pixel value to file
  end-loop
  write label value
  write newline
end-loop
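The same logic can be written without the byte-by-byte loops by reading each file in one chunk with NumPy. This is a sketch of an alternative, not the script used in this post; convert_fast is a hypothetical name, and it assumes the 16-byte and 8-byte headers and that each file holds at least n_images items.

```python
import numpy as np

def convert_fast(img_file, label_file, txt_file, n_images):
  with open(img_file, "rb") as f:
    f.read(16)  # discard pixels file header
    pixels = np.frombuffer(f.read(n_images * 784), dtype=np.uint8)
  with open(label_file, "rb") as f:
    f.read(8)   # discard labels file header
    labels = np.frombuffer(f.read(n_images), dtype=np.uint8)

  pixels = pixels.reshape(n_images, 784)     # one row per image
  data = np.column_stack((pixels, labels))   # label becomes col 784
  np.savetxt(txt_file, data, fmt="%d", delimiter="\t")
```

The output has the same shape as the loop version: 784 pixel values followed by the label on each line, tab-delimited.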

I intend to use the Fashion-MNIST dataset to do some experiments with warm-start training: train an MNIST classifier model from scratch, put the MNIST model weights in an empty Fashion-MNIST classifier, train the Fashion-MNIST model (warm-start), and see if the resulting classifier is better than if the classifier had been trained from scratch.



There is a long and fascinating history of computer science related to chess. There is not a long history of fashion related to computer science. Here are three examples of fashion inspired by chess.


Code. Replace “lt”, “gt”, “lte”, “gte” with Boolean operator symbols.

# converter_f-mnist.py
# Anaconda3-2020.02 - Python 3.7.6

import numpy as np
import matplotlib.pyplot as plt

# convert Fashion MNIST binary to text file; 
# combine pixels and labels
# target format:
# pixel_1 (tab) pixel_2 (tab) . . pixel_784 (tab) label

# 0 = T-shirt/top, 1 = Trouser, 2 = Pullover
# 3 = Dress, 4 = Coat, 5 = Sandal, 6 = Shirt
# 7 = Sneaker, 8 = Bag, 9 = Ankle boot.

# 1. manually download four gzipped-binary files from
# github.com/zalandoresearch/fashion-mnist/tree
#   /master/data/fashion 
# 2. use 7-Zip to unzip files, add ".bin" extension
# 3. determine format you want and modify script

def convert(img_file, label_file, txt_file, n_images):
  print("\nOpening binary pixels and labels files ")
  lbl_f = open(label_file, "rb")   # F-MNIST has labels
  img_f = open(img_file, "rb")     # and pixel vals separate
  print("Opening destination text file ")
  txt_f = open(txt_file, "w")      # output file to write to

  print("Discarding binary pixel and label files headers ")
  img_f.read(16)   # discard header info
  lbl_f.read(8)    # discard header info

  print("\nReading binary files, writing to text file ")
  print("Format: 784 pixel vals then label val, tab delimited ")
  for i in range(n_images):   # number images requested 
    lbl = ord(lbl_f.read(1))  # get label (one byte, 0 to 9)
    for j in range(784):  # get 784 vals from the image file
      val = ord(img_f.read(1))
      txt_f.write(str(val) + "\t") 
    txt_f.write(str(lbl) + "\n")
  img_f.close(); txt_f.close(); lbl_f.close()
  print("\nDone ")

def display_from_file(txt_file, idx):
  all_data = np.loadtxt(txt_file, delimiter="\t",
    usecols=range(0,785), dtype=np.int64)

  x_data = all_data[:,0:784]  # all rows, 784 cols
  y_data = all_data[:,784]    # all rows, last col

  label = y_data[idx]
  print("label = ", str(label), "\n")

  pixels = x_data[idx]
  pixels = pixels.reshape((28,28))
  for i in range(28):
    for j in range(28):
      # print("%.2X" % pixels[i,j], end="")
      print("%3d" % pixels[i,j], end="")
      print(" ", end="")
    print("")

  plt.tight_layout()
  plt.imshow(pixels, cmap=plt.get_cmap('gray_r'))
  plt.show()  

# -----------------------------------------------------------

def main():
  n_images = 1000
  print("\nCreating %d F-MNIST train images from binary files " \
    % n_images)
  convert(".\\UnzippedBinary\\train-images-idx3-ubyte.bin",
          ".\\UnzippedBinary\\train-labels-idx1-ubyte.bin",
          "f-mnist_train_1000.txt", n_images)

  n_images = 100
  print("\nCreating %d F-MNIST test images from binary files " \
    % n_images)
  convert(".\\UnzippedBinary\\t10k-images-idx3-ubyte.bin",
          ".\\UnzippedBinary\\t10k-labels-idx1-ubyte.bin",
          "f-mnist_test_100.txt", n_images)

  print("\nShowing train image [0]: ")
  img_file = ".\\f-mnist_train_1000.txt"
  display_from_file(img_file, idx=0)  # first image
  
if __name__ == "__main__":
  main()