The MNIST (Modified National Institute of Standards and Technology) dataset contains images of handwritten digits from ‘0’ to ‘9’. Each image is 28 by 28 pixels and each pixel is a grayscale value between 0 and 255.
The MNIST data is stored in a custom binary format that’s not directly usable. Furthermore, the pixel values and class label values are stored in two different files. There are 60,000 MNIST training images and 10,000 test images.
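The binary layout is commonly documented as the IDX format: a short big-endian header followed by raw unsigned bytes. Here is a minimal sketch for inspecting a header, assuming that layout (a 4-byte magic number whose last byte is the number of dimensions, then one 4-byte size per dimension). The function name read_idx_header is hypothetical and is not part of the conversion script.

```python
import struct

def read_idx_header(path):
    """Return (magic, dims) from an IDX file header (hypothetical helper)."""
    with open(path, "rb") as f:
        # first 4 bytes: two zero bytes, a data-type code, the number of dimensions
        _, _, dtype_code, n_dims = struct.unpack(">BBBB", f.read(4))
        magic = (dtype_code << 8) | n_dims   # 2051 for images, 2049 for labels
        # each dimension size is a 4-byte big-endian unsigned int
        dims = struct.unpack(">" + "I" * n_dims, f.read(4 * n_dims))
    return magic, dims
```

Under this assumption an images file has a 16-byte header (magic, count, rows, columns) and a labels file has an 8-byte header (magic, count), which matches the number of bytes the script discards.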
I always convert MNIST from its raw binary format into train and test text files. I usually put each image on one line, where the first 784 values are the pixels and the last value is the digit/label. See jamesmccaffrey.wordpress.com/2022/01/21/working-with-mnist-data/.
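To make the line layout concrete, here is a small sketch that reads one line back into pixels and a label. It assumes tab-delimited values with the label last, which is the format the script in this post writes. The helper name parse_line is hypothetical.

```python
def parse_line(line, n_pixels=784):
    """Split one text line into (pixels, label); hypothetical helper."""
    fields = line.strip().split("\t")
    if len(fields) != n_pixels + 1:
        raise ValueError("expected %d values, got %d" %
                         (n_pixels + 1, len(fields)))
    values = [int(v) for v in fields]
    return values[:n_pixels], values[n_pixels]

# tiny demo with a 4-pixel "image" instead of 784
pixels, label = parse_line("0\t255\t17\t3\t9\n", n_pixels=4)
print(pixels, label)  # [0, 255, 17, 3] 9
```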
The main problem with the MNIST dataset is that it’s too easy to create a good classifier. You can easily get a model with well over 99% accuracy.
The Fashion-MNIST dataset was designed to be a drop-in replacement for MNIST. Fashion-MNIST is identical to MNIST in format and size, except that each image shows one of ten types of clothing:
0 T-shirt/top
1 Trouser
2 Pullover
3 Dress
4 Coat
5 Sandal
6 Shirt
7 Sneaker
8 Bag
9 Ankle boot
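When reporting results it helps to map a label index back to its class name. A minimal sketch using the ten names listed above; the names CLASS_NAMES and label_name are hypothetical.

```python
# index in the list = Fashion-MNIST label value
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

def label_name(label):
    """Return the clothing class name for a label 0-9."""
    return CLASS_NAMES[int(label)]

print(label_name(9))  # Ankle boot
```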
Fashion-MNIST is at github.com/zalandoresearch/fashion-mnist. I decided to see if my utility program that converts MNIST binary to text could be adapted to do the same for Fashion-MNIST. Bottom line: yes, creating Fashion-MNIST text files is almost exactly the same as creating MNIST text files.
Briefly:
1. manually download four gzipped-binary files from github.com/zalandoresearch/fashion-mnist/tree/master/data/fashion
2. use 7-Zip to unzip files, add ".bin" extension
3. determine format you want and modify script
4. run the script
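Step 2 can also be scripted. This is a sketch that uses Python's gzip module in place of 7-Zip, assuming the downloaded files end in ".gz". The function name gunzip_to_bin and the folder name in the usage note are hypothetical.

```python
import gzip
import os
import shutil

def gunzip_to_bin(gz_path, out_dir):
    """Decompress one .gz file into out_dir and add a .bin extension."""
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(gz_path)
    if base.endswith(".gz"):
        base = base[:-3]                      # drop the .gz suffix
    out_path = os.path.join(out_dir, base + ".bin")
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)          # stream, no full load in memory
    return out_path
```

For example, gunzip_to_bin("train-images-idx3-ubyte.gz", "UnzippedBinary") would produce UnzippedBinary/train-images-idx3-ubyte.bin, the name the script expects.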
The script works like this:
open pixels binary file for reading
open labels binary file for reading
open destination file for writing
read and discard file header info
set number images wanted
loop number images times
  get label value from label file
  convert label from binary to text
  loop 784 times
    get pixel value from pixels file
    convert from binary to text
    write pixel value to file
  end-loop
  write label value
  write newline
end-loop
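The same logic can be sketched without byte-at-a-time reads by loading each file fully and slicing. This is an alternative, not the script used in this post; it assumes a 16-byte image header, an 8-byte label header, and one unsigned byte per pixel and per label. The function name convert_bulk is hypothetical.

```python
def convert_bulk(img_file, label_file, txt_file, n_images, n_pixels=784):
    """Write n_images lines of tab-delimited pixels followed by the label."""
    with open(img_file, "rb") as f:
        img_bytes = f.read()[16:]     # skip the 16-byte image header
    with open(label_file, "rb") as f:
        lbl_bytes = f.read()[8:]      # skip the 8-byte label header
    with open(txt_file, "w") as out:
        for i in range(n_images):
            pixels = img_bytes[i * n_pixels:(i + 1) * n_pixels]
            # iterating a bytes object yields ints 0..255
            fields = [str(p) for p in pixels] + [str(lbl_bytes[i])]
            out.write("\t".join(fields) + "\n")
```

The output lines have the same layout as the loop above produces: the pixel values, then the label, all tab delimited.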
I intend to use the Fashion-MNIST dataset to do some experiments with warm-start training: train an MNIST classifier model from scratch, put the MNIST model weights in an empty Fashion-MNIST classifier, train the Fashion-MNIST model (warm-start), and see if the resulting classifier is better than if the classifier had been trained from scratch.

There is a long and fascinating history of computer science related to chess. There is not a long history of fashion related to computer science. Here are three examples of fashion inspired by chess.
Code. Replace “lt”, “gt”, “lte”, “gte” with Boolean operator symbols.
# converter_f-mnist.py
# Anaconda3-2020.02 - Python 3.7.6
import numpy as np
import matplotlib.pyplot as plt
# convert Fashion MNIST binary to text file;
# combine pixels and labels
# target format:
# pixel_1 (tab) pixel_2 (tab) . . pixel_784 (tab) digit
# 0 = T-shirt/top, 1 = Trouser, 2 = Pullover
# 3 = Dress, 4 = Coat, 5 = Sandal, 6 = Shirt
# 7 = Sneaker, 8 = Bag, 9 = Ankle boot.
# 1. manually download four gzipped-binary files from
# github.com/zalandoresearch/fashion-mnist/tree
# /master/data/fashion
# 2. use 7-Zip to unzip files, add ".bin" extension
# 3. determine format you want and modify script
def convert(img_file, label_file, txt_file, n_images):
  print("\nOpening binary pixels and labels files ")
  lbl_f = open(label_file, "rb")  # F-MNIST has labels
  img_f = open(img_file, "rb")    # and pixel vals separate
  print("Opening destination text file ")
  txt_f = open(txt_file, "w")     # output file to write to

  print("Discarding binary pixel and label files headers ")
  img_f.read(16)  # discard header info
  lbl_f.read(8)   # discard header info

  print("\nReading binary files, writing to text file ")
  print("Format: 784 pixel vals then label val, tab delimited ")
  for i in range(n_images):   # number images requested
    lbl = ord(lbl_f.read(1))  # get label (one byte, as int 0-9)
    for j in range(784):      # get 784 vals from the image file
      val = ord(img_f.read(1))
      txt_f.write(str(val) + "\t")
    txt_f.write(str(lbl) + "\n")
  img_f.close(); txt_f.close(); lbl_f.close()
  print("\nDone ")
def display_from_file(txt_file, idx):
  all_data = np.loadtxt(txt_file, delimiter="\t",
    usecols=range(0,785), dtype=np.int64)

  x_data = all_data[:,0:784]  # all rows, 784 cols
  y_data = all_data[:,784]    # all rows, last col

  label = y_data[idx]
  print("label = ", str(label), "\n")

  pixels = x_data[idx]
  pixels = pixels.reshape((28,28))
  for i in range(28):
    for j in range(28):
      # print("%.2X" % pixels[i,j], end="")
      print("%3d" % pixels[i,j], end="")
      print(" ", end="")
    print("")

  plt.tight_layout()
  plt.imshow(pixels, cmap=plt.get_cmap('gray_r'))
  plt.show()
# -----------------------------------------------------------
# -----------------------------------------------------------
def main():
  n_images = 1000
  print("\nCreating %d F-MNIST train images from binary files " \
    % n_images)
  convert(".\\UnzippedBinary\\train-images-idx3-ubyte.bin",
          ".\\UnzippedBinary\\train-labels-idx1-ubyte.bin",
          "f-mnist_train_1000.txt", n_images)

  n_images = 100
  print("\nCreating %d F-MNIST test images from binary files " % \
    n_images)
  convert(".\\UnzippedBinary\\t10k-images-idx3-ubyte.bin",
          ".\\UnzippedBinary\\t10k-labels-idx1-ubyte.bin",
          "f-mnist_test_100.txt", n_images)

  print("\nShowing train image [0]: ")
  img_file = ".\\f-mnist_train_1000.txt"
  display_from_file(img_file, idx=0)  # first image

if __name__ == "__main__":
  main()
