Converting an Image into Spike Trains for a Spiking Neural Network

A spiking neural network (SNN) is one where the artificial neurons model biological neurons more closely than the neurons in a traditional artificial neural network (ANN) do. An SNN accepts a stream of 0s and 1s called a spike train, whereas an ANN accepts floating-point values such as 0.3827.

Suppose you want to create a neural classifier for images. Suppose the image is 28 by 28 = 784 pixels and each pixel value is a number (or possibly three numbers representing red, green, blue). If you use a traditional ANN you can just create a network with 784 input nodes.
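To make the 784-input idea concrete, here is a minimal sketch that flattens a 28 by 28 grayscale image into a vector of 784 floating-point inputs. The random stand-in image and the scaling to the range 0.0 to 1.0 are my assumptions for illustration, and the variable names are hypothetical.

```python
import numpy as np

# Stand-in for a real 28 x 28 grayscale image (random pixel values 0..255).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Flatten row by row into 784 values, one per ANN input node,
# and scale to [0.0, 1.0] (a common but optional normalization).
inputs = image.reshape(-1).astype(np.float32) / 255.0

print(inputs.shape)
print(inputs[:5])
```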

But if you want to use an SNN, it’s not entirely obvious how to represent the image input data as a spike train. A well-known benchmark dataset for image recognition is the MNIST (Modified National Institute of Standards and Technology) handwritten digit database. The dataset consists of 70,000 images, each 28 x 28 pixels and each showing a single handwritten digit (0 to 9). Each pixel is a grayscale intensity value between 0 = 00h (no stroke) and 255 = FFh (dark stroke). The dataset is divided into a 60,000-item set for training (roughly 6,000 instances of each digit) and a 10,000-item set for testing (roughly 1,000 instances of each digit). An example of a ‘5’ digit is shown below.

This is the first image in the MNIST training set.

One approach for representing an MNIST image using spike trains is to create a set of 28 * 28 = 784 spike trains, one per pixel, where each spike train has a firing rate that is proportional to the pixel value, as demonstrated in my blog post at https://jamesmccaffreyblog.com/2019/12/13/generating-artificial-spike-trains-for-spiking-neural-networks/. For example, pixels with small values will be assigned a small firing rate and generate mostly 0 values in their associated spike trains. Pixel values that are moderate in magnitude, such as 126 = 7Eh, will be assigned a moderate firing rate and generate roughly 50% spiking 1 values and 50% non-spiking 0 values. Pixel values that are large, such as 255 = FFh, will be assigned a large firing rate and generate spike trains that are mostly 1 values.
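The rate-coding idea described above can be sketched in a few lines of Python. This is one possible approach, not necessarily the one used in the linked post: I'm assuming that at each time step a pixel spikes independently with probability pixel / 255 (Bernoulli sampling). The function names pixel_to_spike_train and image_to_spike_trains, and the n_steps parameter, are hypothetical.

```python
import numpy as np

def pixel_to_spike_train(pixel, n_steps, rng):
    """Return n_steps values of 0 or 1 for one pixel.

    Assumption: each time step spikes independently with
    probability pixel / 255 (Bernoulli rate coding).
    """
    if not 0 <= pixel <= 255:
        raise ValueError("pixel must be in [0, 255]")
    p = pixel / 255.0
    # rng.random() draws from [0, 1), so p = 0 never spikes
    # and p = 1 always spikes.
    return (rng.random(n_steps) < p).astype(np.uint8)

def image_to_spike_trains(image, n_steps, rng):
    """Return an array of shape (n_pixels, n_steps), one row per pixel."""
    probs = np.asarray(image, dtype=np.float64).reshape(-1) / 255.0
    draws = rng.random((probs.size, n_steps))
    return (draws < probs[:, None]).astype(np.uint8)

rng = np.random.default_rng(0)
for pixel in (0, 126, 255):
    train = pixel_to_spike_train(pixel, 20, rng)
    print(pixel, train, "fraction of 1s =", train.mean())
```

For a 28 x 28 image, image_to_spike_trains returns 784 rows, so each row can be fed to one SNN input neuron over n_steps time steps.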

Because each spike train is generated randomly, a short train gives only a noisy picture of its pixel’s firing rate. You’d need to create a long spike train for each pixel, with many time steps rather than just a handful. This corresponds to the way the retina of a person with their eyes open continuously processes a stream of input information.
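To see why length matters, here is a small experiment, again assuming the Bernoulli scheme where a pixel spikes with probability pixel / 255 at each time step. It measures how much the observed fraction of spikes varies from run to run for different train lengths. The helper name estimated_rate and the choice of 200 repetitions are hypothetical.

```python
import numpy as np

def estimated_rate(pixel, n_steps, rng):
    """Generate one spike train and return its observed fraction of 1s."""
    p = pixel / 255.0
    spikes = rng.random(n_steps) < p
    return spikes.mean()

rng = np.random.default_rng(0)
pixel = 126
print("target rate =", pixel / 255.0)

for n_steps in (10, 100, 1000, 10000):
    # Repeat the experiment to see how much the estimate wobbles.
    estimates = [estimated_rate(pixel, n_steps, rng) for _ in range(200)]
    print(n_steps, "steps: spread (std dev) =", np.std(estimates))
```

Under this scheme the spread shrinks in proportion to 1 / sqrt(n_steps), so cutting the noise in half takes about four times as many time steps.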

Three images that were used for the covers of 1960s era science fiction magazines by artist Ed Emshwiller (1925 – 1990). He won four Hugo awards for science fiction art. It’s unlikely that anyone in the 1960s could have predicted the image recognition technologies that exist today.