A common benchmark in machine learning is the Banknote Authentication Dataset. The data looks like:
3.6216, 8.6661, -2.8073, -0.44699, 0
4.5459, 8.1674, -2.4586, -1.46210, 0
. . .
-2.5419, -0.65804, 2.6842, 1.19520, 1
Each line of data represents a banknote (think euro or dollar bill). The first four values are the predictors, and the fifth value is 0 for authentic or 1 for forgery. The predictor values come from a digital image of each banknote and are the image variance, skewness, kurtosis, and entropy. Just what are these predictors?
I’ll explain what kurtosis is. You probably know that the variance of a set of numbers is a measure of how different the numbers are from their mean. If you graph your numbers, larger variance gives a more spread-out graph and smaller variance gives a less spread-out graph. Kurtosis extends this idea. You can loosely think of kurtosis as how “spiky” a graph of your data is. That’s not strictly correct, because kurtosis really measures how heavy the tails are rather than how sharp the peak is; see the Comment section below. A graph that looks exactly like the bell-shaped normal (Gaussian) distribution has kurtosis k = 0. As a rule of thumb, if a graph is more peaked or spiky than Gaussian, k is a number greater than 0, and if a graph is flatter than Gaussian, k is a number less than 0. Note that you can compute kurtosis without creating a graph. The word “kurtosis” comes directly from the ancient Greek, meaning “bulging” or “convex”.
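Here is a minimal sketch that makes the k = 0, k > 0, k < 0 cases concrete. It uses only the Python standard library, and the helper name excess_kurtosis(), the seed, and the sample size are my own choices for illustration. It draws samples from a normal distribution, a Laplace distribution (heavier tails than normal), a uniform distribution (flat), and the beta(0.5, 1) distribution mentioned in the comments, then computes the kurtosis of each.

```python
import random

def excess_kurtosis(xs):
    """Fisher kurtosis: average of ((x - mean) / sd)^4, minus 3."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n   # population variance
    m4 = sum((x - mean) ** 4 for x in xs) / n    # fourth central moment
    return m4 / (var * var) - 3.0

rng = random.Random(0)   # fixed seed (arbitrary) so runs are repeatable
n = 200_000              # sample size (arbitrary, large enough to be stable)

samples = {
    "normal": [rng.gauss(0.0, 1.0) for _ in range(n)],
    # the difference of two exponential draws follows a Laplace distribution
    "laplace": [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)],
    "uniform": [rng.random() for _ in range(n)],
    # the square of a uniform draw follows a beta(0.5, 1) distribution
    "beta(0.5, 1)": [rng.random() ** 2 for _ in range(n)],
}

results = {name: excess_kurtosis(xs) for name, xs in samples.items()}
for name, k in results.items():
    print(f"{name:>12}: k = {k:+.3f}")
```

The theoretical values are 0 for the normal, 3 for the Laplace, -1.2 for the uniform, and -6/7 for beta(0.5, 1), so the sample estimates should land close to those. The last case has a density that shoots up to infinity near 0 and still has negative kurtosis, which is why "spikiness" is only a loose mental picture.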

Left: Examples that show kurtosis is a measure of how spiky a graph of a set of data is. Right: The original, uncompressed image of Abraham Lincoln that I used.
The math equation to compute kurtosis is:
k = average([(x - u) / sd]^4) - 3.0
In words, first compute the mean (u) and the standard deviation (sd) of all the numbers x, where sd is the population standard deviation (divide by n, not n - 1). Then compute (x - u) / sd for each of the numbers. Then raise each of those values to the fourth power. Finally, take the average of all those fourth powers and subtract 3.
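To trace those steps by hand, here is a small worked example in plain Python. The five data values are made up so that the arithmetic is easy to check; they are not from the banknote data or the image.

```python
import math

data = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up values for illustration

n = len(data)
u = sum(data) / n                                      # mean = 3.0
sd = math.sqrt(sum((x - u) ** 2 for x in data) / n)    # population sd = sqrt(2)
z = [(x - u) / sd for x in data]                       # (x - u) / sd for each value
z4 = [v ** 4 for v in z]                               # roughly 4, 0.25, 0, 0.25, 4
k = sum(z4) / n - 3.0                                  # 8.5 / 5 - 3

print(round(k, 6))   # -1.3
```

The result is negative because five evenly spaced values have no extreme values in the tails relative to their spread.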
Put another way, kurtosis is just one of many summary statistics that describe a set of numbers. So, the kurtosis of an image is just the kurtosis computed on the image’s pixel values.
I decided to explore by computing the kurtosis of an image in three ways: using a custom Python function, using the built-in kurtosis() function in the scipy library, and using Excel.

I computed the kurtosis of a condensed version of Abe using the built-in scipy kurtosis() and a custom my_kurtosis() function.

I computed the kurtosis of compressed Abe using Excel too, in order to validate my code results.
I took a grayscale image of Abraham Lincoln from the Internet and used an online tool to condense the image to 8×8 pixels, where each pixel is a value between 0 and 255. Then I used another online tool to extract the 64 pixel values of the condensed image:
158 105  36  12   0  77 145 146
143  46 103 101 139  85  56 158
120 136 255 236 219 135  60 164
125 118 133 143 106  53  31 150
151 138 134 159 119 100  57 167
168 137 186 181 127  98 114 198
156 139 103 145  78  36 165 198
144 155  51  51  23  34 185 194
Next, I wrote a Python program that computed the kurtosis of the 64-byte image using scipy.stats.kurtosis() and also a custom my_kurtosis() function. In both cases I got k = -0.3697. My custom function is simple because I used NumPy to do most of the work:
import numpy as np

def my_kurtosis(d):
  # d is a NumPy array or matrix of values
  avg = np.mean(d)   # mean of all values
  sd = np.std(d)   # population sd of all values
  tmp = (d - avg) / sd   # standardized values, same shape as d
  p = np.mean(tmp * tmp * tmp * tmp)   # average fourth power (a scalar)
  return p - 3   # minus 3 is Fisher version
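For anyone who wants to reproduce the comparison, here is a sketch of what a complete program could look like. It is a reconstruction under assumptions rather than my exact demo code: the 64 values are the ones listed above, stored as an 8x8 NumPy array, and the variable names are just for illustration. One detail to watch is that scipy.stats.kurtosis() works along axis 0 by default, so for a 2-D array you need axis=None to get a single value over all pixels.

```python
import numpy as np
from scipy.stats import kurtosis

# the 64 pixel values of the condensed image, as an 8x8 array
pixels = np.array([
    [158, 105,  36,  12,   0,  77, 145, 146],
    [143,  46, 103, 101, 139,  85,  56, 158],
    [120, 136, 255, 236, 219, 135,  60, 164],
    [125, 118, 133, 143, 106,  53,  31, 150],
    [151, 138, 134, 159, 119, 100,  57, 167],
    [168, 137, 186, 181, 127,  98, 114, 198],
    [156, 139, 103, 145,  78,  36, 165, 198],
    [144, 155,  51,  51,  23,  34, 185, 194],
], dtype=np.float64)

def my_kurtosis(d):
    avg = np.mean(d)            # mean of all values
    sd = np.std(d)              # population sd (ddof=0 is the NumPy default)
    tmp = (d - avg) / sd        # standardized values
    return np.mean(tmp ** 4) - 3    # Fisher version

k_custom = my_kurtosis(pixels)

# axis=None computes over the whole array; the defaults fisher=True and
# bias=True correspond to the formula used in my_kurtosis()
k_scipy = kurtosis(pixels, axis=None)

# with the default axis=0 you get one kurtosis value per column instead
k_per_column = kurtosis(pixels)

print(f"custom: {k_custom:.4f}")
print(f"scipy:  {k_scipy:.4f}")
print("per-column shape:", k_per_column.shape)
```

Both values should agree with each other and come out near the k = -0.3697 reported above.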
The minor moral to this story is that in machine learning you’ll often come across new terminology and ideas, and you’ll feel like you’re sinking under the technical waves. But if you take some time, all ML ideas are understandable.
The most famous Greek god-hero is Hercules. There have been dozens of Hercules movies, including several where Hercules ends up in Atlantis (before it sank). Left: “Hercules and the Conquest of Atlantis” (1961) – Also known as “Hercules and the Captive Women”. Hercules is shipwrecked on Atlantis and rescues Princess Ismene. Center: “The Conqueror of Atlantis” (1965) – Hercules finds a scientifically advanced Atlantis under the Sahara desert and defeats bad guys and robots. Right: “Hercules: The Legendary Journeys” (1995-1999) – A TV series. In season 3, episode 22, Hercules is shipwrecked on Atlantis and tries to warn the citizens that over-mining has weakened the island’s foundation.

But you can have a distribution that is infinitely spiky but with negative kurtosis. Take the beta(.5,1) distribution, for example. Further, you can have a distribution that is perfectly flat over 99.999% of the data, but with infinite kurtosis. Kurtosis measures tails, not spikiness of the peak.
Peter, you are quite correct. There are several discussions on the Internet about the “peaked-ness controversy” interpretation of kurtosis. When I was a university teacher in the 1990s, my students would never understand the “tailed-ness” explanation, so I’d usually use the not-quite-correct but easier to visualize “spikiness” explanation. JM
If you use the normal q-q plot, “tailedness” is very easy to see …