The Gower distance is a metric that measures the dissimilarity of two items with mixed numeric and non-numeric data. Gower distance is also called Gower dissimilarity. One possible use of Gower distance is k-means clustering of mixed data, because k-means needs a numeric distance between data items.
Briefly, to compute the Gower distance between two items you compare each pair of corresponding elements and compute a term. If the element is numeric, the term is the absolute value of the difference divided by the range of that element over the dataset (largest value minus smallest value). If the element is non-numeric, the term is 1 if the elements are different or 0 if the elements are the same. The Gower distance is the average of the terms.
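Written as a formula, the description above might look like the following. The symbols are my own notation for illustration: x and y are the two items, p is the number of elements, and R_k is the range of numeric element k over the dataset.

```latex
d(x, y) = \frac{1}{p} \sum_{k=1}^{p} t_k,
\qquad
t_k =
\begin{cases}
  \dfrac{|x_k - y_k|}{R_k} & \text{if element } k \text{ is numeric} \\
  0 & \text{if element } k \text{ is non-numeric and } x_k = y_k \\
  1 & \text{if element } k \text{ is non-numeric and } x_k \ne y_k
\end{cases}
```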
Suppose you have four data items where each item is a person. There are 6 elements: age, race, height, income, IsMale, politic. The elements age, height, and income are numeric. Elements race, IsMale, and politic are non-numeric.
        Age     Race    Height  Income  IsMale  Politic
        (n)             (n)     (n)
[1]     22      1       3       0.39    TRUE    moderate
[2]     33      3       1       0.34    TRUE    liberal
[3]     52      1       2       0.51    FALSE   moderate
[4]     46      6       3       0.63    TRUE    conservative
range   30      NA      2       0.29    NA      NA
The distance between person [1] and person [2] is 0.590, calculated like so:
       Age  Race  Ht  Inc   Male  Politic
[1] = (22,  1,    3,  0.39, True, moderate)
[2] = (33,  3,    1,  0.34, True, liberal)

numeric:     abs(diff) / range
non-numeric: 0 if equal, 1 if different

dist([1], [2]) =
  Age:     abs((22 - 33) / 30)       = 0.367
  Race:    (different)               = 1
  Height:  abs((3 - 1) / 2)          = 1.000
  Inc:     abs((0.39 - 0.34) / 0.29) = 0.172
  IsMale:  (same)                    = 0
  Politic: (different)               = 1

  = (0.367 + 1 + 1.000 + 0.172 + 0 + 1) / 6
  = 3.539 / 6
  = 0.590
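The same calculation can be sketched in a few lines of Python. This is a minimal illustration of the steps described above, not a reference implementation. The function names (column_ranges, gower_distance) and the is_numeric flag list are my own assumptions, and the race codes are treated as categories even though they are stored as numbers.

```python
# The four data items from the example above.
# Columns: age, race, height, income, is_male, politic
data = [
    (22, 1, 3, 0.39, True, "moderate"),
    (33, 3, 1, 0.34, True, "liberal"),
    (52, 1, 2, 0.51, False, "moderate"),
    (46, 6, 3, 0.63, True, "conservative"),
]

# Which columns are numeric. Race is a category code, so it is flagged False.
is_numeric = [True, False, True, True, False, False]


def column_ranges(items, numeric_flags):
    """Return max - min for each numeric column, None for the others."""
    ranges = []
    for k, numeric in enumerate(numeric_flags):
        if numeric:
            values = [item[k] for item in items]
            ranges.append(max(values) - min(values))
        else:
            ranges.append(None)
    return ranges


def gower_distance(a, b, numeric_flags, ranges):
    """Average of the per-element terms for items a and b."""
    terms = []
    for k, numeric in enumerate(numeric_flags):
        if numeric:
            # Guard against a zero range (all values identical in that column).
            term = abs(a[k] - b[k]) / ranges[k] if ranges[k] else 0.0
        else:
            term = 0.0 if a[k] == b[k] else 1.0
        terms.append(term)
    return sum(terms) / len(terms)


ranges = column_ranges(data, is_numeric)
d12 = gower_distance(data[0], data[1], is_numeric, ranges)
print(f"dist([1], [2]) = {d12:.3f}")  # prints dist([1], [2]) = 0.590
```

Computing the ranges from the data rather than hard-coding them keeps the terms scaled relative to whatever dataset is passed in.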
Notice that each individual term will be between 0.0 and 1.0 inclusive. Therefore the Gower distance will always be between 0.0 and 1.0, where a distance of 0.0 means the two items are the same and a distance of 1.0 means the two items are as far apart as possible, relative to the source dataset.
The Gower distance can be used with purely numeric or purely non-numeric data, but for such scenarios there are better distance metrics available.
There are several variations of the Gower distance, so if you encounter it, you should read the documentation carefully. For example, for some scenarios you might want to weight each term to give more/less importance to that term.
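One way the weighting idea might look is a weighted average of the terms. This is a sketch of one possible variation, not a standard definition. The function name weighted_gower and the weight values are made up for illustration. The two items and the ranges are persons [1] and [2] and the range row from the example dataset.

```python
def weighted_gower(a, b, numeric_flags, ranges, weights):
    """Weighted average of per-element terms (one possible variation)."""
    total = 0.0
    for k, numeric in enumerate(numeric_flags):
        if numeric:
            term = abs(a[k] - b[k]) / ranges[k]
        else:
            term = 0.0 if a[k] == b[k] else 1.0
        total += weights[k] * term
    return total / sum(weights)


# Persons [1] and [2] and the range row from the example dataset.
p1 = (22, 1, 3, 0.39, True, "moderate")
p2 = (33, 3, 1, 0.34, True, "liberal")
is_numeric = [True, False, True, True, False, False]
ranges = [30, None, 2, 0.29, None, None]

equal = [1, 1, 1, 1, 1, 1]      # reduces to the plain Gower distance
age_heavy = [3, 1, 1, 1, 1, 1]  # hypothetical: age counts three times as much

d_equal = weighted_gower(p1, p2, is_numeric, ranges, equal)
d_age = weighted_gower(p1, p2, is_numeric, ranges, age_heavy)
print(round(d_equal, 3))
print(round(d_age, 3))
```

With equal weights the result matches the unweighted distance of 0.590. Dividing by the sum of the weights keeps the result between 0.0 and 1.0 as long as the weights are non-negative and not all zero.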
You can’t measure the emotional distance between two people. Three paintings illustrate this, including the famous “Nighthawks” by Edward Hopper (1882 – 1967).

Thanks for the post! Do you have a recommendation for purely non-numeric (categorical, ordinal) variables? 🙂
This is very tricky. For datasets that contain purely non-numeric data, in most scenarios my colleagues and I use a neural autoencoder. First you convert binary data to 0 or 1 (or possibly -1 or +1) and you convert categorical data to one-hot encoded form (for example, if “color” can be “red”, “blue”, or “green” then red = (1,0,0), blue = (0,1,0), green = (0,0,1)). Then you feed all the data items to an autoencoder, which will create a purely numeric representation of each data item. Then you can use the numeric representations to compute a difference (typically Euclidean distance).
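The encoding step described above can be sketched in a few lines of Python. This covers only the one-hot encoding and the final Euclidean distance. The autoencoder in between is left out, so the distance here is computed directly on the encoded vectors purely to show the mechanics. The helper names one_hot and euclidean are assumptions for illustration.

```python
import math


def one_hot(value, categories):
    """Return a list with 1 at the position of value and 0 elsewhere."""
    return [1 if value == c else 0 for c in categories]


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


colors = ["red", "blue", "green"]

red = one_hot("red", colors)      # [1, 0, 0]
blue = one_hot("blue", colors)    # [0, 1, 0]
green = one_hot("green", colors)  # [0, 0, 1]

# In the approach described above, vectors like these would first go through
# an autoencoder. Here they are compared directly.
print(red, blue, euclidean(red, blue))
```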
Years ago, I looked at problems like this very closely, for several years. It’s incredibly tricky.
See also https://jamesmccaffrey.wordpress.com/2021/10/04/computing-the-similarity-between-two-machine-learning-datasets-in-visual-studio-magazine/ for computing the difference between two datasets (as opposed to the difference between two items).