Quite some time ago, I was working with data clustering algorithms. Data clustering is the process of grouping data so that similar items end up in the same cluster and items in different clusters are dissimilar.
There are many different clustering algorithms. The two most common are k-means clustering (along with a minor variation called k-means++, which differs only in how the initial cluster centers are chosen) and Gaussian mixture model clustering. Both work only with strictly numeric data, so you can’t have categorical variables like sex = (male, female) or hair_color = (brown, blonde, black, red, gray).
Note that clustering non-numeric data is surprisingly tricky. I devised several clustering algorithms for non-numeric data, and wrote up technical articles:
Data Clustering Using Category Utility
https://msdn.microsoft.com/en-us/magazine/dn198247.aspx
Data Clustering Using Entropy Minimization
https://visualstudiomagazine.com/articles/2013/02/01/data-clustering-using-entropy-minimization.aspx
Data Clustering Using Naive Bayes Inference
https://msdn.microsoft.com/en-us/magazine/jj991980.aspx
Anyway, one of the lesser-known clustering algorithms for numeric data is called fuzzy C-means clustering. The key idea of fuzzy clustering is that instead of definitively assigning a data item to exactly one of the k clusters, each data item gets one membership value per cluster. The membership values indicate the degree to which the item belongs to each cluster, and for a given item they sum to 1.
Suppose you set k = 3, and each data item represents a person’s height and weight. Then the result of fuzzy C-means clustering might look something like:
Height  Weight   k=0   k=1   k=2
=================================
65.0    120.0    0.82  0.08  0.10
72.0    185.0    0.10  0.30  0.60
. . .
Here, the person whose (height, weight) is (65.0, 120.0) has mostly membership in the k=0 cluster.
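To make the idea concrete, here is a minimal sketch of the standard fuzzy C-means update loop in plain Python. This is not the code behind the table above. The function name fuzzy_c_means, the fuzziness exponent m = 2, the random starting memberships, and the six demo (height, weight) items are all assumptions I made for illustration, and the data is used unnormalized for simplicity.

```python
import math
import random


def fuzzy_c_means(data, k, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Return (memberships, centers) for a list of numeric tuples.

    m is the fuzziness exponent (must be > 1); m = 2 is an assumed default.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Start from random memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Center j is the mean of all items, weighted by membership ** m.
        centers = []
        for j in range(k):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * data[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # New memberships depend on relative distances to the centers.
        shift = 0.0
        power = 2.0 / (m - 1.0)
        for i in range(n):
            # Clamp distances so an item sitting on a center does not divide by zero.
            dist = [max(math.dist(data[i], c), 1e-12) for c in centers]
            for j in range(k):
                new = 1.0 / sum((dist[j] / dist[c]) ** power for c in range(k))
                shift = max(shift, abs(new - u[i][j]))
                u[i][j] = new

        # Stop once no membership value changed by more than tol.
        if shift < tol:
            break
    return u, centers


# Hypothetical demo data: (height, weight) pairs, not from a real data set.
people = [(65.0, 120.0), (66.0, 125.0), (72.0, 185.0),
          (73.0, 190.0), (69.0, 150.0), (68.0, 155.0)]
memberships, centers = fuzzy_c_means(people, k=3)
for item, row in zip(people, memberships):
    print(item, [round(v, 2) for v in row])
```

Each printed row holds one membership value per cluster, and the values in a row sum to 1. Which cluster ends up labeled 0, 1, or 2 depends on the random start, so the column order is arbitrary. In practice you would probably normalize height and weight first so that weight doesn't dominate the distance calculation.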
Well, in the end, fuzzy C-means data clustering isn’t used very much because the additional information you get is somewhat difficult to interpret and use.

Fuzzy hats. Sometimes nice on women, sometimes not. But never, ever nice on guys.

