I’ve been looking at data clustering recently. Suppose you have n=5 items (A, B, C, D, E) and you want to cluster them into k=2 groups. How many different possible clusterings are there?
The answer is that there are 15 ways to cluster:
(A) (B, C, D, E)
(B) (A, C, D, E)
(C) (A, B, D, E)
(D) (A, B, C, E)
(E) (A, B, C, D)
(A, B) (C, D, E)
(A, C) (B, D, E)
(A, D) (B, C, E)
(A, E) (B, C, D)
(B, C) (A, D, E)
(B, D) (A, C, E)
(B, E) (A, C, D)
(C, D) (A, B, E)
(C, E) (A, B, D)
(D, E) (A, B, C)
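To double-check a list like this one, here is a minimal sketch that enumerates the clusterings by brute force. The function name partitions_into_k is my own invention for illustration, and the recursive approach is just one of several ways to do it: each item either starts its own cluster or joins an existing one.

```python
def partitions_into_k(items, k):
    """Yield every way to split items into exactly k non-empty, unordered groups."""
    if k == 0:
        # Zero groups is only possible when there is nothing left to place.
        if not items:
            yield []
        return
    if len(items) < k:
        # Not enough items to make every group non-empty.
        return
    first, rest = items[0], items[1:]
    # Case 1: the first item sits alone in its own group.
    for p in partitions_into_k(rest, k - 1):
        yield [[first]] + p
    # Case 2: the first item joins one group of a k-group split of the rest.
    for p in partitions_into_k(rest, k):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]


clusterings = list(partitions_into_k(["A", "B", "C", "D", "E"], 2))
for c in clusterings:
    print(c)
print(len(clusterings))  # 15
```

Brute-force enumeration like this is only practical for tiny n, which is exactly why a counting formula is useful.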
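Notice that neither the order of items within a cluster nor the order of the clusters themselves is relevant. OK, but how many ways are there to cluster n=20 items into k=3 clusters? The number of ways is given by the Stirling number of the second kind, S(n, k), which counts the ways to split n items into k non-empty groups. The equation is:
S(n, k) = (1 / k!) * sum[j = 0 to k] (-1)^(k-j) * C(k, j) * j^n
where C(k, j) is the binomial coefficient "k choose j".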
As n gets bigger, the number of possible clusterings becomes insanely large very quickly. For n=20 and k=3, there are 580,606,446 ways to cluster.
The j index runs from 0 to 3. The individual terms in the summation are:
j=0: (-1)^(3-0) * C(3,0) * 0^20
= -1 * 1 * 0
= 0
j=1: (-1)^(3-1) * C(3,1) * 1^20
= 1 * 3 * 1
= 3
j=2: (-1)^(3-2) * C(3,2) * 2^20
= -1 * 3 * 1,048,576
= -3,145,728
j=3: (-1)^(3-3) * C(3,3) * 3^20
= 1 * 1 * 3,486,784,401
= 3,486,784,401
The sum over all j is 0 + 3 - 3,145,728 + 3,486,784,401 = 3,483,638,676, and dividing that sum by k! = 3! = 6 gives 580,606,446 ways.
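The arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not a tuned implementation: the function name stirling2 is my own, and it relies on Python's arbitrary-size integers so that every intermediate value stays exact.

```python
from math import comb, factorial


def stirling2(n, k):
    """Number of ways to cluster n items into k non-empty, unordered groups."""
    # Alternating sum over j = 0..k, matching the worked terms above.
    total = sum((-1) ** (k - j) * comb(k, j) * j ** n for j in range(k + 1))
    # The sum is always an exact multiple of k!, so integer division is safe.
    return total // factorial(k)


print(stirling2(5, 2))   # 15
print(stirling2(20, 3))  # 580606446
```

Using integer division rather than floating-point division matters here, because for larger n the sum is far too big to be represented exactly as a float.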
The sum is dominated by the last term, so k^n / k! can be used as a rough approximation of the number of ways to cluster. So if you have n=100 items and k=20 clusters, there are approximately 20^100 / 20! possible clusterings. 20^100 is about 1.3e+130 and 20! is about 2.4e+18, so the result is roughly 5e+111, an impossibly large number that is vastly greater than the estimated number of atoms in the universe!
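To get a feel for how rough the last-term estimate is, here is a small sketch that compares it with the exact count. The helper names stirling2 and last_term_estimate are my own, and the estimate is simply the j=k term of the summation divided by k!.

```python
from math import comb, factorial


def stirling2(n, k):
    """Exact count of ways to cluster n items into k non-empty groups."""
    total = sum((-1) ** (k - j) * comb(k, j) * j ** n for j in range(k + 1))
    return total // factorial(k)


def last_term_estimate(n, k):
    """Estimate that keeps only the j = k term: k^n / k! (rounded down)."""
    return k ** n // factorial(k)


for n, k in [(20, 3), (100, 20)]:
    exact = stirling2(n, k)
    approx = last_term_estimate(n, k)
    # The ratio shows by how much the estimate overshoots the exact value.
    print(f"n={n} k={k} exact={float(exact):.3e} "
          f"approx={float(approx):.3e} ratio={approx / exact:.4f}")
```

The estimate always overshoots a bit because the next-largest term in the sum is negative, and the gap grows as k gets closer to n.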

Man’s place in the universe – a scary topic. But I truly believe in the existence of The Multiverse and that we all have lived infinite lives in the past and that we will live infinite lives in the future.
