Fuzzy clustering¶

Where crisp k-means assigns each point to exactly one cluster, fuzzy clustering gives each point a graded membership in every cluster. fuzzytool.cluster works on plain NumPy arrays of shape (n_samples, n_features) and offers three algorithms plus validity metrics. Every algorithm takes a seed for reproducibility.

Function	Norm / idea	Good for
`fuzzy_cmeans`	Euclidean, memberships sum to 1	spherical clusters
`gustafson_kessel`	adaptive per-cluster Mahalanobis norm	ellipsoidal clusters
`possibilistic_cmeans`	typicalities (no sum-to-1)	noisy data / outliers

import fuzzytool as fz
from fuzzytool import cluster
from fuzzytool.datasets import make_blobs

X = make_blobs(centers=((0, 0), (6, 6), (0, 6)), seed=0)

res = fz.fuzzy_cmeans(X, c=3, m=2.0, seed=0)
res.centers     # (3, 2) cluster prototypes
res.u           # (3, n) membership matrix (columns sum to 1)
res.labels      # hard assignment, argmax over u
res.n_iter      # iterations to converge

Validity metrics¶

How many clusters? Compare runs with internal validity indices:

cluster.partition_coefficient(res.u)   # in (1/c, 1]; higher = crisper
cluster.partition_entropy(res.u)       # in [0, log c); lower = crisper
cluster.xie_beni(X, res.centers, res.u)  # compactness/separation; lower = better

A common recipe is to sweep c and pick the value that maximizes the partition coefficient (or minimizes Xie-Beni).

Visualization¶

import matplotlib.pyplot as plt
from fuzzytool import viz

viz.plot_clusters(X, res)   # 2-D scatter; opacity encodes top membership
plt.show()

Fuzzy c-means on three blobs: points colored by cluster, centers marked, opacity encoding top membership