Web: http://arxiv.org/abs/2202.03949

Sept. 19, 2022, 1:12 a.m. | Carlo Baldassi

stat.ML updates on arXiv.org

We present a meta-method for initializing (seeding) the $k$-means clustering
algorithm called PNN-smoothing. It consists of splitting a given dataset into
$J$ random subsets, clustering each of them individually, and merging the
resulting clusterings with the pairwise-nearest-neighbor (PNN) method. It is a
meta-method in the sense that any seeding algorithm can be used when
clustering the individual subsets. If the computational complexity of that seeding
algorithm is linear in the size of the data $N$ and the number of clusters …
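
The split / cluster / merge structure described in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the function names (`lloyd_kmeans`, `pnn_merge`, `pnn_smoothing_seed`), the choice of uniform random seeding followed by a few Lloyd iterations on each subset, and the brute-force pair search are all assumptions made for illustration. The merge step uses the standard PNN cost for joining two clusters with sizes $n_a$, $n_b$ and centroids $c_a$, $c_b$, namely $\frac{n_a n_b}{n_a + n_b} \lVert c_a - c_b \rVert^2$.

```python
import numpy as np


def lloyd_kmeans(X, k, rng, n_iter=20):
    """Plain Lloyd iterations from uniformly sampled initial centroids.

    Stand-in for "any seeding algorithm" (an assumption of this sketch).
    Returns the centroids and the number of points assigned to each.
    """
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(d2.argmin(axis=1), minlength=k)
    return centroids, counts


def pnn_merge(centroids, counts, k):
    """Greedy pairwise-nearest-neighbor merging down to k centroids.

    Brute-force search over all pairs at every step; fine for a sketch,
    not tuned for speed.
    """
    cents = [np.asarray(c, dtype=float) for c in centroids]
    ns = [float(n) for n in counts]
    while len(cents) > k:
        best = None
        for a in range(len(cents)):
            for b in range(a + 1, len(cents)):
                diff = cents[a] - cents[b]
                cost = ns[a] * ns[b] / (ns[a] + ns[b]) * diff.dot(diff)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        total = ns[a] + ns[b]
        # The merged centroid is the size-weighted mean of the pair.
        cents[a] = (ns[a] * cents[a] + ns[b] * cents[b]) / total
        ns[a] = total
        del cents[b], ns[b]  # b > a, so index a is unaffected
    return np.array(cents)


def pnn_smoothing_seed(X, k, J, seed=0):
    """Split X into J random subsets, cluster each, merge with PNN."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))
    all_c, all_n = [], []
    for idx in np.array_split(perm, J):
        c, n = lloyd_kmeans(X[idx], k, rng)
        keep = n > 0  # drop empty clusters before merging
        all_c.append(c[keep])
        all_n.append(n[keep])
    return pnn_merge(np.vstack(all_c), np.concatenate(all_n), k)
```

The returned array would then serve as the initial centroids for a full $k$-means run on the whole dataset. The sketch assumes every subset holds at least $k$ points, so $J$ should be small relative to $N / k$.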


