A Central Limit Theorem for $k$-Means Clustering
Pollard, David
Ann. Probab., Volume 10 (1982) no. 4, pp. 919-926 / Harvested from Project Euclid
A set of $n$ points in Euclidean space is partitioned into the $k$ groups that minimize the within-groups sum of squares. Under the assumption that the $n$ points come from independent sampling from a fixed distribution, conditions are found that ensure asymptotic normality of the vector of means of the $k$ groups. The method of proof makes novel application of a functional central limit theorem for empirical processes, a generalization of Donsker's theorem due to Dudley.
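To make the criterion in the abstract concrete, the following minimal sketch finds the partition of a tiny point set into $k$ groups that minimizes the within-groups sum of squares, then reports the vector of group means (the quantity whose asymptotic normality the paper studies). The function names `within_ss` and `exact_k_means` and the sample data are hypothetical, introduced only for illustration. The exhaustive search over all assignments is an assumption made for clarity; it is not the method of the paper and is feasible only for very small $n$.

```python
from itertools import product


def within_ss(points, labels, k):
    """Within-groups sum of squares for one assignment of points to k groups."""
    total = 0.0
    for g in range(k):
        members = [p for p, lab in zip(points, labels) if lab == g]
        if not members:
            continue  # an empty group contributes nothing
        dim = len(members[0])
        mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - mean[d]) ** 2 for p in members for d in range(dim))
    return total


def exact_k_means(points, k):
    """Exhaustive search over all k**n assignments (illustration only)."""
    best_labels = min(
        product(range(k), repeat=len(points)),
        key=lambda labels: within_ss(points, labels, k),
    )
    dim = len(points[0])
    groups = [[p for p, lab in zip(points, best_labels) if lab == g] for g in range(k)]
    # Sort the group means so the result does not depend on label order.
    means = sorted(
        tuple(sum(p[d] for p in grp) / len(grp) for d in range(dim))
        for grp in groups
        if grp
    )
    return means, within_ss(points, best_labels, k)


# Hypothetical sample: two well-separated triples of points in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
means, wss = exact_k_means(points, 2)
print(means, wss)
```

In practice the minimization is carried out with iterative heuristics rather than enumeration, and those heuristics may return only a local minimum; the theorem concerns the global minimizer computed here.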
Published: 1982-11-14
Classification: $k$-means clustering, central limit theorem, minimized within cluster sum of squares, differentiability in quadratic mean, Donsker classes of functions, functional central limit theorem for empirical processes, 62H30, 60F05, 60F17
@article{1176993713,
     author = {Pollard, David},
     title = {A Central Limit Theorem for $k$-Means Clustering},
     journal = {Ann. Probab.},
     volume = {10},
     number = {4},
     year = {1982},
     pages = {919-926},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1176993713}
}
Pollard, David. A Central Limit Theorem for $k$-Means Clustering. Ann. Probab., Volume 10 (1982) no. 4, pp. 919-926. http://gdmltest.u-ga.fr/item/1176993713/