Most existing semi-supervised clustering algorithms depend on pairwise constraints and usually require a large amount of prior knowledge to improve their accuracy. In this paper, we use another semi-supervised method, label propagation, to help detect clusters. We propose two new semi-supervised algorithms, K-SSMST and M-SSMST, both of which aim to discover clusters of diverse density and arbitrary shape. Based on a variant of the minimum spanning tree (MST) algorithm, K-SSMST automatically finds natural clusters in a dataset using K labeled data objects, where K is the number of clusters. M-SSMST can detect new clusters when the semi-supervised information is insufficient. Our algorithms have been tested on various artificial and UCI datasets. The results demonstrate that the algorithms achieve better accuracy than other supervised and semi-supervised approaches.
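The abstract does not spell out the steps of K-SSMST, so the following is only a minimal sketch of the general idea of propagating K labels along minimum-spanning-tree edges, not the authors' algorithm. It assumes Euclidean distance and a Kruskal-style merge that refuses to join two components carrying different labels. The function name `seeded_mst_clusters` and its parameters are hypothetical.

```python
from itertools import combinations
from math import dist


def seeded_mst_clusters(points, seeds):
    """Label every point by growing a minimum spanning forest from seeds.

    points: list of coordinate tuples.
    seeds:  dict mapping a point index to its known label
            (assumed: one labeled object per cluster).
    Hypothetical helper, not taken from the paper.
    """
    parent = list(range(len(points)))
    label = dict(seeds)  # component label, keyed by the component's root

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Kruskal-style pass: visit all pairwise edges from shortest to longest.
    edges = sorted(
        combinations(range(len(points)), 2),
        key=lambda e: dist(points[e[0]], points[e[1]]),
    )
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        la, lb = label.get(ra), label.get(rb)
        if la is not None and lb is not None and la != lb:
            continue  # never connect two differently labeled components
        parent[rb] = ra
        merged = la if la is not None else lb
        if merged is not None:
            label[ra] = merged  # the label propagates to the merged component

    return [label.get(find(i)) for i in range(len(points))]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
result = seeded_mst_clusters(points, {0: "a", 3: "b"})
print(result)
```

Here two labeled objects (indices 0 and 3) are enough to label both groups of points. The sketch compares all pairs of points, so it is quadratic in the number of objects and is meant for illustration only.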
Published: 2013-01-30
Classification: Data mining, semi-supervised learning, clustering, label propagation, MST, 62H30, 91C20
@article{cai1331,
     author = {Xiaoyun Chen; School of Information Science and Engineering, Lanzhou University and Mengmeng Huo; School of Information Science and Engineering, Lanzhou University and Yangyang Liu; School of Information Science and Engineering, Lanzhou University},
     title = {MST-Based Semi-Supervised Clustering Using M-Labeled Objects},
     journal = {Computing and Informatics},
     volume = {31},
     number = {6},
     year = {2013},
     language = {en},
     url = {http://dml.mathdoc.fr/item/cai1331}
}
Xiaoyun Chen; School of Information Science and Engineering, Lanzhou University; Mengmeng Huo; School of Information Science and Engineering, Lanzhou University; Yangyang Liu; School of Information Science and Engineering, Lanzhou University. MST-Based Semi-Supervised Clustering Using M-Labeled Objects. Computing and Informatics, Volume 31 (2013) no. 6. http://gdmltest.u-ga.fr/item/cai1331/