Improved KModes for Categorical Clustering Using Weighted Dissimilarity Measure
<?xml version="1.0" encoding="UTF-8"?>
<article key="pdf/8177" mdate="2009-03-25 00:00:00">
  <author>S.Aranganayagi and K.Thangavel</author>
  <title>Improved KModes for Categorical Clustering Using Weighted Dissimilarity Measure</title>
  <pages>729 - 735</pages>
  <year>2009</year>
  <volume>3</volume>
  <number>3</number>
  <journal>International Journal of Computer and Information Engineering</journal>
  <ee>https://publications.waset.org/pdf/8177</ee>
  <url>https://publications.waset.org/vol/27</url>
  <publisher>World Academy of Science, Engineering and Technology</publisher>
  <abstract>KModes is an extension of the KMeans clustering algorithm, developed to cluster categorical data, in which the mean is replaced by the mode. The similarity measure proposed by Huang is the simple matching (mismatching) measure. Because the weights of attribute values contribute substantially to clustering, this paper proposes a new weighted dissimilarity measure for KModes, based on the ratio of the frequencies of attribute values in the cluster and in the data set. The new weighted measure is evaluated on data sets obtained from the UCI data repository. The results are compared with those of KModes and Krepresentative, and show that the new measure generates clusters with high purity.</abstract>
  <index>Open Science Index 27, 2009</index>
</article>
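
The baseline ideas named in the abstract can be illustrated with a minimal sketch: the simple matching measure counts the attributes on which two categorical records differ, the mode of a cluster takes the most frequent value of each attribute, and purity scores a clustering against known class labels. The function names and the toy records below are assumptions made for illustration only. The paper's weighted measure is not reproduced here, since the abstract does not give its formula.

```python
from collections import Counter


def simple_matching(x, y):
    """Count the attributes on which two categorical records differ."""
    if len(x) != len(y):
        raise ValueError("records must have the same number of attributes")
    return sum(a != b for a, b in zip(x, y))


def cluster_mode(records):
    """Return the most frequent value of each attribute in a non-empty cluster."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


def purity(class_labels, cluster_ids):
    """Fraction of records that belong to the majority class of their cluster."""
    by_cluster = {}
    for label, cid in zip(class_labels, cluster_ids):
        by_cluster.setdefault(cid, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(class_labels)


# Hypothetical toy cluster with two categorical attributes (colour, size).
toy_cluster = [("red", "small"), ("red", "large"), ("blue", "small")]
toy_mode = cluster_mode(toy_cluster)
toy_distance = simple_matching(("blue", "small"), toy_mode)
```

In this sketch every mismatching attribute contributes equally to the distance, which is the behaviour the weighted measure described in the abstract is meant to refine.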