Self-learning K-means clustering: a global optimization approach


Volkovich Z., Toledano-Kitai D., Weber G. -.

JOURNAL OF GLOBAL OPTIMIZATION, vol.56, no.2, pp.219-232, 2013 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 56 Issue: 2
  • Publication Date: 2013
  • DOI: 10.1007/s10898-012-9854-y
  • Journal Name: JOURNAL OF GLOBAL OPTIMIZATION
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.219-232
  • Keywords: First keyword, Second keyword, More, NUMBER, INFORMATION, VALIDATION
  • Middle East Technical University Affiliated: Yes

Abstract

An appropriate distance is an essential ingredient in various real-world learning tasks. Distance metric learning seeks a metric that reflects the data configuration much better than the commonly used ones. We offer an algorithm that simultaneously learns a Mahalanobis-like distance and performs K-means clustering, combining data rescaling and clustering so that data separability grows iteratively in the rescaled space as it is successively clustered. At each step of the algorithm, a global optimization problem is solved to minimize the cluster distortions based on the current cluster configuration. The resulting weight matrix can also serve as a cluster validation characteristic: the closeness of such matrices learned during a sampling process can indicate the readiness of the clusters, i.e., it provides an estimate of the true number of clusters. Numerical experiments on synthetic and real datasets verify the high reliability of the proposed method.
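
The alternation described in the abstract (cluster in the rescaled space, then re-learn the weights from the current clusters) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the optimization problem, so the sketch assumes a diagonal weight matrix whose entries multiply to 1 and uses the closed-form minimizer of the weighted within-cluster scatter under that constraint in place of the paper's global optimization step. The function names `kmeans`, `learn_diagonal_weights`, `self_learning_kmeans`, and `weight_spread` are hypothetical, and the spread statistic is only one possible way to compare weights learned on subsamples.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), centers

def learn_diagonal_weights(X, labels, k, eps=1e-12):
    """Assumed weight step: minimize sum_j w_j * D_j subject to prod_j w_j = 1,
    where D_j is the within-cluster scatter of feature j.
    The minimizer is w_j = geometric_mean(D) / D_j."""
    scatter = np.zeros(X.shape[1])
    for j in range(k):
        pts = X[labels == j]
        if len(pts):
            scatter += ((pts - pts.mean(axis=0)) ** 2).sum(axis=0)
    log_d = np.log(np.maximum(scatter, eps))
    return np.exp(log_d.mean() - log_d)

def self_learning_kmeans(X, k, n_rounds=10, tol=1e-6, seed=0):
    """Alternate K-means in the rescaled space with the weight update."""
    w = np.ones(X.shape[1])
    for _ in range(n_rounds):
        # Euclidean distance on X * sqrt(w) equals the weighted distance on X.
        labels, _ = kmeans(X * np.sqrt(w), k, seed=seed)
        w_new = learn_diagonal_weights(X, labels, k)
        converged = np.allclose(w_new, w, rtol=tol)
        w = w_new
        if converged:
            break
    return labels, w

def weight_spread(X, k, n_samples=5, frac=0.7, seed=0):
    """Hypothetical validation statistic: spread of log-weights learned on
    random subsamples. A smaller spread means the learned weights agree."""
    rng = np.random.default_rng(seed)
    logs = []
    for s in range(n_samples):
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
        _, w = self_learning_kmeans(X[idx], k, seed=s)
        logs.append(np.log(w))
    return float(np.mean(np.std(np.array(logs), axis=0)))
```

Under these assumptions, a feature with small within-cluster scatter receives a large weight, so the clusters become tighter relative to their separation in the rescaled space. Comparing `weight_spread(X, k)` across candidate values of `k` mirrors the abstract's idea of using the closeness of learned weight matrices as a validation signal, though the paper's own criterion may differ.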