A Unified Approach for Center-Based Clustering Problems on Networks


Tezin Türü: Yüksek Lisans

Tezin Yürütüldüğü Kurum: Orta Doğu Teknik Üniversitesi, Mühendislik Fakültesi, Endüstri Mühendisliği Bölümü, Türkiye

Tezin Onay Tarihi: 2019

Öğrenci: DERYA İPEK EROĞLU

Danışman: CEM İYİGÜN

Özet:

In this thesis, Center-Based Clustering Problems on Networks are studied. Four different problems are considered differing in the assignment scheme of the data points and the objective function. Two different assignment schemes are considered, hard assignment and soft assignment. In hard assignment, data points (vertices) are strictly assigned to one cluster, while in soft assignment, vertices are assigned to the multiple clusters with a membership probability. Objective function of a clustering problem could be categorized as minimizing sum of distances or sum of squared distances between the vertices and the centers of clusters they are assigned to. In this study, cluster centers are not restricted to vertices. They are allowed to be located on vertices or anywhere on the edges. The problems that are studied are analyzed in terms of properties of the cluster centers, and theoretical results are derived. Benefiting from these properties, a unified solution framework is developed which is named Hybrid Genetic Algorithm (HGA), a genetic algorithm with a Local Search operation which uses the theoretical results obtained about the cluster centers. Two versions of HGA, namely Node Based HGA (HGA-N) and Edge Based HGA (HGAv E) are developed by modifying HGA considering the derived properties. To test the performance of the proposed algorithms, numerical experiments are conducted on clustering of datasets from the literature and the simulated ones. Results are compared with the optimal or best solutions reported in the literature (if available). The proposed algorithms are also compared with the well-known heuristics used for the planar clustering problems. These heuristics are modified for the network problems. Computational results show that the proposed approach performs well in all clustering problems studied.