Thesis Type: Postgraduate
Institution Of The Thesis: Orta Doğu Teknik Üniversitesi, Faculty of Engineering, Department of Computer Engineering, Turkey
Approval Date: 2017
Student: SEFA ŞAHİN KOÇ
Supervisor: İSMAİL HAKKI TOROSLU
Abstract:
Real-world data is complex and highly interrelated. On a social media platform, for example, multiple users can interact with the same item by commenting on it, liking it, and so on. Data composed of these actions contains many nodes of different types (user, item, sentiment). Clustering only nodes of the same type is therefore not sufficient to analyze such data, because it ignores the relations between nodes of different types. Such data should instead be handled with heterogeneous multipartite clustering methods, which take the relations among different types into account. The result is a set of heterogeneous clusters that represent inter-partition relations as well as intra-partition ones. For example, from a large and complex set of relations, clusters may be extracted that contain users who use similar sentiments to address the same issues. I present a new algorithm, called STriCluster, which processes heterogeneous data containing relations among three different types. Each relation is called a hyperedge and links three nodes of distinct types. Moreover, each hyperedge carries a sentiment, which is either positive or negative. The algorithm finds tripartite clusters that express high positivity. Hyperedges may not overlap among clusters, whereas a node can be part of many clusters. Furthermore, the algorithm handles the negative property and the sparseness of hyperedges while discovering tripartite clusters of hyperedges with positive properties. I show its effectiveness through experiments performed on both synthetic and real-world data.
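The data model described in the abstract can be illustrated with a minimal Python sketch. It is not the STriCluster algorithm: it only encodes signed hyperedges over three node types and checks the two stated constraints (hyperedges are not shared among clusters, nodes may be). The names `Hyperedge`, `covered_edges`, and `positivity`, as well as the scoring formula, are assumptions made for illustration and do not come from the thesis.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Hyperedge:
    """One relation linking three nodes of distinct types, with a sign."""
    user: str
    item: str
    sentiment: str
    positive: bool  # True for a positive hyperedge, False for a negative one


def covered_edges(cluster, edges):
    """Return the hyperedges whose three endpoints all lie in the cluster.

    A cluster is a triple of node sets: (users, items, sentiments).
    """
    users, items, sentiments = cluster
    return {
        e for e in edges
        if e.user in users and e.item in items and e.sentiment in sentiments
    }


def positivity(cluster, edges):
    """Hypothetical score: (positive - negative) / all possible triples.

    Missing triples contribute zero, so sparse clusters score lower, and
    negative hyperedges reduce the score. This is an illustrative
    assumption, not the measure used in the thesis.
    """
    users, items, sentiments = cluster
    sign = {(e.user, e.item, e.sentiment): e.positive for e in edges}
    possible = len(users) * len(items) * len(sentiments)
    if possible == 0:
        return 0.0
    score = 0
    for triple in product(users, items, sentiments):
        if triple in sign:
            score += 1 if sign[triple] else -1
    return score / possible


# Toy data, invented for this example.
edges = [
    Hyperedge("u1", "i1", "s1", True),
    Hyperedge("u1", "i2", "s1", True),
    Hyperedge("u2", "i1", "s1", True),
    Hyperedge("u2", "i2", "s1", False),
    Hyperedge("u1", "i3", "s2", True),
]

cluster_a = ({"u1", "u2"}, {"i1", "i2"}, {"s1"})
cluster_b = ({"u1"}, {"i3"}, {"s2"})

score_a = positivity(cluster_a, edges)  # 3 positive, 1 negative, 4 possible
shared_edges = covered_edges(cluster_a, edges) & covered_edges(cluster_b, edges)
shared_users = cluster_a[0] & cluster_b[0]
```

In this toy case the two clusters share the node `u1` but no hyperedge, which matches the constraint stated in the abstract.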