Online event detection from streaming data


Thesis Type: Master's

Institution Where the Thesis Was Conducted: Middle East Technical University, Faculty of Engineering, Department of Computer Engineering, Turkey

Thesis Approval Date: 2018

Student: ÖZLEM CEREN ŞAHİN

Advisor: PINAR KARAGÖZ

Abstract:

The purpose of this study is to detect events from social media in an online fashion, where an event is a happening that takes place at a certain time and place and attracts attention within a short period of time. The aim is a system that is both accurate and efficient. The problem is modeled as a stream processing problem, and three alternative methods are proposed. The first method is keyword-based and works with bursty keywords in social media messages. The second is clustering-based and proposes an improved version of hierarchical clustering algorithms. The third is a hybrid that merges the previous two. All of the methods are implemented on top of Apache Storm and Cassandra to provide a distributed and scalable system. Each method can distinguish data belonging to different countries, and detected events are tagged with country information. Each method is evaluated experimentally in terms of both accuracy and performance on a real dataset of 12M tweets collected from Twitter.
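
To illustrate what "bursty keywords" can mean, here is a minimal sketch. It is not the thesis's method. It assumes that messages arrive in fixed time windows and that a keyword counts as bursty when its frequency in the current window is several times its frequency in the previous one. The function name bursty_keywords, the parameters min_count and ratio, and the add-one smoothing are all hypothetical choices made for illustration.

```python
from collections import Counter


def bursty_keywords(previous_window, current_window, min_count=3, ratio=2.0):
    """Return the keywords whose frequency jumps between two time windows.

    Illustrative assumption: a keyword is "bursty" if it occurs at least
    `min_count` times in the current window and its count is at least
    `ratio` times its (smoothed) count in the previous window.
    """
    # Naive whitespace tokenization; a real system would also strip
    # punctuation, stop words, URLs, and so on.
    prev = Counter(w for msg in previous_window for w in msg.lower().split())
    curr = Counter(w for msg in current_window for w in msg.lower().split())

    bursty = []
    for word, count in curr.items():
        # Add-one smoothing avoids division by zero for unseen keywords.
        if count >= min_count and count / (prev[word] + 1) >= ratio:
            bursty.append(word)
    return sorted(bursty)
```

In a streaming setting, a check like this would run once per window for each country's message stream, and the resulting keywords would be the candidates for event detection.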