Word Embedding Based Event Detection on Social Media

Ertugrul A. M., Velioglu B., KARAGÖZ P.

12th International Conference on Hybrid Artificial Intelligent Systems (HAIS), Logrono, Spain, 21-23 June 2017, vol. 10334, pp. 3-14 (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
Volume: 10334
DOI: 10.1007/978-3-319-59650-1_1
City of Publication: Logrono
Country of Publication: Spain
Pages: pp. 3-14
Keywords: Event detection, Neural feature extraction, Word embedding, Neural probabilistic language models
Affiliated with Middle East Technical University: Yes

Abstract

Event detection from social media messages is conventionally based on clustering the message contents. The most basic approach represents messages as term vectors constructed through traditional natural language processing (NLP) methods and then assigns weights to the terms, generally based on frequency. In this study, we use a neural feature extraction approach and explore the performance of event detection with word embeddings. Using a corpus of tweets, message terms are embedded into a continuous space. Message contents, represented as vectors of word embeddings, are grouped using hierarchical clustering. The technique is applied to a set of Twitter messages posted in Turkish. Experimental results show that automatically extracted features detect the contextual similarities between tweets better than traditional feature extraction with term frequency-inverse document frequency (TF-IDF) based term vectors.
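
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the two-dimensional embedding table is hand-made and hypothetical (a real system would learn embeddings from a tweet corpus), each message vector is taken as the average of its word vectors, distance is cosine distance, and clustering is average-linkage agglomerative clustering stopped at a hypothetical threshold of 0.3. The toy messages are in English for readability, whereas the study works with Turkish tweets.

```python
import math
from collections import Counter

# Hypothetical toy embeddings; real ones would be trained on a tweet corpus.
EMBEDDINGS = {
    "earthquake": [1.0, 0.1], "hits": [0.8, 0.3], "city": [0.6, 0.4],
    "tremor": [0.9, 0.2], "shakes": [0.9, 0.1], "town": [0.5, 0.5],
    "team": [0.2, 0.9], "wins": [0.3, 0.8], "match": [0.1, 1.0],
    "striker": [0.2, 1.0], "scores": [0.3, 0.9], "goal": [0.1, 0.9],
}

def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

def embed_message(tokens, embeddings):
    """Average the embeddings of known tokens (an assumed pooling choice)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        raise ValueError("no token of the message has an embedding")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def tfidf_vectors(docs):
    """Build dense TF-IDF vectors using a smoothed idf = ln(N / df) + 1."""
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[t] / len(d) * idf[t] for t in vocab])
    return vectors

def agglomerative(vectors, threshold):
    """Average-linkage clustering; stop when the closest pair exceeds threshold."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

tweets = [
    "earthquake hits city",
    "tremor shakes town",
    "team wins match",
    "striker scores goal",
]
docs = [t.split() for t in tweets]

embedding_clusters = agglomerative([embed_message(d, EMBEDDINGS) for d in docs], 0.3)
tfidf_clusters = agglomerative(tfidf_vectors(docs), 0.3)

print("embedding clusters:", embedding_clusters)
print("TF-IDF clusters:   ", tfidf_clusters)
```

In this toy setup the two earthquake messages share no words, so their TF-IDF vectors are orthogonal and never merge, while their averaged embeddings are nearly parallel and end up in the same cluster. This only illustrates the kind of contextual similarity the abstract refers to; it does not reproduce the reported experiments.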