Recent methods on short text stream clustering: A survey study


Maden E., Karagöz P.

Wiley Interdisciplinary Reviews: Computational Statistics, vol.15, no.6, 2023 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 15, Issue: 6
  • Publication Date: 2023
  • DOI: 10.1002/wics.1610
  • Journal Name: Wiley Interdisciplinary Reviews: Computational Statistics
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Compendex
  • Keywords: Dirichlet process, short text stream clustering, text similarity, word relation network
  • Middle East Technical University Affiliated: Yes

Abstract

The volume and velocity of data in social media are increasing, and social media has become a very useful environment for detecting and tracking real-world events. To do so, however, it is crucial to group related texts according to their topics, and clustering plays an essential role here because there is no prior knowledge about the topics and their evolution in social media. In this survey, we review the current approaches and techniques proposed for short text stream clustering in recent years. The reviewed techniques are grouped according to their methodology and discussed in detail. The datasets used to evaluate the proposed methods and the reported results are also summarized, together with the clustering quality measures used in these evaluations. Furthermore, current challenges in short text stream clustering are discussed. This article is categorized under: Data: Types and Structure > Streaming Data.