Recent methods on short text stream clustering: A survey study

Maden E., Karagöz P.

Wiley Interdisciplinary Reviews: Computational Statistics, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Publication Date: 2023
  • DOI Number: 10.1002/wics.1610
  • Journal Name: Wiley Interdisciplinary Reviews: Computational Statistics
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Compendex
  • Keywords: Dirichlet process, short text stream clustering, text similarity, word relation network
  • Middle East Technical University Affiliated: Yes


The volume and velocity of social media data are increasing, and social media has become a very useful environment for detecting and tracking real-world events. To do so, however, it is crucial to group related texts according to their topics, and clustering plays an essential role here because there is no prior knowledge about the topics or their evolution in social media. In this survey, we review the approaches and techniques proposed for short text stream clustering in recent years. The reviewed techniques are grouped according to their methodology and discussed in detail. The datasets used to evaluate the proposed methods and the reported results are also summarized, together with the clustering quality measures used in these evaluations. Finally, current challenges in short text stream clustering are discussed. This article is categorized under: Data: Types and Structure > Streaming Data.