Twitter account classification using account metadata: organization vs. individual


Çetinkaya Y. M., Gürlek M., Toroslu I. H., Karagöz P.

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol.30, no.4, pp.1404-1418, 2022 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 30 Issue: 4
  • Publication Date: 2022
  • DOI Number: 10.55730/1300-0632.3856
  • Journal Name: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, TR DİZİN (ULAKBİM)
  • Pages: pp.1404-1418
  • Keywords: Twitter, account classification, organization vs. individual, account metadata, HYBRID APPROACH
  • Affiliated with Middle East Technical University: Yes

Abstract

Organizations maintain a presence on social media to gain followers and reach a wider audience. Social media-related tasks and applications, such as social media graph construction, sentiment analysis, and bot detection, need to identify the account types of the entities involved. Some applications focus on personal accounts, whereas others need only nonpersonal accounts. This paper addresses the account classification problem using only a minimal amount of data, namely the metadata of the account's profile. The proposed approach classifies accounts as either organization or individual, in a language-independent manner, without collecting the accounts' tweet content. The model uses a long short-term memory (LSTM) network to process the textual properties and a fully connected neural network to process the numerical features. We apply our solution to a collection of Twitter accounts, as Twitter is one of the most widely used social networks. Our classifier, based solely on the account metadata, achieves an average accuracy of 97.4% under 7-fold cross-validation. The experiments show that account metadata is a sufficient resource for accurately estimating account types.
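
The two-branch input and the 7-fold evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the field names (`name`, `screen_name`, `description`, `followers_count`, `friends_count`, `statuses_count`), the code-point encoding, the log scaling, and the helper names are all assumptions chosen for illustration, since the abstract does not list which metadata fields or preprocessing steps are used. The sketch only prepares inputs and fold indices; the LSTM and fully connected branches themselves are not shown.

```python
import math

# Hypothetical field choices; the abstract does not say which fields are used.
TEXT_FIELDS = ("name", "screen_name", "description")
NUMERIC_FIELDS = ("followers_count", "friends_count", "statuses_count")


def encode_text(profile, max_len=32):
    """Turn each text field into a fixed-length list of Unicode code points.

    Working on code points rather than words avoids a language-specific
    tokenizer (an assumption about how language independence might be
    obtained). The value 0 is used as padding.
    """
    sequences = []
    for field in TEXT_FIELDS:
        text = profile.get(field) or ""
        codes = [ord(ch) for ch in text[:max_len]]
        codes += [0] * (max_len - len(codes))
        sequences.append(codes)
    return sequences


def encode_numeric(profile):
    """Log-scale count features so very large accounts do not dominate."""
    return [math.log1p(profile.get(field) or 0) for field in NUMERIC_FIELDS]


def k_fold_indices(n_samples, k=7):
    """Yield (train, test) index lists for k-fold cross-validation.

    Folds are contiguous; shuffle the samples beforehand if their order
    carries information.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

In such a setup, the output of `encode_text` would feed the sequence branch and the output of `encode_numeric` the dense branch; a reported average accuracy would then be the mean of the per-fold test accuracies over the seven splits.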