CLUSTERING BASED PERSONALITY PREDICTION ON TURKISH TWEETS


Thesis Type: Master's

Institution: Middle East Technical University, Graduate School of Natural and Applied Sciences, Turkey

Thesis Approval Date: 2019

Thesis Language: English

Student: ESEN TUTAYSALGIR

Principal Advisor (for Co-Advised Theses): İsmail Hakkı Toroslu

Co-Advisor: Pınar Karagöz

Abstract:

In this thesis, we present a framework for predicting the personality traits of users from their tweets written in Turkish. The prediction model is constructed with a clustering-based approach. We show how to extract linguistic features from tweet data and how to adapt TF-IDF weighting and word embeddings to Turkish tweets. Since the model is based on linguistic features, it is language-specific: it uses features that are applicable to the Turkish language and that reflect the writing style of Turkish Twitter users. Our approach uses the anonymous BIG5 questionnaire scores of volunteer participants as ground truth to generate a personality model from Twitter posts. Experimental results show that the constructed model can predict the personality traits of Turkish Twitter users with relatively small errors.
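
The general idea of clustering-based prediction over TF-IDF features can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm, tokenizer, or prediction rule the thesis uses, so everything below is an assumption made for illustration: whitespace tokenization, a small k-means variant with cosine similarity and a deterministic initialization, and a prediction equal to the mean questionnaire score of the nearest cluster. The tweets, the scores, and all function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(tokens, idf):
    """Map a token list to a sparse TF-IDF vector (dict: term -> weight)."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values()) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def tfidf_vectors(docs):
    """Fit IDF weights on docs and return (vectors, idf)."""
    # Assumption: plain whitespace tokenization. Note that str.lower() is not
    # Turkish-aware (it maps "I" to "i", not "ı"); the toy data is lowercase.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF so that very common terms keep a non-zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    return [vectorize(tokens, idf) for tokens in tokenized], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a list of sparse vectors."""
    total = {}
    for vec in vectors:
        for t, w in vec.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}


def kmeans(vectors, k, iterations=10):
    """Small k-means variant using cosine similarity.

    Assumption: the first k vectors serve as initial centroids, which keeps
    the sketch deterministic.
    """
    centroids = [dict(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        assignment = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = centroid(members)
    return centroids, assignment


def cluster_means(scores, assignment, k):
    """Mean ground-truth score of the users assigned to each cluster."""
    means = []
    for c in range(k):
        members = [s for s, a in zip(scores, assignment) if a == c]
        means.append(sum(members) / len(members) if members else None)
    return means


def predict(text, idf, centroids, means):
    """Predict a trait score as the mean score of the most similar cluster."""
    vec = vectorize(text.lower().split(), idf)
    nearest = max(range(len(centroids)), key=lambda c: cosine(vec, centroids[c]))
    return means[nearest]


# Hypothetical toy data: one concatenated tweet text per user, and a made-up
# questionnaire score for a single trait on a 1-5 scale.
tweets = [
    "bugün hava çok güzel parti zamanı",
    "evde kitap okumak huzur veriyor",
    "arkadaşlarla parti çok eğlenceli",
    "sessiz evde kitap ve çay",
]
scores = [4.5, 2.0, 4.0, 2.5]

vectors, idf = tfidf_vectors(tweets)
centroids, assignment = kmeans(vectors, k=2)
means = cluster_means(scores, assignment, k=2)
print(assignment)
print(predict("parti çok güzel", idf, centroids, means))
```

In this sketch a new user's text is mapped into the same TF-IDF space and assigned the average score of the closest cluster. A realistic pipeline would need Turkish-aware normalization and a more careful choice of the number of clusters.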