Does the Strength of Sentiment Matter? A Regression Based Approach on Turkish Social Media

Ertugrul A. M., Onal I., ACARTÜRK C.

22nd International Conference on Applications of Natural Language to Information Systems (NLDB), Liege, Belgium, 21 - 23 June 2017, vol.10260, pp.149-155

  • Publication Type: Conference Paper / Full Text
  • Volume: 10260
  • Doi Number: 10.1007/978-3-319-59569-6_16
  • City: Liege
  • Country: Belgium
  • Page Numbers: pp.149-155
  • Middle East Technical University Affiliated: Yes


Social media posts are usually informal and short, and they do not always express their sentiment clearly. As a result, multiple raters may assign different sentiments to the same tweet. Instead of relying on majority voting, which ignores the strength of sentiments, the annotation can be enriched with a confidence score assigned to each sentiment. In this study, we analyze the effect of using regression on confidence scores in sentiment analysis of Turkish tweets. We extract hand-crafted features, including lexical features, emoticons, and sentiment scores. We also employ word embeddings of tweets for regression and classification. Our findings reveal that employing regression on confidence scores slightly improves sentiment classification accuracy. Moreover, combining word embeddings with hand-crafted features reduces the feature dimensionality and outperforms alternative feature combinations.
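The contrast between majority voting and confidence-scored annotation can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how confidence scores are computed, so the sketch assumes a score is simply the fraction of raters who chose each sentiment. The label set, the function names, and the example annotations are all hypothetical.

```python
from collections import Counter

# Assumed label set; the abstract does not list the sentiment classes used.
LABELS = ("negative", "neutral", "positive")


def majority_vote(ratings):
    """Return the most frequent label; ties go to the label listed first in LABELS."""
    counts = Counter(ratings)
    return max(LABELS, key=lambda label: counts[label])


def confidence_scores(ratings):
    """Return the fraction of raters who chose each label.

    This definition of confidence is an assumption made for illustration.
    """
    counts = Counter(ratings)
    return {label: counts[label] / len(ratings) for label in LABELS}


def classify_from_scores(scores):
    """Map per-label scores (e.g. regression outputs) to the highest-scoring label."""
    return max(LABELS, key=lambda label: scores[label])


# Hypothetical annotations from five raters for two tweets.
unanimous = ["positive"] * 5
contested = ["positive", "positive", "positive", "neutral", "negative"]

for name, ratings in (("unanimous", unanimous), ("contested", contested)):
    scores = confidence_scores(ratings)
    print(name, majority_vote(ratings), scores, classify_from_scores(scores))
```

Both tweets receive the same majority label, yet their confidence scores differ, which is the information majority voting discards. In a regression setting, scores of this kind would serve as the training targets, and a function like `classify_from_scores` would turn the predicted scores back into a class.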