Investigating the performance of segmentation methods with deep learning models for sentiment analysis on Turkish informal texts


Thesis Type: Master's

Institution: Middle East Technical University, Faculty of Engineering, Department of Computer Engineering, Turkey

Thesis Approval Date: 2018

Student: FATİH KURT

Supervisor: PINAR KARAGÖZ

Abstract:

This work investigates segmentation approaches for informal short texts in morphologically rich languages in order to classify sentiment effectively. The two building blocks of the work proposed in this thesis are segmentation and deep neural network model building. Segmentation focuses on preprocessing text with different methodologies. These methodologies are grouped under four distinct approaches: morphological, sub-word, tokenization, and hybrid. For most of these four approaches, multiple variants are provided in this work. The second stage focuses on building an effective model for classifying text. The performance of each method is evaluated using a model built from a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) model proposed in the literature for text classification.
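
To make the difference between two of the approaches named above concrete, the following is a minimal sketch that contrasts plain whitespace tokenization with one simple form of sub-word segmentation, character n-grams. The function names, the choice of trigrams, and the `<` and `>` boundary markers are assumptions made for illustration; the abstract does not state which sub-word variants the thesis uses.

```python
def whitespace_tokens(text):
    """Tokenization approach (illustrative): split on whitespace only."""
    return text.split()


def char_ngrams(token, n=3):
    """Sub-word approach (illustrative): overlapping character n-grams.

    The token is wrapped in '<' and '>' so that prefixes and suffixes
    are distinguishable from word-internal n-grams. These markers are
    an assumption of this sketch, not something taken from the thesis.
    """
    padded = f"<{token}>"
    if len(padded) <= n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def subword_segments(text, n=3):
    """Segment every whitespace token of the text into character n-grams."""
    return [gram for token in whitespace_tokens(text)
            for gram in char_ngrams(token, n)]


if __name__ == "__main__":
    sample = "çok güzel"  # Turkish for "very nice"
    print(whitespace_tokens(sample))
    print(subword_segments(sample))
```

Either output sequence could then be mapped to integer ids and fed to a CNN or RNN classifier. In a morphologically rich language, the sub-word variant lets inflected forms of the same stem share units, which is the motivation for comparing segmentation methods in the first place.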