Probabilistic learning of Turkish morphosemantics by latent syntax


Thesis Type: Master's

Institution: Middle East Technical University (Orta Doğu Teknik Üniversitesi), Graduate School of Informatics, Department of Cognitive Science, Turkey

Thesis Approval Date: 2017

Student: AHMET ÜSTÜN

Advisor: HÜSEYİN CEM BOZŞAHİN

Abstract:

Human language processing depends heavily on a transparent interface between syntax and semantics, which is formalized as the grammar. In morphologically rich languages such as Turkish, morphology also interacts with this interface. This thesis aims to discover word semantics in Turkish from compositional morphosemantics, using an underlying latent syntax. A computational model has been developed to learn a morpheme lexicon in which each morpheme carries semantic information in logical form together with a basic syntactic type. A knowledge-free segmentation algorithm based on the distributional properties of words is used to extract pseudo-morphemes from words. A classical probabilistic Combinatory Categorial Grammar (CCG) is used for lexical learning. Since derivational changes can be handled by lexicalizing words, the model is applied to the inflectional morphemes of Turkish. The model has been tested, and the results are reported in the thesis from various aspects.
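
To make the idea of a morpheme lexicon more concrete, the following is a minimal Python sketch, not the model developed in the thesis. It assumes a hand-written toy lexicon in which each pseudo-morpheme has a basic syntactic type, a logical form written as a string template, and a probability. The entries, the probabilities, and the names `Entry`, `LEXICON`, and `analyse` are all hypothetical and chosen only for illustration. Suffixes combine with the stem by backward application, and a derivation is scored as the product of its lexical probabilities.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Entry:
    form: str          # surface string of the pseudo-morpheme
    category: str      # basic syntactic type, e.g. "N" or "N\N"
    logical_form: str  # semantics; "{}" marks the argument slot
    prob: float        # hypothetical lexical probability


# Toy lexicon: entries and probabilities are invented for illustration only.
LEXICON = {
    "ev": Entry("ev", "N", "house", 0.9),
    "ler": Entry("ler", "N\\N", "plu({})", 0.8),
    "de": Entry("de", "N\\N", "loc({})", 0.7),
}


def analyse(segments):
    """Compose a segmented word left to right by backward application.

    Returns (category, logical form, product of lexical probabilities).
    """
    entries = [LEXICON[s] for s in segments]
    cat, lf = entries[0].category, entries[0].logical_form
    for suffix in entries[1:]:
        # A suffix of type X\Y takes a Y on its left and yields an X.
        result, arg = suffix.category.split("\\")
        if arg != cat:
            raise ValueError(f"{suffix.form!r} cannot attach to type {cat}")
        cat, lf = result, suffix.logical_form.format(lf)
    return cat, lf, prod(e.prob for e in entries)


if __name__ == "__main__":
    # "ev-ler-de" (house-PLURAL-LOCATIVE, "in the houses")
    print(analyse(["ev", "ler", "de"]))
```

In this sketch the segmentation is given as input. In the setting the abstract describes, the segments would come from the knowledge-free segmentation step and the probabilities would be learned rather than fixed by hand.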