Probabilistic learning of Turkish morphosemantics by latent syntax

Thesis Type: Postgraduate

Institution Of The Thesis: Orta Doğu Teknik Üniversitesi, Graduate School of Informatics, Cognitive Science, Turkey

Approval Date: 2017




Human language processing depends heavily on a transparent interface between syntax and semantics, which is formalized as the grammar. In morphologically rich languages such as Turkish, morphology also interacts with this interface. This thesis aims to discover word semantics in Turkish from compositional morphosemantics, using an underlying latent syntax. A computational model has been developed to learn a morpheme lexicon in which each morpheme carries semantic information in logical form together with a basic syntactic type. A knowledge-free segmentation algorithm based on the distributional properties of words is used to extract pseudo-morphemes from words. We use a classical probabilistic Combinatory Categorial Grammar (CCG) for lexical learning. Since derivational changes can be handled by lexicalizing words, we apply our model to the inflectional morphemes of Turkish. The model has been tested, and the results obtained are reported in the thesis from various aspects.
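
To make the idea of knowledge-free, distribution-based segmentation concrete, here is a minimal sketch. It is not the algorithm used in the thesis, which the abstract does not specify. It assumes a simple successor-variety heuristic: a word is cut after any prefix that is followed by several different characters across the word list. The function names, the threshold, and the toy word list are all hypothetical.

```python
from collections import defaultdict

END = "#"  # assumed end-of-word marker, counted as one possible successor


def successor_sets(words):
    """Map each prefix to the set of symbols that follow it in the word list."""
    followers = defaultdict(set)
    for w in words:
        for i in range(1, len(w) + 1):
            followers[w[:i]].add(w[i] if i < len(w) else END)
    return followers


def segment(word, followers, threshold=2):
    """Cut `word` after every proper prefix whose successor variety >= threshold."""
    pieces, start = [], 0
    for i in range(1, len(word)):
        if len(followers.get(word[:i], ())) >= threshold:
            pieces.append(word[start:i])
            start = i
    pieces.append(word[start:])
    return pieces


# Toy word list for illustration only (not thesis data).
words = ["ev", "evler", "evlerde", "evde", "evden", "evi"]
followers = successor_sets(words)

print(segment("evlerde", followers))  # ['ev', 'ler', 'de']
print(segment("evden", followers))    # ['ev', 'de', 'n']
```

The second output shows why such units are better called pseudo-morphemes: the heuristic splits the ablative suffix "den" into "de" and "n", so the pieces need not coincide with linguistically valid morphemes.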