Thesis Type: Master's
Institution: Middle East Technical University, Faculty of Engineering, Department of Computer Engineering, Turkey
Approval Date: 2014
Thesis Language: English
Student: Burak Kerim Akkuş
Supervisor: Ruket Çakıcı
Abstract: Combinatory Categorial Grammar (CCG) categories contain both syntactic and semantic information. CCG derivation trees can be used to extract dependency structures, since they provide the information needed to turn partial dependency structures into complete ones. For this reason, CCG categories are sometimes referred to as supertags. The amount of information encoded in supertags makes it possible to build very accurate and fast parsers, and supertagging is therefore considered "almost parsing". In this thesis, a maximum entropy part-of-speech (POS) tagger is presented to improve the performance of CCG supertagging, and a second maximum entropy classifier with additional features is implemented for supertagging itself. Morphological features of words in Turkish, an agglutinative language, are used to improve the accuracy of both POS tagging and supertagging. The usefulness of these features indicates a direct relationship between morphemes and lexical categories. The effect of the improved supertagger is tested on dependency parsers by using supertags as rich part-of-speech tags. Additionally, POS taggers that assign multiple part-of-speech tags to ambiguous words are suggested as another potential improvement for supertaggers.
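
The tagging setup described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the feature templates, the hand-set weights, the two-tag inventory, and the `beta` cutoff used for multi-tagging are all assumptions chosen for illustration. It shows how a maximum entropy (log-linear) model scores tags from contextual and morphological features, and how a tagger could keep several tags for an ambiguous word.

```python
import math

def softmax(scores):
    """Normalize raw per-tag scores into a probability distribution."""
    top = max(scores.values())
    exps = {tag: math.exp(s - top) for tag, s in scores.items()}
    total = sum(exps.values())
    return {tag: e / total for tag, e in exps.items()}

def extract_features(words, morphemes, i):
    """Contextual and morphological features for token i.

    The templates here (current, previous, next word, morpheme labels)
    are hypothetical, not the feature set used in the thesis.
    """
    feats = {"word=" + words[i]}
    feats.add("prev=" + (words[i - 1] if i > 0 else "<s>"))
    feats.add("next=" + (words[i + 1] if i + 1 < len(words) else "</s>"))
    feats.update("morph=" + m for m in morphemes[i])
    return feats

def tag_distribution(feats, weights, tags):
    """Maximum entropy scoring: sum the weights of active (feature, tag) pairs."""
    scores = {t: sum(weights.get((f, t), 0.0) for f in feats) for t in tags}
    return softmax(scores)

def multi_tag(dist, beta):
    """Keep every tag whose probability is at least beta times the best one."""
    best = max(dist.values())
    return sorted(t for t, p in dist.items() if p >= beta * best)

# Illustrative data: "evde" carries a locative suffix ("at home").
TAGS = ["Noun", "Verb"]
WEIGHTS = {  # hand-set for the example; a real model would learn these
    ("morph=Loc", "Noun"): 2.0,
    ("morph=Loc", "Verb"): -1.0,
    ("word=evde", "Noun"): 0.5,
}
words = ["evde", "kaldım"]
morphemes = [["Loc"], ["Past", "A1sg"]]

dist = tag_distribution(extract_features(words, morphemes, 0), WEIGHTS, TAGS)
print(dist)
print(multi_tag(dist, beta=0.1))
print(multi_tag(dist, beta=0.01))
```

Lowering `beta` lets more tags through for each word, so a downstream supertagger sees more candidates at the cost of more ambiguity. The same scoring scheme would apply to supertagging by replacing the POS tag inventory with CCG lexical categories.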