Improvement of corpus-based semantic word similarity using vector space model


Thesis Type: Master's

Institution: Middle East Technical University, Faculty of Engineering, Department of Computer Engineering, Turkey

Thesis Approval Date: 2009

Student: YUNUS EMRE ESİN

Supervisor: FERDA NUR ALPASLAN

Abstract:

This study presents a new approach for finding semantically similar words in corpora using window-based context methods. Previous studies mainly concentrate either on finding new combinations of distance-weight measurement methods or on proposing new context methods. The main difference of this approach is that it reprocesses the outputs of existing methods to update the representations of the related word vectors used for measuring semantic distance between words, in order to improve the results further. Moreover, this technique provides a solution to the data sparseness of vectors, a common problem in methods that use the vector space model. The main advantage of this approach is that it is applicable to many of the existing word similarity methods that use the vector space model. Its other, and most important, advantage is that it improves the performance of some of these existing word similarity measuring methods.
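
For orientation, the kind of baseline the abstract refers to can be sketched as follows. This is a generic illustration of window-based context vectors compared with cosine similarity, not the thesis's own reprocessing technique, which the abstract does not specify. The function names, the window size of 2, the raw-count weighting, and the toy sentence are all assumptions made for the example.

```python
from collections import Counter, defaultdict
import math


def build_context_vectors(tokens, window=2):
    """Map each word to a sparse count vector of the words that occur
    within `window` positions of it (hypothetical helper, raw counts)."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (assumed for illustration only).
tokens = "the cat drinks milk the dog drinks water".split()
vectors = build_context_vectors(tokens, window=2)
similarity = cosine(vectors["cat"], vectors["dog"])
print(f"cat vs dog: {similarity:.3f}")
```

With a realistic vocabulary most entries of such vectors are zero, which is the data sparseness problem the abstract mentions; sparse dictionaries are used here only to keep the sketch short.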