Turkish Resources for Visual Word Recognition

Erten B., BOZŞAHİN H. C., ZEYREK BOZŞAHİN D.

9th International Conference on Language Resources and Evaluation (LREC), Reykjavik, Iceland, 26 - 31 May 2014, pp.2106-2110, (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
City of Publication: Reykjavik
Country of Publication: Iceland
Pages: pp.2106-2110
Keywords: morphology, lexical statistics, pseudoword generation, PROGRAM
Affiliated with Middle East Technical University: Yes

Abstract

We report two tools for conducting psycholinguistic experiments on Turkish words. KelimetriK allows experimenters to select words by desired orthographic scores: word frequency, bigram and trigram frequency, ON, OLD20, ATL and subset/superset similarity. The Turkish version of Wuggy generates pseudowords from one or more template words using an efficient method. Syllabified versions of the words are used as input and are decomposed into their sub-syllabic components. The bigram frequency chains are constructed from the onset, nucleus and coda patterns of entire words. We compiled the lexical statistics of stems and their syllabification from the BOUN corpus of 490 million words. We also show the use of these tools in some experiments.
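
Two of the orthographic scores named in the abstract, ON and OLD20, can be illustrated with a minimal sketch. It assumes the usual definitions from the literature: ON as the number of same-length words that differ by one substituted letter, and OLD20 as the mean Levenshtein distance to the 20 closest words in the lexicon. The function names and the toy lexicon are hypothetical and chosen only for illustration. This is not KelimetriK's implementation, which draws its statistics from the BOUN corpus.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def orthographic_neighbors(word: str, lexicon: Iterable[str]) -> List[str]:
    """Words of the same length that differ from `word` in exactly one letter."""
    return [w for w in lexicon
            if w != word
            and len(w) == len(word)
            and sum(x != y for x, y in zip(w, word)) == 1]


def old_n(word: str, lexicon: Iterable[str], n: int = 20) -> float:
    """Mean Levenshtein distance from `word` to its n closest lexicon entries.

    With n=20 this corresponds to OLD20. If the lexicon holds fewer than
    n other words, all of them are used.
    """
    distances = sorted(levenshtein(word, w) for w in lexicon if w != word)
    closest = distances[:n]
    if not closest:
        raise ValueError("lexicon contains no word other than the target")
    return sum(closest) / len(closest)


# Hypothetical toy lexicon; a real study would use corpus-derived word lists.
LEXICON = ["kal", "kar", "kas", "kaz", "bal", "dal", "sal",
           "kale", "kol", "kul", "masa"]

if __name__ == "__main__":
    print("ON:", len(orthographic_neighbors("kal", LEXICON)))
    print("OLD (n=3):", old_n("kal", LEXICON, n=3))
```

The brute-force scan over the lexicon is adequate for a small word list. For a lexicon derived from a corpus of several hundred million words, distances would more plausibly be precomputed or indexed.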