Turkish Resources for Visual Word Recognition


Erten B., BOZŞAHİN H. C., ZEYREK BOZŞAHİN D.

9th International Conference on Language Resources and Evaluation (LREC), Reykjavik, Iceland, 26 - 31 May 2014, pp. 2106-2110

  • Publication Type: Conference Paper / Full Text
  • City of Publication: Reykjavik
  • Country of Publication: Iceland
  • Page Numbers: pp. 2106-2110
  • Keywords: morphology, lexical statistics, pseudoword generation, PROGRAM
  • Middle East Technical University Affiliated: Yes

Abstract

We report two tools for conducting psycholinguistic experiments on Turkish words. KelimetriK allows experimenters to choose words based on desired orthographic scores of word frequency, bigram and trigram frequency, ON, OLD20, ATL and subset/superset similarity. The Turkish version of Wuggy generates pseudowords from one or more template words using an efficient method. The syllabified versions of the words are used as input and are decomposed into their sub-syllabic components. Bigram frequency chains are constructed from the onset, nucleus and coda patterns of entire words. We compiled lexical statistics of stems and their syllabifications from the BOUN corpus of 490 million words. We show the use of these tools in some experiments.
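To make two of the orthographic scores named above concrete, the following is a minimal sketch, not KelimetriK's actual implementation. It assumes the definitions commonly used in the visual word recognition literature: ON as the number of same-length words that differ from the target by exactly one letter substitution, and OLD20 as the mean Levenshtein distance from the target to its 20 closest words in the lexicon. The function names (`levenshtein`, `orthographic_neighbors`, `old_k`) and the toy lexicon are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def orthographic_neighbors(word: str, lexicon: Iterable[str]) -> List[str]:
    """Same-length words differing from `word` by one substituted letter."""
    return sorted(
        w for w in set(lexicon)
        if w != word
        and len(w) == len(word)
        and sum(x != y for x, y in zip(w, word)) == 1
    )


def old_k(word: str, lexicon: Iterable[str], k: int = 20) -> float:
    """Mean Levenshtein distance to the k closest words (OLD20 when k=20).

    If the lexicon holds fewer than k other words, all of them are used.
    """
    distances = sorted(levenshtein(word, w) for w in set(lexicon) if w != word)
    closest = distances[:k]
    if not closest:
        raise ValueError("lexicon contains no word other than the target")
    return sum(closest) / len(closest)


# Toy lexicon, invented for illustration (assumption, not corpus data).
toy_lexicon = ["kal", "kol", "kul", "kel", "bal", "kale", "kalem", "al"]

on_score = len(orthographic_neighbors("kal", toy_lexicon))
old_score = old_k("kal", toy_lexicon, k=3)
```

In a real setting the lexicon would come from corpus-derived word lists such as the lexical statistics described above, and `k` would be left at 20. The quadratic-time distance function is adequate for a sketch but would need indexing or pruning for a full-size lexicon.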