A preprocessor for Turkish text analysis


Oflazer K., Cetinoglu O., Bilgin O., Say B.

COMPUTER AND INFORMATION SCIENCES - ISCIS 2004, PROCEEDINGS, vol.3280, pp.761-770, 2004 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 3280
  • Publication Date: 2004
  • Journal Name: COMPUTER AND INFORMATION SCIENCES - ISCIS 2004, PROCEEDINGS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.761-770
  • Middle East Technical University Affiliated: Yes

Abstract

This paper describes a preprocessor for Turkish text that comprises several stages of lexical, morphological, and multi-word construct processing, intended to prepare Turkish text for various language engineering applications. We present the architecture of the system, with special emphasis on how various kinds of collocations and other similar multi-word constructs are handled, and present an evaluation on a test corpus.