Event Extraction from Turkish Football Web-casting Texts Using Hand-crafted Templates


Tunaoglu D., Alan O., Sabuncu O., Akpinar S., Cicekli N. K., Alpaslan F. N.

3rd International Conference on Semantic Computing (ICSC 2009), California, United States of America, 14 - 16 September 2009, pp.466-467

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI Number: 10.1109/icsc.2009.16
  • City of Publication: California
  • Country of Publication: United States of America
  • Page Numbers: pp.466-467
  • Affiliated with Middle East Technical University: Yes

Abstract

In this paper, we present a domain-specific information extraction approach. We use manually formed templates to extract information from unstructured documents in which grammatical and syntactic errors occur frequently. We applied our approach primarily to unstructured Turkish soccer web-casting texts. Compared to automated approaches, we achieve high precision and recall rates (97% and 85%, respectively). In addition, unlike automated approaches, we do not use part-of-speech taggers, parsers, phrase chunkers, or similar linguistic tools. As a result, our approach can be applied to any domain or language without requiring reliable linguistic tools. The drawback of our approach is the time spent on crafting the templates. We also propose means to reduce that time.
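
The abstract does not reproduce the templates themselves, so the following is only a minimal sketch of what template-based extraction without linguistic tools might look like, together with the precision and recall computation used to score it. Everything in it is an assumption made for illustration: the `TEMPLATES` list, the event names, the English sample lines, and the helper functions `extract_events` and `precision_recall` are hypothetical and are not taken from the paper, which works on Turkish texts.

```python
import re

# Hypothetical hand-crafted templates: each pairs an event type with a
# regular expression whose named groups act as the template's slots.
# These patterns are illustrative only, not the templates from the paper.
TEMPLATES = [
    ("goal", re.compile(
        r"(?P<minute>\d+)'\s+goal by (?P<player>[A-Za-z]+)", re.IGNORECASE)),
    ("yellow_card", re.compile(
        r"(?P<minute>\d+)'\s+yellow card for (?P<player>[A-Za-z]+)", re.IGNORECASE)),
]


def extract_events(lines):
    """Return one event dict per line that matches a template."""
    events = []
    for line in lines:
        for event_type, pattern in TEMPLATES:
            match = pattern.search(line)
            if match:
                events.append({"type": event_type, **match.groupdict()})
                break  # first matching template wins for this line
    return events


def precision_recall(extracted, gold):
    """Compare extracted events with a gold standard by exact match."""
    extracted_set = {tuple(sorted(e.items())) for e in extracted}
    gold_set = {tuple(sorted(g.items())) for g in gold}
    true_positives = len(extracted_set & gold_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


if __name__ == "__main__":
    sample = [
        "12' Goal by Alex",
        "30' yellow card for Emre",
        "45' end of first half",
    ]
    for event in extract_events(sample):
        print(event)
```

Because the patterns match surface strings directly, no tagger or parser is involved; the cost moves to writing and maintaining the patterns, which is the drawback the abstract names.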