Employing Named Entities for Semantic Retrieval of News Videos in Turkish


Kucuk D., YAZICI A.

24th International Symposium on Computer and Information Sciences, Güzelyurt, Cyprus (TRNC), 14 - 16 September 2009, pp.153-154

  • Publication Type: Conference Paper / Full Text Paper
  • DOI Number: 10.1109/iscis.2009.5291836
  • City of Publication: Güzelyurt
  • Country of Publication: Cyprus (TRNC)
  • Pages: pp.153-154
  • Middle East Technical University Affiliated: Yes

Abstract

Named entities are known to be an important means for semantic annotation of news texts. Considerable work has been carried out on semantic indexing of both textual news and news videos, especially in English, by employing named entities extracted from textual news or from transcriptions of the news videos. In this paper, we present our semantic retrieval architecture for news videos in Turkish, which is based on prior semantic annotation of the videos with the named entities found in the corresponding news transcription texts. We employ a rule-based named entity recognizer for Turkish which makes use of handcrafted sets of lexical resources and pattern bases. We compiled a small corpus of Turkish news videos, and the named entity recognizer in its current form achieves a success rate of about 75% on this corpus. A retrieval interface is implemented to access the video corpus through boolean queries formed with the extracted named entities. The interface currently does not involve any ranking procedure: it displays all the videos whose transcription texts satisfy the boolean query posed through the interface, sorted by their broadcast date. The presented study is significant as the first to perform automatic semantic video annotation on a genuine news video corpus in Turkish and to demonstrate the utilization of the annotations through a retrieval interface.
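The retrieval step described in the abstract (boolean queries over extracted named entities, no ranking, results ordered by broadcast date) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Video` record, the tuple-based query encoding, the `matches` and `retrieve` helpers, and the sample clips and entities are all hypothetical, and ascending date order is an assumption since the abstract does not state the sort direction.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Video:
    # Hypothetical record: one news video with the named entities
    # assumed to have been extracted from its transcription text.
    title: str
    broadcast_date: date
    entities: frozenset


def matches(query, entities):
    """Evaluate a boolean query against a set of named entities.

    A query is either an entity string or a tuple whose first element
    is "and", "or", or "not" (an encoding assumed for this sketch).
    """
    if isinstance(query, str):
        return query in entities
    op, *args = query
    if op == "and":
        return all(matches(a, entities) for a in args)
    if op == "or":
        return any(matches(a, entities) for a in args)
    if op == "not":
        return not matches(args[0], entities)
    raise ValueError(f"unknown operator: {op}")


def retrieve(query, videos):
    # No ranking: keep every video that satisfies the query and
    # order the hits by broadcast date (ascending is an assumption).
    hits = [v for v in videos if matches(query, v.entities)]
    return sorted(hits, key=lambda v: v.broadcast_date)


# Made-up sample data, for illustration only.
videos = [
    Video("clip-a", date(2009, 3, 2), frozenset({"Ankara", "TBMM"})),
    Video("clip-b", date(2009, 1, 15), frozenset({"Ankara", "İstanbul"})),
    Video("clip-c", date(2009, 2, 1), frozenset({"İstanbul"})),
]

for video in retrieve(("or", "Ankara", "İstanbul"), videos):
    print(video.broadcast_date, video.title)
```

Because a video either satisfies the query or does not, the only ordering signal in this sketch is the broadcast date, which mirrors the unranked behaviour the abstract describes.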