Employing Named Entities for Semantic Retrieval of News Videos in Turkish


Kucuk D., YAZICI A.

24th International Symposium on Computer and Information Sciences, Güzelyurt, Cyprus (Kktc), 14 - 16 September 2009, pp.153-154

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/iscis.2009.5291836
  • City: Güzelyurt
  • Country: Cyprus (Kktc)
  • Page Numbers: pp.153-154

Abstract

Named entities are known to be an important means of semantic annotation for news texts. Considerable work has been carried out on semantic indexing of both textual news and news videos, especially in English, by employing named entities extracted from textual news or from transcriptions of the news videos. In this paper, we present our semantic retrieval architecture for news videos in Turkish, based on prior semantic annotation of the videos with the named entities found in the corresponding news transcription texts. We employ a rule-based named entity recognizer for Turkish which makes use of handcrafted sets of lexical resources and pattern bases. We compiled a small corpus of Turkish news videos, and the named entity recognizer in its current form achieves a success rate of about 75% on this corpus. A retrieval interface is implemented to access the video corpus through Boolean queries formed with the extracted named entities. The interface currently does not involve any ranking procedure: it displays all videos whose transcription texts satisfy the Boolean query posed through the interface, sorted by their broadcast date. The presented study is significant as the first to perform automatic semantic video annotation on a genuine news video corpus in Turkish and to demonstrate the use of the annotations through a retrieval interface.
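The unranked Boolean retrieval described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Video` record, the nested-tuple query form, the function names `matches` and `retrieve`, and the sample data are all hypothetical, and ascending date order is an assumption, since the abstract says only that results are sorted by broadcast date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Video:
    # Hypothetical record; field names are illustrative, not from the paper.
    video_id: str
    broadcast_date: date
    entities: frozenset  # named entities extracted from the transcription text


def matches(query, entities):
    """Evaluate a Boolean query against a set of named entities.

    A query is either a string (an entity name) or a tuple of the form
    ("AND", q1, q2, ...), ("OR", q1, q2, ...), or ("NOT", q).
    """
    if isinstance(query, str):
        return query in entities
    op, *args = query
    if op == "AND":
        return all(matches(a, entities) for a in args)
    if op == "OR":
        return any(matches(a, entities) for a in args)
    if op == "NOT":
        return not matches(args[0], entities)
    raise ValueError(f"unknown operator: {op!r}")


def retrieve(query, videos):
    """Return every matching video, unranked, ordered by broadcast date."""
    hits = [v for v in videos if matches(query, v.entities)]
    # Assumption: oldest first; the abstract does not state the direction.
    return sorted(hits, key=lambda v: v.broadcast_date)


# Made-up sample annotations, for illustration only.
videos = [
    Video("v1", date(2009, 3, 2), frozenset({"Ankara", "TBMM"})),
    Video("v2", date(2009, 1, 15), frozenset({"Ankara", "İstanbul"})),
    Video("v3", date(2009, 2, 20), frozenset({"İzmir"})),
]

narrow = [v.video_id for v in retrieve(("AND", "Ankara", ("NOT", "TBMM")), videos)]
broad = [v.video_id for v in retrieve(("OR", "Ankara", "İzmir"), videos)]
```

Because there is no scoring step, a video either satisfies the query or it does not, and the broadcast date alone determines the display order.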