DATA-DRIVEN IMAGE CAPTIONING WITH META-CLASS BASED RETRIEVAL

Kilickaya M., Erdem E., Erdem A., İKİZLER CİNBİŞ N., Cakici R.

22nd IEEE Signal Processing and Communications Applications Conference (SIU), Trabzon, Türkiye, 23-25 April 2014, pp. 1922-1925 (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
City of Publication: Trabzon
Country of Publication: Türkiye
Pages: pp. 1922-1925
Affiliated with Middle East Technical University: Yes

Abstract

Automatic image captioning, the process of producing a description for an image, is a very challenging problem that has only recently received interest from the computer vision and natural language processing communities. In this study, we present a novel data-driven image captioning strategy that, for a given image, finds the most visually similar image in a large dataset of image-caption pairs and transfers its caption as the description of the input image. Our novelty lies in employing a recently proposed high-level global image representation, named the meta-class descriptor, to better capture the semantic content of the input image for use in the retrieval process. Our experiments show that, compared to the baseline Im2Text model, our meta-class guided approach produces more accurate descriptions.
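
The retrieval-and-transfer step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each image has already been mapped to a fixed-length descriptor vector (standing in for the meta-class descriptor), and it assumes cosine similarity as the measure of visual similarity, which the abstract does not specify. The function names `cosine_similarity` and `transfer_caption` are hypothetical.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero descriptor as matching nothing
    return dot / (norm_a * norm_b)


def transfer_caption(
    query_descriptor: Sequence[float],
    dataset: List[Tuple[Sequence[float], str]],
) -> str:
    """Return the caption of the dataset image most similar to the query.

    `dataset` holds (descriptor, caption) pairs; the descriptors are assumed
    to be precomputed by some image representation, not computed here.
    """
    if not dataset:
        raise ValueError("dataset must contain at least one image-caption pair")
    _, best_caption = max(
        dataset,
        key=lambda pair: cosine_similarity(query_descriptor, pair[0]),
    )
    return best_caption
```

A real system would search a large collection, so an approximate nearest-neighbour index would likely replace the linear scan used here; the scan is kept only to make the caption transfer idea explicit.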