2024 IEEE International Conference on Consumer Electronics, ICCE 2024, Nevada, United States, 6-8 January 2024
In this study, we demonstrate how state-of-the-art baseline image captioning methods overlook important details in the image, and we analyze the reasons behind this problem. We propose a novel approach, named RICOA (RIch Captioning with Object Attributes), which integrates object attributes into the generated captions. Our analyses demonstrate that the proposed approach generates richer and more visually grounded captions by successfully incorporating the attributes of the objects in the scene.