Towards Zero-shot Sign Language Recognition

Bilge, Yunus; CİNBİŞ, RAMAZAN; Ikizler-Cinbis, Nazli

doi:10.1109/tpami.2022.3143074

Bilge Y. C., CİNBİŞ R. G., Ikizler-Cinbis N.

IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.45, no.1, pp.1217-1232, 2023 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 45 Issue: 1
Publication Date: 2023
DOI: 10.1109/tpami.2022.3143074
Journal Name: IEEE Transactions on Pattern Analysis and Machine Intelligence
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, ABI/INFORM, Aerospace Database, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, EMBASE, INSPEC, MEDLINE, Metadex, zbMATH, Civil Engineering Abstracts
Page Numbers: pp.1217-1232
Keywords: Sign language recognition, zero-shot learning, CLASSIFICATION, SYSTEMS
Open Archive Collection: AVESİS Open Access Collection
Affiliated with Middle East Technical University: Yes

Abstract

This paper tackles the problem of zero-shot sign language recognition (ZSSLR), where the goal is to leverage models learned over the seen sign classes to recognize the instances of unseen sign classes. In this context, readily available textual sign descriptions and attributes collected from sign language dictionaries are utilized as semantic class representations for knowledge transfer. For this novel problem setup, we introduce three benchmark datasets with their accompanying textual and attribute descriptions to analyze the problem in detail. Our proposed approach builds spatiotemporal models of body and hand regions. By leveraging the descriptive text and attribute embeddings along with these visual representations within a zero-shot learning framework, we show that textual and attribute-based class definitions can provide effective knowledge for the recognition of previously unseen sign classes. We additionally introduce techniques to analyze the influence of binary attributes in correct and incorrect zero-shot predictions. We anticipate that the introduced approaches and the accompanying datasets will provide a basis for further exploration of zero-shot learning in sign language recognition.
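
The general idea of zero-shot recognition described in the abstract can be illustrated with a small sketch. It is not the authors' model: the abstract does not specify the scoring function, so the linear projection `W`, the cosine-similarity scoring, and every name and number below are assumptions chosen only for illustration. A visual feature vector is mapped into the semantic space, and the unseen class whose text or attribute embedding lies closest is chosen.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def project(W, x):
    """Map a visual feature x into the semantic space with a matrix W (list of rows)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def predict_unseen(W, x, class_embeddings):
    """Return the unseen class whose semantic embedding best matches the projected feature."""
    z = project(W, x)
    return max(class_embeddings, key=lambda name: cosine(z, class_embeddings[name]))


# Hypothetical projection; in practice it would be learned on the seen classes only.
W_demo = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]

# Hypothetical semantic embeddings of two unseen sign classes
# (e.g. derived from dictionary descriptions or binary attributes).
unseen_demo = {"sign_a": [1.0, 0.0], "sign_b": [0.0, 1.0]}

prediction = predict_unseen(W_demo, [0.9, 0.1, 0.5], unseen_demo)
```

Because the unseen classes enter only through their semantic embeddings, no video examples of those classes are required at training time, which is the property the zero-shot setting relies on.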