Towards Zero-shot Sign Language Recognition


Creative Commons License

Bilge Y. C., CİNBİŞ R. G., Ikizler-Cinbis N.

IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.45, no.1, pp.1217-1232, 2023 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 45, Issue: 1
  • Publication Date: 2023
  • DOI: 10.1109/tpami.2022.3143074
  • Journal Name: IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, ABI/INFORM, Aerospace Database, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, EMBASE, INSPEC, MEDLINE, Metadex, zbMATH, Civil Engineering Abstracts
  • Pages: pp.1217-1232
  • Keywords: Sign language recognition, zero-shot learning, CLASSIFICATION, SYSTEMS
  • Affiliated with Middle East Technical University: Yes

Abstract

This paper tackles the problem of zero-shot sign language recognition (ZSSLR), where the goal is to leverage models learned over the seen sign classes to recognize the instances of unseen sign classes. In this context, readily available textual sign descriptions and attributes collected from sign language dictionaries are utilized as semantic class representations for knowledge transfer. For this novel problem setup, we introduce three benchmark datasets with their accompanying textual and attribute descriptions to analyze the problem in detail. Our proposed approach builds spatiotemporal models of body and hand regions. By leveraging the descriptive text and attribute embeddings along with these visual representations within a zero-shot learning framework, we show that textual and attribute-based class definitions can provide effective knowledge for the recognition of previously unseen sign classes. We additionally introduce techniques to analyze the influence of binary attributes in correct and incorrect zero-shot predictions. We anticipate that the introduced approaches and the accompanying datasets will provide a basis for further exploration of zero-shot learning in sign language recognition.
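
As a rough illustration of the zero-shot setup the abstract describes, the sketch below scores each video against the semantic embeddings of unseen classes and picks the best match. It uses a bilinear compatibility function, which is a common zero-shot learning formulation and is only an assumption here, since the abstract does not state the exact model. The names `compatibility_scores`, `predict_unseen`, and `W`, as well as the toy feature values, are hypothetical; in practice `W` would be learned on the seen classes and the features would come from the body and hand models.

```python
import numpy as np


def compatibility_scores(video_features, W, class_embeddings):
    """Bilinear compatibility x^T W phi(c) for every (video, class) pair.

    video_features:   (n_videos, d_visual)   visual representations
    W:                (d_visual, d_semantic) mapping assumed to be learned on seen classes
    class_embeddings: (n_classes, d_semantic) text or attribute embeddings
    """
    return video_features @ W @ class_embeddings.T


def predict_unseen(video_features, W, unseen_class_embeddings):
    """Assign each video to the unseen class with the highest compatibility."""
    scores = compatibility_scores(video_features, W, unseen_class_embeddings)
    return scores.argmax(axis=1)


# Toy data (hypothetical): an identity mapping and two unseen classes.
W = np.eye(3)
unseen_class_embeddings = np.array([[1.0, 0.0, 0.0],
                                    [0.0, 1.0, 0.0]])
video_features = np.array([[0.9, 0.1, 0.0],
                           [0.2, 0.7, 0.1]])

predictions = predict_unseen(video_features, W, unseen_class_embeddings)
print(predictions)
```

No unseen-class video is needed to build `unseen_class_embeddings`; only the class descriptions are required, which is what makes the knowledge transfer possible.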