Questions in the TED-Multilingual Discourse Bank and the development of an annotation scheme


ZEYREK BOZŞAHİN D., Mendes A.

Linguistics Vanguard, 2025 (AHCI, SSCI, Scopus)

  • Publication Type: Article / Full Article
  • Publication Date: 2025
  • DOI: 10.1515/lingvan-2025-0035
  • Journal Name: Linguistics Vanguard
  • Journal Indexes: Arts and Humanities Citation Index (AHCI), Social Sciences Citation Index (SSCI), Scopus, Linguistic Bibliography, MLA - Modern Language Association Database
  • Keywords: annotation, contrastive linguistics, monologic discourse, question-answer pairs, TED Talks
  • Affiliated with Middle East Technical University: Yes

Abstract

In this paper, we analyze question-answer pairs and stand-alone questions within the spoken discourse of TED Talks, specifically focusing on the TED-Multilingual Discourse Bank. Our aim is to reveal various characteristics of questions through an annotation approach. We have developed a taxonomy, referred to as the TAQ-TED, to categorize types of questions and examine their information transfer and dialogue control functions, drawing on the Dynamic Interpretation Theory++ taxonomy of dialogue acts and their attribution in accordance with the Penn Discourse Treebank annotation guidelines. We outline the taxonomy, present our annotation results, and provide a preliminary cross-linguistic analysis comparing English questions with their Turkish and Portuguese translations. The TAQ-TED represents a promising initial framework for annotating questions in monologic discourse across multiple languages.