Usage disambiguation of Turkish discourse connectives


Başıbüyük K., ZEYREK BOZŞAHİN D.

Language Resources and Evaluation, vol.57, no.1, pp.223-256, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 57 Issue: 1
  • Publication Date: 2023
  • Doi Number: 10.1007/s10579-022-09614-3
  • Journal Name: Language Resources and Evaluation
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, FRANCIS, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Computer & Applied Sciences, EBSCO Education Source, Educational research abstracts (ERA), Humanities Abstracts, INSPEC, Linguistic Bibliography, Linguistics & Language Behavior Abstracts, Metadex, MLA - Modern Language Association Database, Civil Engineering Abstracts
  • Page Numbers: pp.223-256
  • Keywords: Discourse processing, Discourse connectives, Usage disambiguation, Connective lexicon, Turkish, LANGUAGE
  • Middle East Technical University Affiliated: Yes

Abstract

© 2023, The Author(s), under exclusive licence to Springer Nature B.V. This paper describes a rule-based approach and a machine learning approach to disambiguating the discourse usage of connectives in Turkish, which has not only single and phrasal connectives, as most languages do, but also suffixal connectives that largely correspond to subordinating conjunctions in English. Since these connectives have different linguistic characteristics, two sets of linguistic rules are devised to disambiguate their discourse usage. The linguistic rules are used in the rule-based approach and employed as feature sets in the machine learning approach to test whether they influence the decisions of our algorithms. The results of both approaches are evaluated over the Turkish section of TED-Multilingual Discourse Bank and Turkish Discourse Bank 1.1, two datasets annotated in the Penn Discourse TreeBank style. The paper attests to the predictive power of the linguistic rules in disambiguating the discourse usage of both types of connectives, and offers new knowledge and insights for discourse processing from the perspective of a morphologically rich language.