Connectionist multi-sequence modelling and applications to multilingual neural machine translation


Thesis Type: Doctoral

Institution: Orta Doğu Teknik Üniversitesi (Middle East Technical University), Faculty of Engineering, Department of Computer Engineering, Turkey

Approval Date: 2017

Student: ORHAN FIRAT

Supervisor: FATOŞ TUNAY YARMAN VURAL

Abstract:

Deep (recurrent) neural networks have been shown to successfully learn complex mappings between input and output sequences of arbitrary length, called sequence-to-sequence learning, within the effective framework of encoder-decoder networks. This thesis investigates extensions of sequence-to-sequence models that handle multiple sequences at the same time within a single parametric model, and proposes the first large-scale connectionist multi-sequence modeling approach. The proposed multi-sequence modeling architecture learns to map a set of input sequences into a set of output sequences through the explicit parametrization of a shared medium, an interlingua. The proposed architectures are applied to machine translation, tackling the problem of multilingual neural machine translation (MLNMT). We explore the applicability and benefits of MLNMT on (1) large-scale machine translation tasks between ten language pairs within the same model; (2) low-resource language transfer problems, where the data for any given pair is scarce, measuring the transfer learning capabilities of the model; (3) multi-source translation tasks, where multi-way parallel data is available and complementary information between input sequences is leveraged while mapping them into a single output sequence; and finally (4) zero-resource translation tasks, where no aligned data is available between a given pair of source and target sequences.
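To make the shared-medium idea concrete, the toy sketch below routes every translation direction through a common concept space. This is a hypothetical illustration, not the thesis architecture: the names (`LEXICON`, `encode`, `decode`, `translate`) and the table-lookup "encoders" are invented stand-ins for the learned neural components. The point it demonstrates is the one stated in the abstract: with a shared interlingua, N languages need only N encoders and N decoders rather than a separate system per language pair.

```python
# Toy illustration of multi-sequence modelling through a shared medium
# ("interlingua"). Hypothetical sketch, not the thesis implementation:
# each language has one encoder (word -> shared concept id) and one
# decoder (concept id -> word), so any-to-any translation composes an
# encoder with a decoder instead of training N*(N-1) pairwise systems.

# Per-language lexicons mapping surface words to shared concept ids.
LEXICON = {
    "en": {"the": 0, "cat": 1, "sleeps": 2},
    "fr": {"le": 0, "chat": 1, "dort": 2},
    "de": {"die": 0, "katze": 1, "schlaeft": 2},
}

def encode(lang, sentence):
    """Encoder: map a tokenized sentence into the shared concept space."""
    return [LEXICON[lang][word] for word in sentence.split()]

def decode(lang, concepts):
    """Decoder: map shared concepts back to surface words of `lang`."""
    inverse = {cid: word for word, cid in LEXICON[lang].items()}
    return " ".join(inverse[c] for c in concepts)

def translate(src_lang, tgt_lang, sentence):
    """Any-to-any translation by routing through the shared medium."""
    return decode(tgt_lang, encode(src_lang, sentence))
```

Because the English-to-French and French-to-German directions reuse the same shared concept space, a direction never seen during "training" of the toy lexicons still works, loosely mirroring the zero-resource setting in point (4) above.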