On integrating a language model into neural machine translation

Gulcehre, Caglar; Firat, Orhan; Xu, Kelvin; Cho, Kyunghyun; Bengio, Yoshua

doi:10.1016/j.csl.2017.01.014

COMPUTER SPEECH AND LANGUAGE, vol. 45, pp. 137-148, 2017 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 45
Publication Date: 2017
DOI: 10.1016/j.csl.2017.01.014
Journal Name: COMPUTER SPEECH AND LANGUAGE
Indexes Covering the Journal: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: pp. 137-148
Keywords: Neural machine translation, Monolingual data, Language models, Low resource machine translation, Deep learning, Neural network
Affiliated with Middle East Technical University: Yes

Abstract

Recent advances in end-to-end neural machine translation models have achieved promising results on high-resource language pairs such as En -> Fr and En -> De. One of the major factors behind these successes is the availability of high-quality parallel corpora. We explore two strategies for leveraging abundant monolingual data in neural machine translation. We observe improvements both from combining the scores of a neural language model, trained only on target-side monolingual data, with those of a neural machine translation model, and from fusing the hidden states of these two models. We obtain an improvement of up to 2 BLEU over hierarchical and phrase-based baselines on a low-resource language pair, Turkish -> English. Our method was initially motivated by tasks with less parallel data, but we also show that it extends to high-resource language pairs such as the Cs -> En and De -> En translation tasks, where we obtain improvements of 0.39 and 0.47 BLEU over the neural machine translation baselines, respectively. (C) 2017 Elsevier Ltd. All rights reserved.
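
The first strategy named in the abstract, combining language model scores with translation model scores, can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption made for illustration: the weighted sum of log-probabilities, the weight name lm_weight and its value, the function names, and the toy vocabulary and logits. It is not the authors' implementation.

```python
import math


def log_softmax(scores):
    """Turn raw scores into log-probabilities (numerically stable)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def combine_scores(tm_logits, lm_logits, lm_weight=0.2):
    """Assumed combination rule: log p_TM(y) + lm_weight * log p_LM(y).

    lm_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    tm_logp = log_softmax(tm_logits)
    lm_logp = log_softmax(lm_logits)
    return [t + lm_weight * l for t, l in zip(tm_logp, lm_logp)]


def pick_next_token(vocab, tm_logits, lm_logits, lm_weight=0.2):
    """Greedy choice of the next target word under the combined score."""
    combined = combine_scores(tm_logits, lm_logits, lm_weight)
    best = max(range(len(vocab)), key=combined.__getitem__)
    return vocab[best]


# Toy decoding step: the translation model is nearly undecided between
# two words, while the language model clearly prefers the second one.
vocab = ["house", "home", "hose"]
tm_logits = [2.0, 1.9, 0.1]
lm_logits = [0.1, 3.0, 0.1]

choice_without_lm = pick_next_token(vocab, tm_logits, lm_logits, lm_weight=0.0)
choice_with_lm = pick_next_token(vocab, tm_logits, lm_logits, lm_weight=0.2)
```

With lm_weight set to 0 the choice depends on the translation model alone; a positive weight lets the language model, which sees only target-side text, break near-ties. The second strategy in the abstract, fusing hidden states, operates inside the network and is not covered by this sketch.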