Neural Machine Translation (NMT) has recently gained substantial popularity, not only in academia but also in industry. In the present work, we compare the quality of Phrase-Based Statistical Machine Translation (PBSMT) and NMT solutions of a commercial platform for Custom Machine Translation (CMT) that are tailored to accommodate large-scale translation production. In a large-scale translation production line, the time available to train an end-to-end system (NMT or PBSMT) is limited. Our work therefore compares NMT systems trained under a time restriction of 4 days with PBSMT systems. To train both the NMT and the PBSMT engines for each language pair, we use strictly the same parallel corpora and show that, even when trained within this time limit, NMT substantially surpasses PBSMT in quality. Furthermore, we challenge the reliability of automatic quality evaluation metrics (in particular, BLEU) for NMT quality evaluation. We support our hypothesis with both analytical and empirical evidence.
|Number of pages||6|
|Publication status||Published - 2017|
|Event||20th Annual Conference of the European Association for Machine Translation, EAMT 2017 - Prague, Czech Republic (29 May 2017 → 31 May 2017)|