Empirical evaluation of NMT and PBSMT quality for large-scale translation production

Dimitar Shterionov, Pat Nagle, Laura Casanellas, Riccardo Superbo, Tony O'Dowd

Research output: Contribution to conference › Paper › Other research output

16 Citations (Scopus)

Abstract

Neural Machine Translation (NMT) has recently gained substantial popularity not only in academia but also in industry. In this work, we compare the quality of the Phrase-Based Statistical Machine Translation (PBSMT) and NMT solutions of a commercial Custom Machine Translation (CMT) platform tailored to large-scale translation production. In such a production line, the time available to train an end-to-end system (NMT or PBSMT) is limited. Our work therefore compares NMT systems trained under a time restriction of 4 days with PBSMT systems. For each language pair, we train the NMT and PBSMT engines on exactly the same parallel corpora and show that, even within this time limit, NMT quality substantially surpasses that of PBSMT. Furthermore, we challenge the reliability of automatic quality evaluation metrics, BLEU in particular, for NMT quality evaluation, and we support our hypothesis with both analytical and empirical evidence.
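The abstract questions BLEU as a proxy for NMT quality. As a minimal, purely illustrative sketch of how a corpus-level BLEU score of the kind being challenged is typically computed (using the sacrebleu library, not the authors' evaluation pipeline; the hypothesis and reference sentences are made up):

```python
# Minimal sketch of corpus-level BLEU scoring with sacrebleu.
# The strings below are invented examples, not data from the paper.
import sacrebleu

hypotheses = [
    "the cat sat on the mat",
    "neural models translate this sentence fluently",
]
references = [
    "the cat sat on the mat",
    "neural models translate this sentence very fluently",
]

# sacrebleu takes a list of hypothesis strings and a list of reference streams.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```

A single corpus-level number like this is what the paper argues can understate differences between NMT and PBSMT output.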

Original language: English
Pages: 74-79
Number of pages: 6
Publication status: Published - 2017
Externally published: Yes
Event: 20th Annual Conference of the European Association for Machine Translation, EAMT 2017 - Prague, Czech Republic
Duration: 29 May 2017 - 31 May 2017

Conference

Conference: 20th Annual Conference of the European Association for Machine Translation, EAMT 2017
Country/Territory: Czech Republic
City: Prague
Period: 29/05/17 - 31/05/17
