Neural Machine Translation (NMT) is a recently emerged paradigm for Machine Translation (MT) that has shown promising results and great potential for solving challenging MT tasks. One such task is providing good MT for languages with sparse training data. In this paper we investigate a Zero Shot Translation (ZST) approach for such language combinations. ZST is a multilingual translation mechanism that uses a single NMT engine to translate between multiple languages, including language pairs for which no direct parallel data was provided during training. After assessing the feasibility of ZST by training a proof-of-concept ZST engine on French↔English and Italian↔English data, we focus on languages with sparse training data. In particular, we address the Tamil↔Hindi language pair. Our analysis shows the potential and effectiveness of ZST in such scenarios. To train and translate with ZST engines, we extend the training and translation pipelines of a commercial MT provider, KantanMT, with ZST capabilities, making this technology available to all users of the platform.
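As an illustration of how a single engine can serve several language pairs, the sketch below prepares training data by prepending a target-language token to each source sentence, a common approach in multilingual NMT. The abstract does not state that this is the mechanism used in the paper, so the token format `<2xx>`, the function names `tag_source` and `build_training_set`, and the toy sentences are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def tag_source(sentence: str, target_lang: str) -> str:
    """Prepend a target-language token (hypothetical format '<2xx>')."""
    return f"<2{target_lang}> {sentence}"


def build_training_set(corpora: Dict[Pair, List[Pair]]) -> List[Pair]:
    """Merge several parallel corpora into one tagged training set.

    Each corpus is used in both directions, so a French-English corpus
    yields fr->en and en->fr examples for the same engine.
    """
    examples: List[Pair] = []
    for (src_lang, tgt_lang), pairs in corpora.items():
        for src, tgt in pairs:
            examples.append((tag_source(src, tgt_lang), tgt))
            examples.append((tag_source(tgt, src_lang), src))
    return examples


# Toy data (assumed): only French-English and Italian-English are parallel.
corpora = {
    ("fr", "en"): [("Bonjour le monde", "Hello world")],
    ("it", "en"): [("Ciao mondo", "Hello world")],
}
training_set = build_training_set(corpora)

# Zero-shot request: French source tagged for Italian output, a direction
# for which no training example exists in the merged set.
zero_shot_input = tag_source("Bonjour le monde", "it")
```

In this setup the engine never sees a French source paired with an Italian target during training; the zero-shot request simply combines a source language and a target token that were each seen separately.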
|Title of host publication|Proceedings of the MT Summit|
|Publication status|Published - 18 Sept 2017|