Divide and conquer strategy for large data MT

Dimitar Shterionov*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

In recent years, Statistical Machine Translation (SMT) has established a dominant position among machine translation paradigms. Industrial machine translation systems, such as KantanMT, deliver fast, high-performance SMT solutions to the end user. KantanMT is a cloud-based platform that allows its users to build custom SMT engines and use them for translation in batch or online mode. To exploit the full potential of the cloud, we have developed an efficient method for asynchronous online translation. This method implements a producer-consumer technique that uses multiple queues as intermediate data storage units. Each queue is associated with a priority that defines how quickly the queue can be consumed. This gives our users control over the flow of translation requests, especially for large amounts of data. In this paper we describe the design and implementation of the new method and compare it to others. We then assess the improvement in the quality of service of our platform through empirical evaluation.
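
The producer-consumer arrangement with prioritised queues can be illustrated with a minimal Python sketch. It is not the KantanMT implementation: the priority names, the weights, the weighted round-robin policy and the `translate` placeholder are all assumptions made for illustration, since the abstract does not specify how a priority maps to consumption speed.

```python
import queue
import threading

# Hypothetical priority levels. The weight is the maximum number of requests
# taken from a queue per scheduling round (an assumed policy).
WEIGHTS = {"high": 3, "normal": 2, "low": 1}


def translate(segment):
    """Placeholder for a call to an SMT engine (hypothetical)."""
    return segment.upper()


def produce(queues, priority, segments):
    """Producer: enqueue each segment on the queue for its priority."""
    for segment in segments:
        queues[priority].put(segment)


def consume(queues, results, done):
    """Consumer: serve the queues in weighted round-robin order."""
    while True:
        # Read the flag before scanning so that nothing enqueued before
        # done.set() can be missed.
        finishing = done.is_set()
        served = False
        for name, weight in WEIGHTS.items():
            for _ in range(weight):
                try:
                    segment = queues[name].get_nowait()
                except queue.Empty:
                    break
                results.append((name, translate(segment)))
                served = True
        if not served:
            if finishing:
                return
            done.wait(0.01)  # idle briefly until more work or shutdown


def run_demo():
    """Queue a few requests, then let one consumer thread drain them."""
    queues = {name: queue.Queue() for name in WEIGHTS}
    results, done = [], threading.Event()
    produce(queues, "low", ["l1", "l2"])
    produce(queues, "high", ["h1", "h2", "h3", "h4"])
    done.set()  # all requests are queued; the consumer may exit once drained
    worker = threading.Thread(target=consume, args=(queues, results, done))
    worker.start()
    worker.join()
    return [text for _, text in results]
```

In this sketch the requests are queued before the consumer starts, which keeps the order reproducible; in an asynchronous service the producers would run concurrently with the consumer. A higher weight lets a queue drain faster without starving the lower-priority queues, which is one plausible reading of "how quickly the queue can be consumed".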

Original language: English
Title of host publication: MT Users' Track
Editors: Olga Beregovaya, Jennifer Doyon, Lucie Langlois, Steve Richardson
Publisher: Association for Machine Translation in the Americas
Pages: 114-122
Number of pages: 9
ISBN (Electronic): 9780000000002
Publication status: Published - 2016
Externally published: Yes
Event: 12th Conference of the Association for Machine Translation in the Americas, AMTA 2016 - Austin, United States
Duration: 28 Oct 2016 to 1 Nov 2016

Publication series

Name: Proceedings - AMTA 2016: 12th Conference of the Association for Machine Translation in the Americas
Volume: 2

Conference

Conference: 12th Conference of the Association for Machine Translation in the Americas, AMTA 2016
Country: United States
City: Austin
Period: 28/10/16 to 1/11/16

  • Cite this

    Shterionov, D. (2016). Divide and conquer strategy for large data MT. In O. Beregovaya, J. Doyon, L. Langlois, & S. Richardson (Eds.), MT Users' Track (pp. 114-122). (Proceedings - AMTA 2016: 12th Conference of the Association for Machine Translation in the Americas; Vol. 2). Association for Machine Translation in the Americas.