TY - GEN
T1 - Quality Estimation-Assisted Automatic Post-Editing
AU - Deoghare, Sourabh
AU - Kanojia, Diptesh
AU - Ranasinghe, Tharindu
AU - Blain, Frédéric
AU - Bhattacharyya, Pushpak
N1 - Publisher Copyright:
© 2023 Association for Computational Linguistics.
PY - 2023
Y1 - 2023
N2 - Automatic Post-Editing (APE) systems are prone to over-correction of the Machine Translation (MT) outputs. While a Word-level Quality Estimation (QE) system can provide a way to curtail the over-correction, a significant performance gain has not been observed thus far by utilizing existing APE and QE combination strategies. This paper proposes joint training of a model over QE (sentence- and word-level) and APE tasks to improve the APE. Our proposed approach utilizes a multi-task learning (MTL) methodology, which shows significant improvement while treating the tasks as a 'bargaining game' during training. Moreover, we investigate various existing combination strategies and show that our approach achieves state-of-the-art performance for a 'distant' language pair, viz., English-Marathi. We observe an improvement of 1.09 TER and 1.37 BLEU points over a baseline QE-Unassisted APE system for English-Marathi while also observing 0.46 TER and 0.62 BLEU points improvement for English-German. Further, we discuss the results qualitatively and show how our approach helps reduce over-correction, thereby improving the APE performance. We also observe that the degree of integration between QE and APE directly correlates with the APE performance gain. We release our code publicly.
UR - http://www.scopus.com/inward/record.url?scp=85183308519&partnerID=8YFLogxK
U2 - 10.18653/v1/2023.findings-emnlp.115
DO - 10.18653/v1/2023.findings-emnlp.115
M3 - Conference contribution
AN - SCOPUS:85183308519
T3 - Findings of the Association for Computational Linguistics: EMNLP 2023
SP - 1686
EP - 1698
BT - Findings of the Association for Computational Linguistics: EMNLP 2023
PB - Association for Computational Linguistics (ACL)
T2 - 2023 Findings of the Association for Computational Linguistics: EMNLP 2023
Y2 - 6 December 2023 through 10 December 2023
ER -