Jean Senellart


2020

pdf bib
Boosting Neural Machine Translation with Similar Translations
Jitao XU | Josep Crego | Jean Senellart
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper explores data augmentation methods for training Neural Machine Translation to make use of similar translations, in a comparable way a human translator employs fuzzy matches. In particular, we show how we can simply present the neural model with information of both source and target sides of the fuzzy matches, we also extend the similarity to include semantically related translations retrieved using sentence distributed representations. We show that translations based on fuzzy matching provide the model with “copy” information while translations based on embedding similarities tend to extend the translation “context”. Results indicate that the effect from both similar sentences are adding up to further boost accuracy, combine naturally with model fine-tuning and are providing dynamic adaptation for unseen translation pairs. Tests on multiple data sets and domains show consistent accuracy improvements. To foster research around these techniques, we also release an Open-Source toolkit with efficient and flexible fuzzy-match implementation.

pdf bib
Efficient and High-Quality Neural Machine Translation with OpenNMT
Guillaume Klein | Dakun Zhang | Clément Chouteau | Josep Crego | Jean Senellart
Proceedings of the Fourth Workshop on Neural Generation and Translation

This paper describes the OpenNMT submissions to the WNGT 2020 efficiency shared task. We explore training and acceleration of Transformer models with various sizes that are trained in a teacher-student setup. We also present a custom and optimized C++ inference engine that enables fast CPU and GPU decoding with few dependencies. By combining additional optimizations and parallelization techniques, we create small, efficient, and high-quality neural machine translation models.