One-Size-Fits-All Multilingual Models
Ben Peters, André F. T. Martins
Abstract
This paper presents DeepSPIN’s submissions to Tasks 0 and 1 of the SIGMORPHON 2020 Shared Task. For both tasks, we present multilingual models, training jointly on data in all languages. We perform no language-specific hyperparameter tuning – each of our submissions uses the same model for all languages. Our basic architecture is the sparse sequence-to-sequence model with entmax attention and loss, which allows our models to learn sparse, local alignments while still being trainable with gradient-based techniques. For Task 1, we achieve strong performance with both RNN- and transformer-based sparse models. For Task 0, we extend our RNN-based model to a multi-encoder set-up in which separate modules encode the lemma and inflection sequences. Despite our models’ lack of language-specific tuning, they tie for first in Task 0 and place third in Task 1.
- Anthology ID: 2020.sigmorphon-1.4
- Volume: Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
- Month: July
- Year: 2020
- Address: Online
- Venues: ACL | SIGMORPHON | WS
- SIG: SIGMORPHON
- Publisher: Association for Computational Linguistics
- Pages: 63–69
- URL: https://www.aclweb.org/anthology/2020.sigmorphon-1.4
- PDF: https://www.aclweb.org/anthology/2020.sigmorphon-1.4.pdf
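The core mechanism named in the abstract is attention computed with an entmax transformation instead of softmax, which can assign exactly zero weight to some source positions and thus yields sparse, local alignments. DeepSPIN's open-source `entmax` library provides these transformations; the sketch below is not the authors' code and, for brevity, implements sparsemax (the alpha=2 member of the entmax family) in plain PyTorch rather than the 1.5-entmax variant, just to show how a sparse attention step produces exact zeros. The `sparsemax` helper and the toy query/key tensors are illustrative assumptions.

```python
import torch


def sparsemax(scores: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Euclidean projection of `scores` onto the probability simplex
    (the alpha=2 member of the entmax family); some outputs are exactly zero."""
    z, _ = torch.sort(scores, dim=dim, descending=True)
    cumsum = z.cumsum(dim) - 1.0
    k = torch.arange(1, scores.size(dim) + 1, device=scores.device, dtype=scores.dtype)
    shape = [1] * scores.dim()
    shape[dim] = -1
    k = k.view(shape)                          # broadcastable position index 1..n
    support = k * z > cumsum                   # coordinates that stay nonzero
    k_z = support.sum(dim=dim, keepdim=True)   # size of the support
    tau = cumsum.gather(dim, k_z - 1) / k_z.to(scores.dtype)
    return torch.clamp(scores - tau, min=0.0)


# Toy attention step: one decoder state attends over four encoder states.
torch.manual_seed(0)
query = torch.randn(1, 8)
keys = torch.randn(4, 8)
scores = query @ keys.T                        # (1, 4) attention scores
probs = sparsemax(scores)                      # sums to 1, may contain exact zeros
print(probs, probs.sum())
```

Unlike softmax, whose weights are always strictly positive, the output here can place zero mass on irrelevant encoder states while remaining differentiable almost everywhere, which is what allows the models described above to be trained with standard gradient-based techniques.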