Low-Resource G2P and P2G Conversion with Synthetic Training Data
Bradley Hauer, Amir Ahmad Habibi, Yixing Luan, Arnob Mallik, Grzegorz Kondrak
Abstract
This paper presents the University of Alberta systems and results in the SIGMORPHON 2020 Task 1: Multilingual Grapheme-to-Phoneme Conversion. Following previous SIGMORPHON shared tasks, we define a low-resource setting with 100 training instances. We experiment with three transduction approaches in both standard and low-resource settings, as well as on the related task of phoneme-to-grapheme conversion. We propose a method for synthesizing training data using a combination of diverse models.
- Anthology ID:
- 2020.sigmorphon-1.12
- Volume:
- Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Venues:
- ACL | SIGMORPHON | WS
- SIG:
- SIGMORPHON
- Publisher:
- Association for Computational Linguistics
- Pages:
- 117–122
- URL:
- https://www.aclweb.org/anthology/2020.sigmorphon-1.12
- PDF:
- https://www.aclweb.org/anthology/2020.sigmorphon-1.12.pdf