2020
To compress or not to compress? A Finite-State approach to Nen verbal morphology
Saliha Muradoglu | Nicholas Evans | Hanna Suominen
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
This paper describes the development of a verbal morphological parser for Nen, an under-resourced Papuan language. Nen verbal morphology is particularly complex, with a transitive verb taking up to 1,740 unique features. The structural properties exhibited by Nen verbs raise interesting choices for analysis. Here we compare two possible methods of analysis: ‘chunking’ and decomposition. ‘Chunking’ refers to collating morphological segments into one unit, whereas the decomposition model follows a more classical linguistic approach. Both models are built using the finite-state transducer toolkit foma. The resulting architectures differ in size and structural clarity: while the ‘chunking’ model is under half the size of the fully decomposed counterpart, the decomposition displays higher structural order. In this paper, we describe the challenges encountered when modelling a language exhibiting distributed exponence and present the first morphological analyser for Nen, with an overall accuracy of 80.3%.
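The chunking-versus-decomposition contrast in the abstract can be sketched in miniature. The forms, stem, and glosses below are invented for illustration (they are not actual Nen data, and the real analyser is a foma transducer, not Python): decomposition assigns one feature per morpheme and segments the suffix string piece by piece, while chunking maps each fused suffix string directly to a feature bundle — fewer rules, but less internal structure.

```python
# Toy contrast of 'chunking' vs. decomposition in morphological analysis.
# All forms and glosses are hypothetical placeholders.

STEMS = {"weam": "play"}                # invented stem
SUFFIXES = {"te": "ND", "a": "2|3SG"}   # decomposition: one feature per morpheme
CHUNKS = {"tea": "ND.2|3SG"}            # chunking: fused suffix -> feature bundle

def analyze_decomposed(word):
    """Segment the suffix string morpheme by morpheme."""
    for stem, gloss in STEMS.items():
        if word.startswith(stem):
            rest, feats = word[len(stem):], []
            while rest:
                for suffix, feature in SUFFIXES.items():
                    if rest.startswith(suffix):
                        feats.append(feature)
                        rest = rest[len(suffix):]
                        break
                else:
                    return None          # unanalysable residue
            return gloss, feats
    return None

def analyze_chunked(word):
    """Match the whole fused suffix string in one step."""
    for stem, gloss in STEMS.items():
        suffix = word[len(stem):]
        if word.startswith(stem) and suffix in CHUNKS:
            return gloss, CHUNKS[suffix]
    return None

print(analyze_decomposed("weamtea"))  # ('play', ['ND', '2|3SG'])
print(analyze_chunked("weamtea"))     # ('play', 'ND.2|3SG')
```

Both analyses recover the same information; the trade-off the paper reports — a smaller grammar for chunking versus clearer structure for decomposition — mirrors the difference between the single `CHUNKS` lookup and the per-morpheme loop.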
Linguist vs. Machine: Rapid Development of Finite-State Morphological Grammars
Sarah Beemer | Zak Boston | April Bukoski | Daniel Chen | Princess Dickens | Andrew Gerlach | Torin Hopkins | Parth Anand Jawale | Chris Koski | Akanksha Malhotra | Piyush Mishra | Saliha Muradoglu | Lan Sang | Tyler Short | Sagarika Shreevastava | Elizabeth Spaulding | Testumichi Umada | Beilei Xiang | Changbing Yang | Mans Hulden
Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
Sequence-to-sequence models have proven highly successful in learning morphological inflection from examples, as the series of SIGMORPHON/CoNLL shared tasks has shown. It is usually assumed, however, that a linguist working with inflectional examples could in principle develop a gold-standard-level morphological analyzer and generator that would surpass a trained neural network model in prediction accuracy, although doing so may require significant amounts of human labor. In this paper, we discuss an experiment in which a group of people with some linguistic training developed 25+ grammars as part of the shared task, and we weigh the cost/benefit ratio of developing grammars by hand. We also present tools that can help linguists triage difficult, complex morphophonological phenomena within a language and hypothesize inflectional class membership. We conclude that a significant development effort by trained linguists to analyze and model morphophonological patterns is required in order to surpass the accuracy of neural models.