Adding a Syntactic Annotation Level to the Corpus of Contemporary Romanian Language
Andrei Scutelnicu, Catalina Maranduc, Dan Cristea
Abstract
In this paper we present an experiment in augmenting the Corpus of Contemporary Romanian Language (CoRoLa) with a syntactic annotation level, which would allow users to query the syntax of Romanian sentences in the Universal Dependencies model. After a short introduction to CoRoLa, we describe the treebanks used to train the dependency parser, report the evaluation results, and outline the process of upgrading CoRoLa with the new level of annotations. Of three parser variants trained on manually built treebanks, we chose the one with the best accuracy in recognizing heads and relations.
Keywords: Syntactic annotation, treebank, corpus, maltparser
- Anthology ID:
- 2020.cmlc-1.9
- Volume:
- Proceedings of the 8th Workshop on Challenges in the Management of Large Corpora
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Venues:
- CMLC | LREC | WS
- SIG:
- Publisher:
- European Language Resources Association
- Note:
- Pages:
- 58–62
- URL:
- https://www.aclweb.org/anthology/2020.cmlc-1.9
- DOI:
- PDF:
- https://www.aclweb.org/anthology/2020.cmlc-1.9.pdf