Annohub – Annotation Metadata for Linked Data Applications
Frank Abromeit, Christian Fäth, Luis Glaser
Abstract
We introduce a new dataset for the Linguistic Linked Open Data (LLOD) cloud that will provide metadata about annotation and language information harvested from annotated language resources, such as corpora, that are freely available on the internet. To our knowledge, annotation metadata is not yet provided by any metadata provider, such as linghub, datahub or CLARIN. Moreover, the language metadata found on such portals is rarely provided in machine-readable form, especially as Linked Data. In this paper, we describe the harvesting process, the content and structure of the new dataset, and its application in the Lin|gu|is|tik portal, a research platform for linguists. In addition, we introduce tools for converting XML-encoded language resources to the CoNLL format. The generated RDF data and the XML converter application are made public under an open license.
- Anthology ID: 2020.ldl-1.6
- Volume: Proceedings of the 7th Workshop on Linked Data in Linguistics (LDL-2020)
- Month: May
- Year: 2020
- Address: Marseille, France
- Venues: LDL | LREC | WS
- Publisher: European Language Resources Association
- Pages: 36–44
- URL: https://www.aclweb.org/anthology/2020.ldl-1.6
- PDF: https://www.aclweb.org/anthology/2020.ldl-1.6.pdf