Towards a Swedish Roget-Style Thesaurus for NLP

Niklas Zechner, Lars Borin


Abstract
Bring’s thesaurus (Bring) is a Swedish counterpart of Roget, and a digitized version of it could be a valuable language resource for a wide range of natural language processing (NLP) applications. From the literature we know that Roget-style thesauruses and wordnets have complementary strengths in this context, so both kinds of lexical-semantic resource are good to have. However, Bring was published in 1930, and its lexical items are in the form of lemma–POS pairings. To be useful in our NLP systems, polysemous lexical items need to be disambiguated, and a large amount of modern vocabulary must be added in the proper places in Bring. The work presented here describes experiments aimed at automating these two tasks, at least in part, using the structure of an existing Swedish semantic lexicon, Saldo, both to disambiguate ambiguous Bring entries and to add new entries to Bring.
Anthology ID: 2020.globalex-1.9
Volume: Proceedings of the 2020 Globalex Workshop on Linked Lexicography
Month: May
Year: 2020
Address: Marseille, France
Venues: GLOBALEX | LREC | WS
Publisher: European Language Resources Association
Pages: 53–60
URL: https://www.aclweb.org/anthology/2020.globalex-1.9
PDF: https://www.aclweb.org/anthology/2020.globalex-1.9.pdf
