Agnieszka Mykowiecka
2020
Supporting terminology extraction with dependency parses
Malgorzata Marciniak | Piotr Rychlik | Agnieszka Mykowiecka
Proceedings of the 6th International Workshop on Computational Terminology
A terminology extraction procedure usually consists of selecting term candidates and ordering them according to their importance for the given text or set of texts. Depending on the method used, the list of candidates contains different proportions of grammatically incorrect, semantically odd, and irrelevant sequences. The aim of this work was to improve term candidate selection by using a dependency parser for Polish to reduce the number of incorrect sequences.
Are White Ravens Ever White? - Non-Literal Adjective-Noun Phrases in Polish
Agnieszka Mykowiecka | Malgorzata Marciniak
Proceedings of The 12th Language Resources and Evaluation Conference
In the paper we describe two Polish data resources focused on the literal and metaphorical meanings of adjective-noun phrases. The first, FigAN, consists of isolated phrases divided into three types: phrases with only a literal meaning, phrases with only a metaphorical meaning, and phrases that can be interpreted as literal or metaphorical depending on the context of use. The second is the FigSen corpus, which consists of 1833 short text fragments, each containing at least one phrase from FigAN that may have both meanings. The corpus is annotated in two ways. In the first approach, all adjective-noun phrases are annotated. In the second approach, literal or metaphorical senses are assigned to all adjectives and nouns in the data. The paper presents statistics on the data and compares the two types of annotation. The corpora were used in experiments on the automatic recognition of Polish non-literal adjective-noun phrases.