Jakub Waszczuk


2020

Contemplata, a Free Platform for Constituency Treebank Annotation
Jakub Waszczuk | Ilaine Wang | Jean-Yves Antoine | Anaïs Halftermeyer
Proceedings of The 12th Language Resources and Evaluation Conference

This paper describes Contemplata, an annotation platform that offers a generic solution for treebank building as well as treebank enrichment with relations between syntactic nodes. Contemplata is dedicated to the annotation of constituency trees. The framework includes support for syntactic parsers, which provide automatic annotations to be manually revised. This balance between automatic parsing and manual revision reduces the annotator workload, which favours data reliability. The paper presents the software architecture of Contemplata, describes its practical use, and finally gives two examples of annotation projects that were conducted on the platform.

Supervised Disambiguation of German Verbal Idioms with a BiLSTM Architecture
Rafael Ehren | Timm Lichte | Laura Kallmeyer | Jakub Waszczuk
Proceedings of the Second Workshop on Figurative Language Processing

Supervised disambiguation of verbal idioms (VID) poses special demands on the quality and quantity of the annotated data used for learning and evaluation. In this paper, we present a new VID corpus for German and perform a series of VID disambiguation experiments on it. Our best classifier, based on a neural architecture, yields an error reduction across VIDs of 57% in terms of accuracy compared to a simple majority baseline.