Margot Mieskes
2020
Reviewing Natural Language Processing Research
Kevin Cohen | Karën Fort | Margot Mieskes | Aurélie Névéol
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
This tutorial will cover the theory and practice of reviewing research in natural language processing. Heavy reviewing burdens on natural language processing researchers have made it clear that our community needs to increase the size of our pool of potential reviewers. Simultaneously, notable “false negatives” (rejection by our conferences of work that was later shown to be tremendously important after acceptance by other conferences) have raised awareness of the fact that our reviewing practices leave something to be desired. We do not often talk about “false positives” with respect to conference papers, but leaders in the field have noted that we seem to have a publication bias towards papers that report high performance, with perhaps not much else of interest in them. It need not be this way. Reviewing is a learnable skill, and you will learn it here via lectures and a considerable amount of hands-on practice.
Language Agnostic Automatic Summarization Evaluation
Christopher Tauchmann | Margot Mieskes
Proceedings of The 12th Language Resources and Evaluation Conference
So far, work on automatic summarization has dealt primarily with English data. Accordingly, evaluation methods were primarily developed with this language in mind. In our work, we present experiments on adapting available evaluation methods such as ROUGE and PYRAMID to non-English data. We base our experiments on various English and non-English homogeneous benchmark data sets as well as a non-English heterogeneous data set. Our results indicate that ROUGE can indeed be adapted to non-English data, both homogeneous and heterogeneous. Using a recent implementation of automatic PYRAMID evaluation, we also show its adaptability to non-English data.
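The intuition behind adapting ROUGE is that its core, n-gram overlap between a candidate summary and a reference, is language-independent once tokenization is handled; the language-specific work sits in preprocessing. Below is a minimal sketch of ROUGE-1/ROUGE-N over pre-tokenized input, not the paper's implementation; the function name, the German sentences, and the whitespace tokenization are illustrative assumptions.

```python
from collections import Counter

def rouge_n(candidate_tokens, reference_tokens, n=1):
    """Compute ROUGE-N (precision, recall, F1) from pre-tokenized text.

    Operating on token lists keeps the metric language-agnostic: any
    language-specific tokenization or stemming happens upstream.
    """
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate_tokens), ngrams(reference_tokens)
    if not cand or not ref:
        return 0.0, 0.0, 0.0
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

# German example; whitespace splitting stands in for a proper
# language-specific tokenizer (e.g. from spaCy or NLTK).
candidate = "die katze sitzt auf der matte".split()
reference = "die katze liegt auf der matte".split()
print(rouge_n(candidate, reference, n=1))  # (0.833..., 0.833..., 0.833...)
```

In practice one would swap in a tokenizer and, where available, a stemmer for the target language; the scoring logic above stays unchanged.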
A Data Set for the Analysis of Text Quality Dimensions in Summarization Evaluation
Margot Mieskes | Eneldo Loza Mencía | Tim Kronsbein
Proceedings of The 12th Language Resources and Evaluation Conference
Automatic evaluation of summarization focuses on developing a metric to represent the quality of the resulting text. However, text quality is represented in a variety of dimensions ranging from grammaticality to readability and coherence. In our work, we analyze the dependencies between a variety of quality dimensions on automatically created multi-document summaries and which dimensions automatic evaluation metrics such as ROUGE, PEAK or JSD are able to capture. Our results indicate that variants of ROUGE are correlated with various quality dimensions and that some automatic summarization methods achieve higher-quality summaries than others with respect to individual summary quality dimensions. Our results also indicate that differentiating between quality dimensions facilitates inspection and fine-grained comparison of summarization methods and their characteristics. We make the data from our two summarization quality evaluation experiments publicly available in order to facilitate the future development of specialized automatic evaluation methods.
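The correlation analysis described here can be sketched as pairing each summary's automatic metric score with its human rating on one quality dimension and computing a correlation coefficient. The sketch below uses Pearson's r as one plausible choice (the abstract does not specify the measure used), and the scores and ratings are invented for illustration; real data would come from the released data set.

```python
from scipy.stats import pearsonr

# Hypothetical paired observations: one automatic metric score and one
# human rating per summary (values are made up for illustration).
rouge_scores      = [0.41, 0.35, 0.52, 0.28, 0.47, 0.39]  # e.g. ROUGE-2 F1
coherence_ratings = [3.5, 3.0, 4.5, 2.0, 4.0, 3.0]        # e.g. 1-5 Likert scale

r, p = pearsonr(rouge_scores, coherence_ratings)
print(f"Pearson r = {r:.3f} (p = {p:.3f})")
```

For ordinal human ratings, Spearman's rank correlation (scipy.stats.spearmanr) is often preferred, since it does not assume a linear relationship between metric scores and ratings.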