Daniela Gerz
2020
Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations
Samuel Coope | Tyler Farghly | Daniela Gerz | Ivan Vulić | Matthew Henderson
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
We introduce Span-ConveRT, a lightweight model for dialog slot-filling that frames the task as turn-based span extraction. This formulation allows for a simple integration of conversational knowledge encoded in large pretrained conversational models such as ConveRT (Henderson et al., 2019). We show that leveraging such knowledge in Span-ConveRT is especially useful in few-shot learning scenarios: we report consistent gains over 1) a span extractor that trains representations from scratch in the target domain and 2) a BERT-based span extractor. To inspire more work on span extraction for the slot-filling task, we also release RESTAURANTS-8K, a new challenging dataset of 8,198 utterances compiled from actual conversations in the restaurant booking domain.
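A minimal sketch of the span-extraction framing, assuming a frozen pretrained encoder that maps a dialog turn to per-token vectors (stood in for here by a random tensor). The head, names, and dimensions are illustrative, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class SpanHead(nn.Module):
    """Scores each token as the start or end of a slot value (e.g. 'time')."""
    def __init__(self, hidden_dim: int = 512):
        super().__init__()
        self.start_scorer = nn.Linear(hidden_dim, 1)
        self.end_scorer = nn.Linear(hidden_dim, 1)

    def forward(self, token_reprs: torch.Tensor):
        # token_reprs: (batch, seq_len, hidden_dim) from the frozen encoder.
        start_logits = self.start_scorer(token_reprs).squeeze(-1)
        end_logits = self.end_scorer(token_reprs).squeeze(-1)
        return start_logits, end_logits

# Dummy encoder output; a real setup would feed ConveRT representations here.
reprs = torch.randn(2, 12, 512)           # 2 turns, 12 tokens each
head = SpanHead()
start_logits, end_logits = head(reprs)
span = (start_logits.argmax(-1), end_logits.argmax(-1))  # predicted span per turn
```

Only the small head is trained on the target domain, which is what lets the pretrained representations pay off in few-shot setups.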
Multidirectional Associative Optimization of Function-Specific Word Representations
Daniela Gerz | Ivan Vulić | Marek Rei | Roi Reichart | Anna Korhonen
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
We present a neural framework for learning associations between interrelated groups of words, such as those found in Subject-Verb-Object (SVO) structures. Our model induces a joint function-specific word vector space in which, for example, vectors of plausible SVO compositions lie close together. The model retains information about word group membership even in the joint space and can therefore be applied effectively to a number of tasks reasoning over the SVO structure. We show the robustness and versatility of the proposed framework by reporting state-of-the-art results on the tasks of estimating selectional preference and event similarity. The results indicate that combinations of representations learned with our task-independent model outperform task-specific architectures from prior work, while reducing the number of parameters by up to 95%.
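As an illustration of the general idea (my own sketch, not the paper's model), one can picture role-specific embedding tables mapped into a single joint space, with a triple's plausibility scored by pairwise similarity between its role vectors; training would push scores of attested SVO triples above those of corrupted ones. Vocabulary size, dimensionality, and the scoring function are all assumptions:

```python
import torch
import torch.nn as nn

class SVOSpace(nn.Module):
    """Joint function-specific space with separate subject/verb/object tables."""
    def __init__(self, vocab_size: int, dim: int = 50):
        super().__init__()
        self.subj = nn.Embedding(vocab_size, dim)
        self.verb = nn.Embedding(vocab_size, dim)
        self.obj = nn.Embedding(vocab_size, dim)

    def score(self, s, v, o):
        # Plausibility as summed pairwise similarity in the joint space.
        es, ev, eo = self.subj(s), self.verb(v), self.obj(o)
        return (es * ev).sum(-1) + (ev * eo).sum(-1) + (es * eo).sum(-1)

model = SVOSpace(vocab_size=10_000)
s, v, o = torch.tensor([17]), torch.tensor([42]), torch.tensor([256])
print(model.score(s, v, o))  # unnormalized plausibility of one SVO triple
```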
Efficient Intent Detection with Dual Sentence Encoders
Iñigo Casanueva | Tadas Temčinas | Daniela Gerz | Matthew Henderson | Ivan Vulić
Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI
Building conversational systems in new domains and with added functionality requires resource-efficient models that work under low-data regimes (i.e., in few-shot setups). Motivated by these requirements, we introduce intent detection methods backed by pretrained dual sentence encoders such as USE and ConveRT. We demonstrate the usefulness and wide applicability of the proposed intent detectors, showing that: 1) they outperform intent detectors based on fine-tuning the full BERT-Large model or using BERT as a fixed black-box encoder on three diverse intent detection data sets; 2) the gains are especially pronounced in few-shot setups (i.e., with only 10 or 30 annotated examples per intent); 3) our intent detectors can be trained in a matter of minutes on a single CPU; and 4) they are stable across different hyperparameter settings. In the hope of facilitating and democratizing research focused on intent detection, we release our code, as well as a new challenging single-domain intent detection dataset comprising 13,083 annotated examples over 77 intents.
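The recipe the abstract describes, a fixed pretrained sentence encoder feeding a small trainable classifier, can be sketched as below. The encoder is faked with random vectors; in practice each utterance embedding would come from USE or ConveRT, and only the classifier is trained, which is why a single CPU suffices. Sizes and hyperparameters are illustrative:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
EMB_DIM, N_INTENTS, SHOTS = 512, 77, 10   # e.g. 10 annotated examples per intent

# Stand-in for frozen USE/ConveRT sentence embeddings of the training utterances.
X_train = rng.normal(size=(N_INTENTS * SHOTS, EMB_DIM))
y_train = np.repeat(np.arange(N_INTENTS), SHOTS)

# Only this lightweight classifier is trained; the encoder stays fixed.
clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=200)
clf.fit(X_train, y_train)

X_test = rng.normal(size=(5, EMB_DIM))    # embeddings of new utterances
print(clf.predict(X_test))                # predicted intent ids
```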
Co-authors
- Ivan Vulić 3
- Matthew Henderson 2
- Samuel Coope 1
- Tyler Farghly 1
- Marek Rei 1
- Roi Reichart 1
- Anna Korhonen 1
- Iñigo Casanueva 1
- Tadas Temčinas 1