Kyunghyun Cho
2020
Don’t Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training
Margaret Li | Stephen Roller | Ilia Kulikov | Sean Welleck | Y-Lan Boureau | Kyunghyun Cho | Jason Weston
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Generative dialogue models currently suffer from a number of problems which standard maximum likelihood training does not address. They tend to produce generations that (i) rely too much on copying from the context, (ii) contain repetitions within utterances, (iii) overuse frequent words, and (iv) at a deeper level, contain logical flaws. In this work we show how all of these problems can be addressed by extending the recently introduced unlikelihood loss (Welleck et al., 2019) to these cases. We show that appropriate loss functions which regularize generated outputs to match human distributions are effective for the first three issues. For the last, more general issue, we show that applying unlikelihood to collected data of what a model should not do is effective for improving logical consistency, potentially paving the way to generative models with greater reasoning ability. We demonstrate the efficacy of our approach across several dialogue tasks.
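The loss being extended here has a simple form. Below is a minimal PyTorch sketch of the token-level unlikelihood term from Welleck et al. (2019); the function name, tensor shapes, and the choice of negative candidates are illustrative, not the paper's exact implementation:

```python
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, negative_candidates, eps=1e-6):
    """Single-step token-level unlikelihood term (after Welleck et al., 2019).

    logits: (batch, vocab) next-token logits at one decoding step.
    negative_candidates: (batch, k) ids of tokens the model should avoid,
    e.g. tokens copied from the context or already repeated in the utterance.
    """
    probs = F.softmax(logits, dim=-1)
    # Probability currently assigned to each negative candidate token.
    neg_probs = probs.gather(dim=1, index=negative_candidates)
    # Push that probability down: -log(1 - p(c | context)).
    return -torch.log((1.0 - neg_probs).clamp(min=eps)).sum(dim=-1).mean()
```

In training, a term like this is added to the usual negative log-likelihood with a mixing weight; what specializes it to each of the four problems listed in the abstract is the choice of negative candidates (copied context tokens, repeated n-grams, over-frequent words, or tokens from utterances labeled as inconsistent).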
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries
Alex Wang | Kyunghyun Cho | Mike Lewis
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Practical applications of abstractive summarization models are limited by frequent factual inconsistencies with respect to their input. Existing automatic evaluation metrics for summarization are largely insensitive to such errors. We propose QAGS (pronounced “kags”), an automatic evaluation protocol that is designed to identify factual inconsistencies in a generated summary. QAGS is based on the intuition that if we ask questions about a summary and its source, we will receive similar answers if the summary is factually consistent with the source. To evaluate QAGS, we collect human judgments of factual consistency on model-generated summaries for the CNN/DailyMail (Hermann et al., 2015) and XSUM (Narayan et al., 2018) summarization datasets. QAGS has substantially higher correlations with these judgments than other automatic evaluation metrics. Also, QAGS offers a natural form of interpretability: The answers and questions generated while computing QAGS indicate which tokens of a summary are inconsistent and why. We believe QAGS is a promising tool in automatically generating usable and factually consistent text. Code for QAGS will be available at https://github.com/W4ngatang/qags.
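The protocol is easy to picture in code. The sketch below assumes the questions have already been generated from the summary (QAGS uses a learned question-generation model for that step) and uses an off-the-shelf extractive QA pipeline with token-overlap F1 as the agreement measure; these are illustrative stand-ins, not the paper's exact models:

```python
from collections import Counter
from transformers import pipeline

def token_f1(a, b):
    """SQuAD-style token-overlap F1 between two answer strings."""
    a_tok, b_tok = a.lower().split(), b.lower().split()
    common = Counter(a_tok) & Counter(b_tok)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(a_tok), overlap / len(b_tok)
    return 2 * precision * recall / (precision + recall)

def qags_score(questions, summary, source, qa=None):
    """Average agreement between answers found in the summary and in the
    source document, over a set of questions asked about the summary."""
    qa = qa or pipeline("question-answering")
    scores = []
    for q in questions:
        ans_summary = qa(question=q, context=summary)["answer"]
        ans_source = qa(question=q, context=source)["answer"]
        scores.append(token_f1(ans_summary, ans_source))
    return sum(scores) / len(scores) if scores else 0.0
```

A low per-question F1 flags a summary span whose answer cannot be recovered from the source, which is exactly the interpretability the abstract describes.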
Compositionality and Capacity in Emergent Languages
Abhinav Gupta | Cinjon Resnick | Jakob Foerster | Andrew Dai | Kyunghyun Cho
Proceedings of the 5th Workshop on Representation Learning for NLP
Recent works have discussed the extent to which emergent languages can exhibit properties of natural languages, particularly learned compositionality. In this paper, we investigate the learning biases that affect the efficacy and compositionality of multi-agent communication, in addition to the communicative bandwidth. Our foremost contribution is to explore how the capacity of a neural network impacts its ability to learn a compositional language. We additionally introduce a set of evaluation metrics with which we analyze the learned languages. Our hypothesis is that there should be a specific range of model capacity and channel bandwidth that induces compositional structure in the resulting language and consequently encourages systematic generalization. While we empirically see evidence for the bottom of this range, we curiously do not find evidence for the top part of the range and believe that this is an open question for the community.
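The abstract does not spell out its metrics, but a standard yardstick for compositionality in this literature is topographic similarity (Brighton and Kirby, 2006): the correlation between pairwise distances in meaning space and in message space. A small sketch, not necessarily one of the paper's metrics, assuming fixed-length discrete meanings and messages:

```python
from itertools import combinations
from scipy.stats import spearmanr

def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def topographic_similarity(meanings, messages):
    """Spearman correlation between pairwise meaning distances and
    pairwise message distances; higher values mean similar meanings
    get similar messages, a signature of compositional structure."""
    pairs = list(combinations(range(len(meanings)), 2))
    meaning_d = [hamming(meanings[i], meanings[j]) for i, j in pairs]
    message_d = [hamming(messages[i], messages[j]) for i, j in pairs]
    return spearmanr(meaning_d, message_d).correlation
```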
Co-authors
- Margaret Li 1
- Stephen Roller 1
- Ilia Kulikov 1
- Sean Welleck 1
- Y-Lan Boureau 1
- show all...