Clare Voss


2020

GAIA: A Fine-grained Multimedia Knowledge Extraction System
Manling Li | Alireza Zareian | Ying Lin | Xiaoman Pan | Spencer Whitehead | Brian Chen | Bo Wu | Heng Ji | Shih-Fu Chang | Clare Voss | Daniel Napierski | Marjorie Freedman
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present the first comprehensive, open source multimedia knowledge extraction system that takes a massive stream of unstructured, heterogeneous multimedia data from various sources and languages as input, and creates a coherent, structured knowledge base, indexing entities, relations, and events, following a rich, fine-grained ontology. Our system, GAIA, enables seamless search of complex graph queries, and retrieves multimedia evidence including text, images and videos. GAIA achieves top performance at the recent NIST TAC SM-KBP2019 evaluation. The system is publicly available at GitHub and DockerHub, with a narrated video that documents the system.

Dialogue-AMR: Abstract Meaning Representation for Dialogue
Claire Bonial | Lucia Donatelli | Mitchell Abrams | Stephanie M. Lukin | Stephen Tratz | Matthew Marge | Ron Artstein | David Traum | Clare Voss
Proceedings of The 12th Language Resources and Evaluation Conference

This paper describes a schema that enriches Abstract Meaning Representation (AMR) in order to provide a semantic representation for facilitating Natural Language Understanding (NLU) in dialogue systems. AMR offers a valuable level of abstraction of the propositional content of an utterance; however, it does not capture the illocutionary force or speaker’s intended contribution in the broader dialogue context (e.g., make a request or ask a question), nor does it capture tense or aspect. We explore dialogue in the domain of human-robot interaction, where a conversational robot is engaged in search and navigation tasks with a human partner. To address the limitations of standard AMR, we develop an inventory of speech acts suitable for our domain, and present “Dialogue-AMR”, an enhanced AMR that represents not only the content of an utterance, but the illocutionary force behind it, as well as tense and aspect. To showcase the coverage of the schema, we use both manual and automatic methods to construct the “DialAMR” corpus—a corpus of human-robot dialogue annotated with standard AMR and our enriched Dialogue-AMR schema. Our automated methods can be used to incorporate AMR into a larger NLU pipeline supporting human-robot dialogue.
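
As a rough illustration of the layering this abstract describes, an utterance's propositional content (standard AMR) can be paired with a speech act, tense, and aspect. The sketch below is not the paper's notation: the class name, field names, label values, and the AMR string are all hypothetical and are not taken from the DialAMR corpus.

```python
from dataclasses import dataclass


@dataclass
class DialogueAnnotation:
    """Hypothetical container pairing AMR content with dialogue-level labels."""
    speech_act: str  # illocutionary force, e.g. a command or a question
    content: str     # propositional content as an AMR string (PENMAN-style)
    tense: str       # not captured by standard AMR
    aspect: str      # not captured by standard AMR


def render(annotation: DialogueAnnotation) -> str:
    """Format the annotation on one line for inspection."""
    return (
        f"[{annotation.speech_act} | tense={annotation.tense} "
        f"| aspect={annotation.aspect}] {annotation.content}"
    )


# Illustrative example for an instruction such as "move forward".
# The AMR string and label values are made up for this sketch.
example = DialogueAnnotation(
    speech_act="command",
    content="(m / move-01 :ARG1 (r / robot) :direction (f / forward))",
    tense="future",
    aspect="unmarked",
)
print(render(example))
```

The point of the sketch is only that the speech act, tense, and aspect sit outside the propositional content, so a downstream NLU component could branch on them without parsing the AMR itself.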

Cross-lingual Structure Transfer for Zero-resource Event Extraction
Di Lu | Ananya Subburathinam | Heng Ji | Jonathan May | Shih-Fu Chang | Avi Sil | Clare Voss
Proceedings of The 12th Language Resources and Evaluation Conference

Most of the current cross-lingual transfer learning methods for Information Extraction (IE) have been only applied to name tagging. To tackle more complex tasks such as event extraction we need to transfer graph structures (event trigger linked to multiple arguments with various roles) across languages. We develop a novel share-and-transfer framework to reach this goal with three steps: (1) Convert each sentence in any language to language-universal graph structures; in this paper we explore two approaches based on universal dependency parses and complete graphs, respectively. (2) Represent each node in the graph structure with a cross-lingual word embedding so that all sentences in multiple languages can be represented with one shared semantic space. (3) Using this common semantic space, train event extractors from English training data and apply them to languages that do not have any event annotations. Experimental results on three languages (Spanish, Russian and Ukrainian) without any annotations show this framework achieves comparable performance to a state-of-the-art supervised model trained from more than 1,500 manually annotated event mentions.
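
The three steps in this abstract can be sketched on toy data. This is a minimal illustration under strong assumptions, not the paper's model: the two-dimensional embeddings, the neighbor-averaging node representation, the nearest-centroid classifier, and the event labels are all invented for the example. It shows only the core idea that a classifier trained on English in a shared embedding space can be applied to a language with no event annotations.

```python
import math

# Hypothetical toy cross-lingual embeddings: translations get nearby vectors.
EMBED = {
    "attack": (0.90, 0.10), "ataque": (0.88, 0.12),
    "meeting": (0.10, 0.90), "reunión": (0.12, 0.88),
    "soldiers": (0.80, 0.30), "soldados": (0.79, 0.31),
    "leaders": (0.20, 0.80), "líderes": (0.21, 0.79),
}


def node_vector(word, neighbors):
    """Represent a graph node as the mean of its own and its neighbors' embeddings."""
    vecs = [EMBED[word]] + [EMBED[n] for n in neighbors]
    return tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(2))


def train_centroids(examples):
    """Compute one centroid per label from (word, neighbors, label) triples."""
    sums, counts = {}, {}
    for word, neighbors, label in examples:
        vec = node_vector(word, neighbors)
        acc = sums.setdefault(label, [0.0, 0.0])
        acc[0] += vec[0]
        acc[1] += vec[1]
        counts[label] = counts.get(label, 0) + 1
    return {
        label: (acc[0] / counts[label], acc[1] / counts[label])
        for label, acc in sums.items()
    }


def predict(centroids, word, neighbors):
    """Assign the label whose centroid is closest in the shared space."""
    vec = node_vector(word, neighbors)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Train on English only; the neighbor lists stand in for graph structure.
english_train = [
    ("attack", ["soldiers"], "Conflict"),
    ("meeting", ["leaders"], "Contact"),
]
centroids = train_centroids(english_train)

# Apply to Spanish triggers that were never seen with labels.
print(predict(centroids, "ataque", ["soldados"]))
print(predict(centroids, "reunión", ["líderes"]))
```

The nearest-centroid classifier is a stand-in chosen for brevity; any model that consumes node vectors from the shared space would fit the same pattern.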