Douglas W. Oard

Also published as: Douglas Oard

2020

Text segmentation aims to uncover latent structure by dividing text from a document into coherent sections. Where previous work on text segmentation considers the tasks of document segmentation and segment labeling separately, we show that the tasks contain complementary information and are best addressed jointly. We introduce Segment Pooling LSTM (S-LSTM), which is capable of jointly segmenting a document and labeling segments. In support of joint training, we develop a method for teaching the model to recover from errors by aligning the predicted and ground truth segments. We show that S-LSTM reduces segmentation error by 30% on average, while also improving segment labeling.

A Prioritization Model for Suicidality Risk Assessment
Han-Chin Shing | Philip Resnik | Douglas Oard
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We reframe suicide risk assessment from social media as a ranking problem whose goal is maximizing detection of severely at-risk individuals given the time available. Building on measures developed for resource-bounded document retrieval, we introduce a well-founded evaluation paradigm and demonstrate, using an expert-annotated test collection, that meaningful improvements over plausible cascade model baselines can be achieved using an approach that jointly ranks individuals and their social media posts.

Proceedings of the workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS2020)
Kathy McKeown | Douglas W. Oard | Elizabeth | Richard Schwartz
Proceedings of the workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS2020)

At about the midpoint of the IARPA MATERIAL program in October 2019, an evaluation was conducted on systems’ abilities to find Lithuanian documents based on English queries. Subsequently, both the Lithuanian test collection and results from all three teams were made available for detailed analysis. This paper capitalizes on that opportunity to begin to look at what’s working well at this stage of the program, and to identify some promising directions for future work.

Venues

ACL (2)
CLSSTS (2)