2020
Unsupervised Paraphasia Classification in Aphasic Speech
Sharan Pai | Nikhil Sachdeva | Prince Sachdeva | Rajiv Ratn Shah
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Aphasia is a speech and language disorder that results from brain damage. It is often characterized by a word retrieval deficit (anomia) that leads to naming errors (paraphasia). Automatic paraphasia detection has many benefits for both the treatment and the diagnosis of aphasia and its type. However, supervised learning methods cannot be properly utilized because of the lack of aphasic speech data. In this paper, we describe our novel unsupervised method, which can be implemented without the need for labeled paraphasia data. Our evaluations show that our method outperforms previous work based on supervised learning and transfer learning approaches for English. We demonstrate the utility of our method as an essential first step in developing augmentative and alternative communication (AAC) devices for patients suffering from aphasia in any language.
An Annotated Dataset of Discourse Modes in Hindi Stories
Swapnil Dhanwal | Hritwik Dutta | Hitesh Nankani | Nilay Shrivastava | Yaman Kumar | Junyi Jessy Li | Debanjan Mahata | Rakesh Gosangi | Haimin Zhang | Rajiv Ratn Shah | Amanda Stent
Proceedings of The 12th Language Resources and Evaluation Conference
In this paper, we present a new corpus consisting of sentences from Hindi short stories annotated for five different discourse modes: argumentative, narrative, descriptive, dialogic, and informative. We present a detailed account of the entire data collection and annotation processes. The annotations have a very high inter-annotator agreement (0.87 k-alpha). We analyze the data in terms of label distributions, part-of-speech tags, and sentence lengths. We characterize the performance of various classification algorithms on this dataset and perform ablation studies to understand the nature of the linguistic models suitable for capturing the nuances of the embedded discourse structures in the presented corpus.
Semi-Supervised Iterative Approach for Domain-Specific Complaint Detection in Social Media
Akash Gautam | Debanjan Mahata | Rakesh Gosangi | Rajiv Ratn Shah
Proceedings of The 3rd Workshop on e-Commerce and NLP
In this paper, we present a semi-supervised bootstrapping approach to detect product- or service-related complaints in social media. Our approach begins with a small collection of annotated samples, which are used to identify a preliminary set of linguistic indicators pertinent to complaints. These indicators are then used to expand the dataset. The expanded dataset is in turn used to extract more indicators. This process is repeated until no new indicators can be found. We evaluated this approach on a Twitter corpus, specifically to detect complaints about transportation services. We started with an annotated set of 326 samples of transportation complaints, and after four iterations of the approach, we collected 2,840 indicators and over 3,700 tweets. We annotated a random sample of 700 tweets from the final dataset and observed that nearly half the samples were actual transportation complaints. Lastly, we also studied how different features based on semantics, orthographic properties, and sentiment contribute to the prediction of complaints.
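The iterative loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how indicators are extracted, so the sketch assumes an indicator is simply a lowercase token that appears in at least `min_count` texts of the current dataset. The names `extract_indicators`, `bootstrap`, `min_count`, and `max_iters` are hypothetical.

```python
from collections import Counter


def extract_indicators(texts, min_count=2):
    """Return tokens that occur in at least `min_count` texts.

    Assumption: an "indicator" is a frequent lowercase token. The abstract
    does not specify the actual extraction criterion.
    """
    counts = Counter()
    for text in texts:
        # Count each token once per text (document frequency).
        counts.update(set(text.lower().split()))
    return {token for token, count in counts.items() if count >= min_count}


def bootstrap(seed_texts, pool, min_count=2, max_iters=10):
    """Alternate between extracting indicators and expanding the dataset.

    Stops when an iteration yields no new indicators or after `max_iters`.
    """
    dataset = list(seed_texts)
    remaining = list(pool)
    indicators = set()
    for _ in range(max_iters):
        new_indicators = extract_indicators(dataset, min_count) - indicators
        if not new_indicators:
            break  # no new indicators: the process has converged
        indicators |= new_indicators
        # Move every unlabeled text containing any indicator into the dataset.
        matched, unmatched = [], []
        for text in remaining:
            tokens = set(text.lower().split())
            (matched if tokens & indicators else unmatched).append(text)
        dataset.extend(matched)
        remaining = unmatched
    return indicators, dataset
```

Because any text containing a single indicator is pulled in, the expanded dataset is expected to be noisy, which is consistent with the abstract's observation that only about half of the sampled tweets were actual complaints. A real system would likely filter out stopwords and use a stricter matching criterion.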