Amrith Krishna

2020

SHR++: An Interface for Morpho-syntactic Annotation of Sanskrit Corpora
Amrith Krishna | Shiv Vidhyut | Dilpreet Chawla | Sruti Sambhavi | Pawan Goyal
Proceedings of The 12th Language Resources and Evaluation Conference

We propose a web-based annotation framework, SHR++, for morpho-syntactic annotation of corpora in Sanskrit. SHR++ is designed to generate annotations for the word segmentation, morphological parsing and dependency analysis tasks in Sanskrit. It incorporates analyses and predictions from various tools designed for processing texts in Sanskrit, and utilises them to ease the cognitive load of the human annotators. Specifically, SHR++ uses the Sanskrit Heritage Reader, a lexicon-driven shallow parser, to enumerate all the phonetically and lexically valid word splits, along with their morphological analyses, for a given string. This lets the annotators choose among candidate solutions rather than perform the segmentations themselves. Further, predictions from a word segmentation tool are added as suggestions that can aid the human annotators in their decision making. Our evaluation shows that enabling this segmentation suggestion component reduces the annotation time by 20.15%. SHR++ can be accessed online at http://vidhyut97.pythonanywhere.com/ and the codebase, for the independent deployment of the system elsewhere, is hosted at https://github.com/iamdsc/smart-sanskrit-annotator.

Evaluating Neural Morphological Taggers for Sanskrit
Ashim Gupta | Amrith Krishna | Pawan Goyal | Oliver Hellwig
Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

Neural sequence labelling approaches have achieved state-of-the-art results in morphological tagging. We evaluate the efficacy of four standard sequence labelling models on Sanskrit, a morphologically rich, fusional Indian language. As its label space can theoretically contain more than 40,000 labels, systems that explicitly model the internal structure of a label are better suited to the task, because of their ability to generalise to labels not seen during training. We find that although some neural models perform better than others, a common cause of error for all of these models is misprediction due to syncretism.

Co-authors

Oliver Hellwig 1

Venues

LREC 1
SIGMORPHON 1