<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="nallani-etal-2020-fully">
    <titleInfo>
        <title>A Fully Expanded Dependency Treebank for Telugu</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Sneha</namePart>
        <namePart type="family">Nallani</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Manish</namePart>
        <namePart type="family">Shrivastava</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Dipti</namePart>
        <namePart type="family">Sharma</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2020-may</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <language>
        <languageTerm type="text">English</languageTerm>
        <languageTerm type="code" authority="iso639-2b">eng</languageTerm>
    </language>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the WILDRE5 – 5th Workshop on Indian Language Data: Resources and Evaluation</title>
        </titleInfo>
        <originInfo>
            <publisher>European Language Resources Association (ELRA)</publisher>
            <place>
                <placeTerm type="text">Marseille, France</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
        <identifier type="isbn">979-10-95546-67-2</identifier>
    </relatedItem>
    <abstract>Treebanks are an essential resource for syntactic parsing. The available Paninian dependency treebank(s) for Telugu are annotated only with inter-chunk dependency relations, so not all words of a sentence are part of the parse tree. In this paper, we automatically annotate the intra-chunk dependencies in the treebank using a shift-reduce parser based on context-free grammar rules for Telugu chunks. We also propose a few additional intra-chunk dependency relations for Telugu beyond those used in the Hindi treebank. Annotating intra-chunk dependencies provides a complete parse tree for every sentence in the treebank. A fully expanded treebank is crucial for developing end-to-end parsers that produce complete trees. We present a fully expanded dependency treebank for Telugu consisting of 3220 sentences. We also convert the treebank from the AnnCorra part-of-speech tagset to the latest BIS tagset, a hierarchical tagset adopted as a unified part-of-speech standard across all Indian languages. The final treebank is made publicly available.</abstract>
    <identifier type="citekey">nallani-etal-2020-fully</identifier>
    <location>
        <url>https://www.aclweb.org/anthology/2020.wildre-1.8</url>
    </location>
    <part>
        <date>2020-may</date>
        <extent unit="page">
            <start>39</start>
            <end>44</end>
        </extent>
    </part>
</mods>
</modsCollection>
