William Schuler


2020

A Corpus of Encyclopedia Articles with Logical Forms
Nathan Rasmussen | William Schuler
Proceedings of The 12th Language Resources and Evaluation Conference

People can extract precise, complex logical meanings from text in documents such as tax forms and game rules, but language processing systems lack adequate training and evaluation resources to do these kinds of tasks reliably. This paper describes a corpus of annotated typed lambda calculus translations for approximately 2,000 sentences in Simple English Wikipedia, which is assumed to constitute a broad-coverage domain for precise, complex descriptions. The corpus described in this paper contains a large number of quantifiers and interesting scoping configurations, and is presented specifically as a resource for quantifier scope disambiguation systems, but also more generally as an object of linguistic study.

Memory-bounded Neural Incremental Parsing for Psycholinguistic Prediction
Lifeng Jin | William Schuler
Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies

Syntactic surprisal has been shown to have an effect on human sentence processing, and can be predicted from prefix probabilities of generative incremental parsers. Recent state-of-the-art incremental generative neural parsers are able to produce accurate parses and surprisal values but have unbounded stack memory, which may be used by the neural parser to maintain explicit in-order representations of all previously parsed words, inconsistent with results of human memory experiments. In contrast, humans seem to have a bounded working memory, demonstrated by inhibited performance on word recall in multi-clause sentences (Bransford and Franks, 1971), and on center-embedded sentences (Miller and Isard, 1964). Bounded statistical parsers exist, but are less accurate than neural parsers in predicting reading times. This paper describes a neural incremental generative parser that is able to provide accurate surprisal estimates and can be constrained to use a bounded stack. Results show that the accuracy gains of neural parsers can be reliably extended to psycholinguistic modeling without risk of distortion due to unbounded working memory.
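
As a rough illustration of how surprisal relates to prefix probabilities, the sketch below computes per-word surprisal as the negative log ratio of consecutive prefix probabilities. It is not the parser described in the paper: the function name `surprisals` and the example probabilities are hypothetical, and the empty prefix is assumed to have probability 1.

```python
import math


def surprisals(prefix_probs):
    """Return per-word surprisal in bits.

    prefix_probs[t] is the probability a (hypothetical) incremental
    generative parser assigns to the first t+1 words of a sentence.
    The empty prefix is assumed to have probability 1.
    """
    result = []
    previous = 1.0
    for current in prefix_probs:
        # surprisal(w_t) = -log2( P(w_1..w_t) / P(w_1..w_{t-1}) )
        result.append(-math.log2(current / previous))
        previous = current
    return result


# Made-up prefix probabilities for a three-word sentence.
print(surprisals([0.5, 0.125, 0.0625]))  # prints [1.0, 2.0, 1.0]
```

A word that sharply reduces the prefix probability (here the second word, which cuts it by a factor of four) receives a higher surprisal than one that reduces it only slightly.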

The Importance of Category Labels in Grammar Induction with Child-directed Utterances
Lifeng Jin | William Schuler
Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies

Recent progress in grammar induction has shown that grammar induction is possible without explicit assumptions of language-specific knowledge. However, evaluation of induced grammars has usually ignored phrasal labels, an essential part of a grammar. Experiments in this work using a labeled evaluation metric, RH, show that linguistically motivated predictions about grammar sparsity and use of categories can only be revealed through labeled evaluation. Furthermore, depth-bounding as an implementation of human memory constraints in grammar inducers is still effective with labeled evaluation on multilingual transcribed child-directed utterances.