Background material on Data-Oriented Parsing


DOP: the basic idea

Remko Scha: "Taaltheorie en taaltechnologie; competence en performance." In: R. de Kort and G.L.J. Leerdam (eds.): Computertoepassingen in de Neerlandistiek. Almere: LVVN, 1990, pp. 7-22. [Translated into English as: "Language Theory and Language Technology; Competence and Performance."]

Remko Scha: "Virtuele Grammatica's en Creatieve Algoritmes." Gramma/TTT 1, 1 (1992), pp. 57-77. [Translated into English as: "Virtual Grammars and Creative Algorithms."]

Khalil Sima'an (2003). A Short Introduction to the DOP Model.

Khalil Sima'an: Lecture on Data-Oriented Parsing. (Khalil Sima'an and Detlef Prescher: Course on Probabilistic Parsing, ESSLLI 2003.)

DOP1

Rens Bod: Enriching Linguistics with Statistics: Performance Models of Natural Language. 1995. (Promotor: Remko Scha.) ILLC Dissertation Series 1995-14.

Rens Bod and Remko Scha: "Data-Oriented Language Processing. An Overview." Technical Report LP-96-13, Institute for Logic, Language and Computation, University of Amsterdam, 1997. cmp-lg/9611003.

Alternative estimators

Remko Bonnema, Paul Buying and Remko Scha: "A New Probability Model for Data Oriented Parsing." In: Paul Dekker (ed.): Proceedings of the Twelfth Amsterdam Colloquium, December 18-21, 1999, pp. 85-90.

K. Sima'an and L. Buratto: Backoff Parameter Estimation for the DOP Model. In N. Lavrac, D. Gamberger, H. Blockeel and L. Todorovski (eds.): Proceedings of the European Conference on Machine Learning (ECML'03), Lecture Notes in Artificial Intelligence (LNAI 2837), pages 373-384, Springer, 2003.

Computational Issues

Khalil Sima'an: Learning Efficient Disambiguation. March 31, 1999. Utrecht University, Utrecht Institute of Linguistics OTS. ILLC Dissertation Series 1999-02. (Promotors: Jan Landsbergen and Remko Scha.)

K. Sima'an: Computational Complexity of Probabilistic Disambiguation. NP-Completeness results for parsing problems that arise in speech and language processing applications. Grammars, Vol. 5 (2), Kluwer Publishers, 2002.

Integrating Semantics

Martin van den Berg, Rens Bod and Remko Scha: "A Corpus-Based Approach to Semantic Interpretation." In: P. Dekker and M. Stokhof (eds.): Proceedings of the Ninth Amsterdam Colloquium. ILLC, University of Amsterdam, 1994, pp. 141-160.

[Reprinted with minor additions as Chapter 8 ("Further Extensions of DOP: Semantics, Discourse, Recency") in: Rens Bod: Enriching Linguistics with Statistics: Performance Models of Natural Language (ILLC Dissertation Series 1995-14, University of Amsterdam, 1995)]

Remko Bonnema, Rens Bod and Remko Scha: "A DOP Model for Semantic Interpretation." Proceedings 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics (July 7-12, 1997, Madrid, Spain), pp. 159-167.

Tree-Adjoining Grammar

Lars Hoogweg: Extending DOP1 with the insertion operation. M.A. Thesis, Department of Computational Linguistics, University of Amsterdam, 2000.

Lexical-Functional Grammar

M. Hearne and K. Sima'an: Structured Parameter Estimation for LFG-DOP by Backoff. In Proceedings of International Conference on Recent Advances in Natural Language Processing (RANLP'03), Bulgaria, 2003.

R. Bod and R. Kaplan: A Data-Oriented Parsing Model for Lexical-Functional Grammar. Submitted for publication, comments are welcome.

Tree-grams

K. Sima'an: Tree-gram Parsing: Lexical Dependencies and Structural Relations. Proceedings of 38th Annual Meeting of the Association for Computational Linguistics (ACL'00), Hong Kong, 2000.

Anthology

Rens Bod, Remko Scha and Khalil Sima'an (eds.): Data-Oriented Parsing. Stanford: CSLI Publications, 2003. 410 pp.

Language Acquisition

I.M. Schlesinger: "Learning grammar: from pivot to realization rule." In: R. Huxley and E. Ingram (eds.): Studies in Language Acquisition: Models and Methods. New York: Academic Press, 1971, pp. 79-89.

I.M. Schlesinger: "Production of utterances and language acquisition." In: D.I. Slobin (ed.): The Ontogenesis of Grammar. New York: Academic Press, 1971, pp. 63-102.

Patrick Suppes: "Semantics of Context-Free Fragments of Natural Languages." In: K.J.J. Hintikka, J.M.E. Moravcsik and P. Suppes (eds.): Approaches to Natural Language. Dordrecht: Reidel, 1973, pp. 370-394. Reprinted in: Patrick Suppes: Language for Humans and Robots. Oxford, UK: Blackwell, 1991, pp. 167-190.

I.M. Schlesinger: "Grammatical Development – The First Steps." In: Eric H. Lenneberg and Elizabeth Lenneberg (eds.): Foundations of Language Development. A Multidisciplinary Approach. Vol. 1. New York: Academic Press, 1975, pp. 203-222.

H. Sinclair: "The Role of Cognitive Structures in Language Acquisition." In: Eric H. Lenneberg and Elizabeth Lenneberg (eds.): Foundations of Language Development. A Multidisciplinary Approach. Vol. 1. New York: Academic Press, 1975, pp. 223-238.

Patrick Suppes: "Syntax and Semantics of Children's Language." In: S.R. Harnad, H.D. Steklis and J. Lancaster (eds.): Origins and evolution of language and speech. Annals of the New York Academy of Sciences, 280 (1976), pp. 227-237. Reprinted in: Patrick Suppes: Language for Humans and Robots. Oxford, UK: Blackwell, 1991, pp. 119-132.

N. Chang and T. Maia: Learning Grammatical Constructions. Presented at the 2001 Meeting of the Cognitive Science Society. Edinburgh, August 2001.

Nancy Chang: Learning Natural Language: A review of formal and computational approaches.

Mike de Kreek: Language Acquisition and Virtual Grammars. On the Continuity of Cognition. M.A. Thesis, Department of Computational Linguistics, University of Amsterdam, 2003. (Supervisor: Remko Scha.)