A memory-based shallow parser for spoken Dutch

S.V.M. Canisius, A. van den Bosch

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

    Abstract

    We describe the development of a memory-based shallow parser for Dutch. The availability of large treebanks for Dutch, such as the one provided by the Spoken Dutch Corpus, allows memory-based learners to be trained on examples of shallow parsing taken from the treebank and to act as a shallow parser after training. An overview is given of a modular memory-based learning approach to shallow parsing, composed of a part-of-speech tagger-chunker and two grammatical relation finders, which was originally developed for English. This approach is applied to the syntactically annotated part of the Spoken Dutch Corpus to construct a Dutch shallow parser. From the generalisation scores of the parser we conclude that existing memory-based parsing approaches can be applied to spoken Dutch successfully, but that there is room for improvement in the tagger-chunker.
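As a rough illustration of the memory-based idea in the abstract, the sketch below stores every training instance and labels a new token by copying the label of its nearest stored neighbour. It is a toy under stated assumptions, not the authors' system: the class name `MemoryBasedTagger`, the one-token context window, the plain overlap similarity, and the two miniature training sentences with their chunk labels are all invented for this example, and it covers only a chunking step, not part-of-speech tagging or grammatical relation finding.

```python
from collections import Counter


def windows(tokens, size=1):
    """Return one fixed-width context tuple per token, padded with '_'."""
    padded = ["_"] * size + list(tokens) + ["_"] * size
    return [tuple(padded[i:i + 2 * size + 1]) for i in range(len(tokens))]


class MemoryBasedTagger:
    """Hypothetical k-nearest-neighbour tagger that keeps all examples in memory."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []  # list of (feature tuple, label) pairs

    def train(self, sentences):
        # "Training" is just storing every labelled instance.
        for tokens, labels in sentences:
            self.memory.extend(zip(windows(tokens), labels))

    def classify(self, features):
        # Overlap similarity: count of feature positions with identical values.
        def overlap(example):
            return sum(a == b for a, b in zip(example[0], features))

        nearest = sorted(self.memory, key=overlap, reverse=True)[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]

    def tag(self, tokens):
        return [self.classify(f) for f in windows(tokens)]


# Invented toy data: two Dutch sentences with made-up IOB chunk labels.
training = [
    (["de", "man", "loopt"], ["B-NP", "I-NP", "B-VP"]),
    (["een", "hond", "blaft"], ["B-NP", "I-NP", "B-VP"]),
]

tagger = MemoryBasedTagger(k=1)
tagger.train(training)
print(tagger.tag(["de", "hond", "loopt"]))
```

A modular pipeline of the kind the abstract describes would chain several such classifiers, with the output of the tagger-chunker feeding the relation finders; that chaining is left out here.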
    Original language: English
    Title of host publication: Proceedings of the fourteenth CLIN meeting 2003
    Editors: B. Decadt, V. Hoste, G. De Pauw
    Place of Publication: Antwerpen
    Publisher: Unknown Publisher
    Pages: 31-45
    Number of pages: 15
    ISBN (Print): 07763859
    Publication status: Published - 2004

    • Rolaquad: Robust language understanding for question-answering dialogues.

      Canisius, S. V. M. (Researcher), Daelemans, W. M. P. (Principal Investigator), Lendvai, P. K. (Researcher) & van den Bosch, A. (Coach)

      1/01/04 → 1/01/08

      Project: Research project

    • Memory models of language

      van den Bosch, A. (Researcher)

      1/07/01 → 1/07/06

      Project: Research project
