A memory-based shallow parser for spoken Dutch

S.V.M. Canisius, A. van den Bosch

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review



    We describe the development of a Dutch memory-based shallow parser. The availability of large treebanks for Dutch, such as the one provided by the Spoken Dutch Corpus, allows memory-based learners to be trained on examples of shallow parsing taken from the treebank and to act as a shallow parser after training. We give an overview of a modular memory-based learning approach to shallow parsing, composed of a part-of-speech tagger-chunker and two grammatical relation finders, which was originally developed for English. This approach is applied to the syntactically annotated part of the Spoken Dutch Corpus to construct a Dutch shallow parser. From the generalisation scores of the parser we conclude that existing memory-based parsing approaches can be applied to spoken Dutch successfully, but that there is room for improvement in the tagger-chunker.
    Original language: English
    Title of host publication: Proceedings of the fourteenth CLIN meeting 2003
    Editors: B. Decadt, V. Hoste, G. De Pauw
    Place of Publication: Antwerpen
    Publisher: Unknown Publisher
    Number of pages: 15
    ISBN (Print): 07763859
    Publication status: Published - 2004

