A memory-based shallow parser for spoken Dutch

S.V.M. Canisius, A. van den Bosch

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


    Abstract

    We describe the development of a Dutch memory-based shallow parser. The availability of large treebanks for Dutch, such as the one provided by the Spoken Dutch Corpus, allows memory-based learners to be trained on examples of shallow parsing taken from the treebank and to act as a shallow parser after training. We give an overview of a modular memory-based learning approach to shallow parsing, originally developed for English, which is composed of a part-of-speech tagger-chunker and two grammatical relation finders. This approach is applied to the syntactically annotated part of the Spoken Dutch Corpus to construct a Dutch shallow parser. From the generalisation scores of the parser we conclude that existing memory-based parsing approaches can be applied to spoken Dutch successfully, but that there is room for improvement in the tagger-chunker.
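
The idea of training a memory-based learner on treebank examples and then using it as a chunker can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain nearest-neighbour classifier with an unweighted overlap metric, a three-token word window as features, and IOB-style chunk tags. The class and function names and the toy Dutch sentences are invented for illustration and are not taken from the Spoken Dutch Corpus.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


class MemoryBasedClassifier:
    """Hypothetical k-nearest-neighbour learner: training only stores examples."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []  # list of (feature tuple, label) pairs

    def train(self, instances):
        # "Learning" is just adding the labelled instances to memory.
        self.memory.extend(instances)

    def classify(self, features):
        # Rank stored instances by distance and let the k nearest vote.
        ranked = sorted(
            self.memory,
            key=lambda inst: overlap_distance(inst[0], features),
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


def windows(tokens):
    """Build one (previous, focus, next) window per token, padded with '_'."""
    padded = ["_"] + list(tokens) + ["_"]
    return [tuple(padded[i : i + 3]) for i in range(len(tokens))]


# Toy training sentence with invented IOB chunk tags (assumption, not corpus data).
train_tokens = ["de", "man", "ziet", "het", "huis"]
train_tags = ["B-NP", "I-NP", "B-VP", "B-NP", "I-NP"]

chunker = MemoryBasedClassifier(k=1)
chunker.train(list(zip(windows(train_tokens), train_tags)))

# After training, the stored examples are used to tag an unseen sentence.
test_tokens = ["de", "vrouw", "ziet", "het", "boek"]
predicted = [chunker.classify(w) for w in windows(test_tokens)]
print(list(zip(test_tokens, predicted)))
```

In this sketch, generalisation comes entirely from similarity to stored examples; a full shallow parser would add richer features such as part-of-speech tags and would chain separate modules for tagging, chunking, and relation finding.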
    Original language: English
    Title of host publication: Proceedings of the fourteenth CLIN meeting 2003
    Editors: B. Decadt, V. Hoste, G. De Pauw
    Place of Publication: Antwerpen
    Publisher: Unknown Publisher
    Pages: 31-45
    Number of pages: 15
    ISBN (Print): 07763859
    Publication status: Published - 2004
