Discrete versus Probabilistic Sequence Classifiers for Domain-specific Entity Chunking

S.V.M. Canisius, A. van den Bosch, W. Daelemans

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


    Abstract

    We present a comparative case study of discrete and probabilistic sequence classification methods applied to two real-world entity chunking tasks in the medical domain. A discrete maximum-entropy classifier that does not coordinate its decisions is outperformed both by architecturally augmented discrete versions and by probabilistic versions combined with an inference step that selects the best output label sequence. Among the sequence-aware methods evaluated in this study, whether discrete or probabilistic, we observed no significant performance difference. This suggests that probabilistic sequence labelling methods are not fundamentally better suited than augmented discrete approaches to the type of sequence-oriented entity chunking tasks evaluated here. Future research should determine whether this result generalises to other types of sequence tasks in natural language processing.
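
    The contrast between uncoordinated per-token decisions and an inference step over the whole label sequence can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the inference algorithm, so Viterbi decoding is assumed here, and the B/I/O label set, the toy probabilities, the transition constraint, and the function names `greedy_decode` and `viterbi_decode` are all hypothetical.

```python
import math

LABELS = ["B", "I", "O"]  # assumed chunk labels: Begin, Inside, Outside

# Hypothetical per-token label probabilities for a three-token sentence.
EMISSIONS = [
    {"B": 0.40, "I": 0.05, "O": 0.55},
    {"B": 0.10, "I": 0.60, "O": 0.30},
    {"B": 0.05, "I": 0.15, "O": 0.80},
]

# Assumed log transition scores: every transition is free except O -> I,
# which is forbidden because a chunk cannot continue without a beginning.
TRANSITIONS = {p: {y: 0.0 for y in LABELS} for p in LABELS}
TRANSITIONS["O"]["I"] = -math.inf


def greedy_decode(emissions):
    """Pick the most probable label for each token independently."""
    return [max(LABELS, key=lambda y: probs[y]) for probs in emissions]


def viterbi_decode(emissions, transitions):
    """Return the label sequence with the highest total log score."""
    # best[t][y]: best log score of any path that ends in label y at token t
    best = [{y: math.log(emissions[0][y]) for y in LABELS}]
    back = []  # back[t-1][y]: predecessor label on the best path to y at token t
    for probs in emissions[1:]:
        scores, pointers = {}, {}
        for y in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + math.log(probs[y])
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


print(greedy_decode(EMISSIONS))                # ['O', 'I', 'O']
print(viterbi_decode(EMISSIONS, TRANSITIONS))  # ['B', 'I', 'O']
```

    On this toy input the per-token decisions yield the ill-formed sequence O I O, while decoding over the whole sequence yields B I O. The sketch covers only the probabilistic side of the comparison; the architecturally augmented discrete classifiers mentioned in the abstract are not modelled.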
    Original language: English
    Title of host publication: Proceedings of the Eighteenth Belgium-Netherlands Conference on Artificial Intelligence, BNAIC-2006
    Editors: P.-Y. Schobbens, W. Vanhof, G. Schwanen
    Place of publication: Namur, Belgium
    Publisher: Belgisch Nederlandse Ver. voor Kunstmatige Intelligentie
    Pages: 75-82
    Number of pages: 8
    Publication status: Published - 2006
