Architectures and representations for string transduction

    Research output: Contribution to conference › Abstract › Other research output


    String transduction problems are ubiquitous in natural language
    processing: they include transliteration, grapheme-to-phoneme
    conversion, text normalization and translation. String transduction
    can be reduced to the simpler problem of sequence labeling by
    expressing the target string as a sequence of edit operations applied
    to the source string. This reduction makes all sequence labeling
    models applicable in typical transduction settings. Such models
    range from simple linear models, such as the sequence perceptron,
    which require external feature extractors, to recurrent neural
    networks with long short-term memory (LSTM) units, which can perform
    feature extraction internally. Versions of recurrent neural networks are also
    capable of solving string transduction natively, without reformulating
    it in terms of edit operations. In this talk I analyze the effect of
    these variations in model architecture and input representation on
    performance and engineering effort for string transduction, focusing
    especially on the text normalization task.
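    The reduction described above can be sketched in a few lines. This is an illustrative example, not the talk's implementation: it uses Python's `difflib.SequenceMatcher` to align source and target, then labels each source character with the target substring it rewrites to (an empty label means deletion; a multi-character label covers substitution or insertion). The function names `edit_labels` and `apply_labels` are invented for this sketch.

    ```python
    # Sketch: reduce string transduction to sequence labeling by labeling
    # each source character with an edit operation (here encoded as the
    # target substring that character rewrites to).
    from difflib import SequenceMatcher

    def edit_labels(source, target):
        """Assign one label per source character such that concatenating
        the labels in order reproduces the target string."""
        labels = [""] * len(source)
        pending = ""  # material inserted before the first source character
        for op, i1, i2, j1, j2 in SequenceMatcher(None, source, target).get_opcodes():
            if op == "equal":
                for k in range(i2 - i1):
                    labels[i1 + k] = source[i1 + k]  # KEEP
            elif op in ("replace", "delete"):
                # attach the (possibly empty) replacement to the first
                # character of the span; the rest of the span is deleted
                labels[i1] = target[j1:j2]
            elif op == "insert":
                if i1 == 0:
                    pending = target[j1:j2]
                else:
                    # glue inserted material onto the previous label
                    labels[i1 - 1] += target[j1:j2]
        if pending:
            labels[0] = pending + labels[0]
        return labels

    def apply_labels(labels):
        """Invert the encoding: concatenating the labels yields the target."""
        return "".join(labels)

    # Example from text normalization: per-character labels for a rewrite.
    labels = edit_labels("colour", "color")
    print(labels)                       # one label per source character
    print(apply_labels(labels))         # reconstructs the target
    ```

    With the target expressed as per-character labels like these, any sequence labeling model — from a sequence perceptron to an LSTM tagger — can be trained to predict them, which is the sense in which the reduction makes such models applicable.
    
    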
    Original language: English
    Publication status: Published - 2015
    Event: The 25th Meeting of Computational Linguistics in the Netherlands (CLIN25), Belgium
    Duration: 5 Feb 2015 - 6 Feb 2015




