Architectures and representations for string transduction

    Research output: Contribution to conference › Abstract › Other research output

    Abstract

    String transduction problems are ubiquitous in natural language
    processing: they include transliteration, grapheme-to-phoneme
    conversion, text normalization and translation. String transduction
    can be reduced to the simpler problem of sequence labeling by
    expressing the target string as a sequence of edit operations applied
    to the source string. Due to this reduction, all sequence labeling
    models become applicable in typical transduction settings. Sequence
    labeling models range from simple linear models such as the sequence
    perceptron, which require external feature extractors, to recurrent
    neural networks with long short-term memory (LSTM) units, which can do
    feature extraction internally. Versions of recurrent neural networks
    are also capable of solving string transduction natively, without
    reformulating it in terms of edit operations. In this talk I analyze
    the effect of these variations in model architecture and input
    representation on performance and engineering effort for string
    transduction, focusing especially on the text normalization task.
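
    To make the reduction concrete, here is a minimal sketch of one way to
    encode a target string as per-character edit labels over the source, so
    that transduction becomes ordinary sequence labeling. It is not the
    exact scheme used in the talk: the label inventory (COPY, DEL, SUB, INS,
    PRE) and the character alignment via Python's difflib are illustrative
    assumptions.

        # Minimal sketch (see assumptions above): one edit label per source
        # character; applying the labels left to right rebuilds the target.
        # Assumes a non-empty source and no "+" in the target text.
        from difflib import SequenceMatcher


        def edit_labels(source, target):
            """Assign one edit label to each source character."""
            labels = ["COPY"] * len(source)
            prefix = ""  # target material inserted before the first source character
            matcher = SequenceMatcher(a=source, b=target, autojunk=False)
            for op, i1, i2, j1, j2 in matcher.get_opcodes():
                if op == "delete":
                    for i in range(i1, i2):
                        labels[i] = "DEL"
                elif op == "replace":
                    # the first character of the span carries the replacement
                    # text; the remaining characters of the span are deleted
                    labels[i1] = "SUB:" + target[j1:j2]
                    for i in range(i1 + 1, i2):
                        labels[i] = "DEL"
                elif op == "insert":
                    if i1 == 0:
                        prefix += target[j1:j2]
                    else:
                        # attach the insertion to the preceding source character
                        labels[i1 - 1] += "+INS:" + target[j1:j2]
            if prefix:
                labels[0] = "PRE:" + prefix + "+" + labels[0]
            return labels


        def apply_labels(source, labels):
            """Invert the encoding: apply the labels to rebuild the target."""
            out = []
            for ch, label in zip(source, labels):
                for part in label.split("+"):
                    if part == "COPY":
                        out.append(ch)
                    elif part.startswith(("SUB:", "INS:", "PRE:")):
                        out.append(part[4:])
                    # DEL contributes nothing
            return "".join(out)


        if __name__ == "__main__":
            src, tgt = "2nite", "tonight"  # a toy text normalization pair
            labels = edit_labels(src, tgt)
            print(list(zip(src, labels)))
            assert apply_labels(src, labels) == tgt

    With the strings encoded this way, any sequence labeler, from a sequence
    perceptron over hand-crafted character features to a character-level
    LSTM, can be trained to predict the edit label at each source position,
    and its predictions can be decoded back into an output string with
    apply_labels.
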
    Original language: English
    Publication status: Published - 2015
    Event: The 25th Meeting of Computational Linguistics in the Netherlands (CLIN25), Belgium
    Duration: 5 Feb 2015 - 6 Feb 2015

    Conference

    Conference: The 25th Meeting of Computational Linguistics in the Netherlands (CLIN25)
    Country: Belgium
    Period: 5/02/15 - 6/02/15

    Cite this

    Chrupala, G. (2015). Architectures and representations for string transduction. Abstract from The 25th Meeting of Computational Linguistics in the Netherlands (CLIN25), Belgium.