Quantifying the Plausibility of Context Reliance in Neural Machine Translation

Gabriele Sarti, Grzegorz Chrupała, Malvina Nissim, Arianna Bisazza

    Research output: Contribution to conference › Paper › Scientific › peer-review

    Abstract

    Establishing whether language models can use contextual information in a human-plausible way is important to ensure their trustworthiness in real-world settings. However, the questions of when and which parts of the context affect model generations are typically tackled separately, with current plausibility evaluations being practically limited to a handful of artificial benchmarks. To address this, we introduce Plausibility Evaluation of Context Reliance (PECoRe), an end-to-end interpretability framework designed to quantify context usage in language models' generations. Our approach leverages model internals to (i) contrastively identify context-sensitive target tokens in generated texts and (ii) link them to contextual cues justifying their prediction. We use PECoRe to quantify the plausibility of context-aware machine translation models, comparing model rationales with human annotations across several discourse-level phenomena. Finally, we apply our method to unannotated model translations to identify context-mediated predictions and highlight instances of (im)plausible context usage throughout generation.
    Original language: English
    Number of pages: 29
    Publication status: Published - 16 Jan 2024
    Event: The International Conference on Learning Representations (ICLR) - Messe Wien exhibition and congress center, Vienna, Austria
    Duration: 7 May 2024 to 11 May 2024
    https://iclr.cc/Conferences/2024

    Conference

    Conference: The International Conference on Learning Representations (ICLR)
    Country/Territory: Austria
    City: Vienna
    Period: 7/05/24 to 11/05/24

    Keywords

    • neural machine translation
    • large language models
