Quantifying Context Mixing in Transformers

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

    Abstract

    Self-attention weights and their transformed variants have been the main source of information for analyzing token-to-token interactions in Transformer-based models. But despite their ease of interpretation, these weights are not faithful to the models’ decisions as they are only one part of an encoder, and other components in the encoder layer can have considerable impact on information mixing in the output representations. In this work, by expanding the scope of analysis to the whole encoder block, we propose Value Zeroing, a novel context mixing score customized for Transformers that provides us with a deeper understanding of how information is mixed at each encoder layer. We demonstrate the superiority of our context mixing score over other analysis methods through a series of complementary evaluations with different viewpoints based on linguistically informed rationales, probing, and faithfulness analysis.
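
    The abstract describes Value Zeroing only at a high level: the value vector of one context token is zeroed inside an encoder layer, the layer's outputs are recomputed, and the change in every other token's output representation is read as how much that context token was mixed in. The sketch below illustrates that idea on a toy single-head encoder layer. The names (ToyEncoderLayer, value_zeroing_scores) and the choice of cosine distance with row normalization are illustrative assumptions for this sketch, not the paper's released implementation.

```python
# Minimal sketch of the Value Zeroing idea on a toy encoder layer.
# Hypothetical names; not the authors' code.
import torch
import torch.nn.functional as F


class ToyEncoderLayer(torch.nn.Module):
    """Single-head encoder layer with residuals, layer norm, and a
    feed-forward block, so that zeroing a value vector is measured
    against the whole encoder block, not just raw attention."""

    def __init__(self, d_model=16):
        super().__init__()
        self.q = torch.nn.Linear(d_model, d_model)
        self.k = torch.nn.Linear(d_model, d_model)
        self.v = torch.nn.Linear(d_model, d_model)
        self.out = torch.nn.Linear(d_model, d_model)
        self.ffn = torch.nn.Sequential(
            torch.nn.Linear(d_model, 4 * d_model),
            torch.nn.GELU(),
            torch.nn.Linear(4 * d_model, d_model),
        )
        self.ln1 = torch.nn.LayerNorm(d_model)
        self.ln2 = torch.nn.LayerNorm(d_model)

    def forward(self, x, zero_value_of=None):
        # x: (seq_len, d_model)
        q, k, v = self.q(x), self.k(x), self.v(x)
        if zero_value_of is not None:
            v = v.clone()
            v[zero_value_of] = 0.0  # zero the value vector of one context token
        attn = torch.softmax(q @ k.T / (x.size(-1) ** 0.5), dim=-1)
        h = self.ln1(x + self.out(attn @ v))   # residual + layer norm
        return self.ln2(h + self.ffn(h))       # feed-forward block included


def value_zeroing_scores(layer, x):
    """scores[i, j]: how much token i's output representation changes when
    token j's value vector is zeroed -- a larger change is read as token j
    contributing more to token i at this layer."""
    with torch.no_grad():
        base = layer(x)
        n = x.size(0)
        scores = torch.zeros(n, n)
        for j in range(n):
            altered = layer(x, zero_value_of=j)
            # cosine distance between original and altered outputs
            scores[:, j] = 1.0 - F.cosine_similarity(base, altered, dim=-1)
    # normalize each row so a token's incoming mixing scores sum to 1
    return scores / scores.sum(dim=-1, keepdim=True).clamp_min(1e-12)


if __name__ == "__main__":
    torch.manual_seed(0)
    layer = ToyEncoderLayer(d_model=16)
    tokens = torch.randn(5, 16)  # five toy token embeddings
    print(value_zeroing_scores(layer, tokens))
```

    In a real Transformer the same procedure would be applied per layer of a pretrained encoder (e.g., via forward hooks on the value projections), with the distance measure and any masking of the diagonal chosen as in the paper.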
    Original language: English
    Title of host publication: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
    Place of Publication: Dubrovnik, Croatia
    Publisher: Association for Computational Linguistics
    Pages: 3378-3400
    Number of pages: 23
    ISBN (Print): 978-1-959429-44-9
    Publication status: Published - May 2023

    Keywords

    • Self-attention
    • Transformer-based models
    • Value Zeroing
    • Encoder layer
