Abstract
Comics present images in a sequence, and this spatially arranged sequence is key to narrative storytelling. To understand a comic, a comprehender must learn to encode this sequential structure. To this end, we present a novel self-supervised sequential representation learning method designed for comics. Our approach capitalises on the sequential structure of comics to incorporate contextual information. We conduct experiments on the TINTIN Corpus of 1,000+ comics from 144 countries and show that our method outperforms baseline methods on both classification and retrieval tasks. These results affirm the effectiveness of sequential representation learning for comics and may aid in uncovering new cultural insights within comics.
Original language | English |
---|---|
Title of host publication | 35th British Machine Vision Conference 2024 |
Publication status | Published - 2024 |
Keywords
- comics
- visual language
- machine learning
- large language models
- style
- style analysis