Abstract
Comics present images in a sequence, where the spatially presented sequence is key to the narrative storytelling. To understand a comic, a comprehender must learn to encode this sequential nature. For this we present a novel self-supervised sequential representation learning method designed for comics. Our approach capitalises on the sequential structure of comics to incorporate contextual information. We conduct experiments on the TINTIN Corpus of 1,000+ comics from 144 countries, and show that our method outperforms baseline methods on both classification and retrieval tasks. These results affirm the effectiveness of sequential representation learning for comics, and may aid in uncovering new cultural insights within comics.
| Original language | English |
|---|---|
| Title of host publication | 35th British Machine Vision Conference 2024 |
| Number of pages | 13 |
| Publication status | Published - 2024 |
| Event | The 35th British Machine Vision Conference - Glasgow, United Kingdom; Duration: 25 Nov 2024 → 28 Nov 2024; Conference number: 35 |
Conference
| Conference | The 35th British Machine Vision Conference |
|---|---|
| Abbreviated title | BMVC 2024 |
| Country/Territory | United Kingdom |
| City | Glasgow |
| Period | 25/11/24 → 28/11/24 |
Keywords
- comics
- visual language
- machine learning
- large language models
- style
- style analysis