The Visual Narrative Engine: A Computational Model of the Visual Narrative Parallel Architecture.

Chris Martens, Rogelio Cardona-Rivera, Neil Cohn

Research output: Contribution to conference / Paper / Other research output


This paper introduces the first computational model of a proposed set of cognitive structures and processes involved in interpreting visual narratives (specifically, comics). The cognitive processes that occur when readers interpret comics have been theorized, and experimentally validated, to include spatial semantics (a mental model of the physical environment depicted in each image) and event structure semantics (a hierarchy of narrative events that take place over time). These semantic models are interconnected, developing in parallel as readers interpret a panel sequence. Our computational model aims to bring clarity to the cognitive theory of visual narrative sensemaking and poses new questions for the cognitive science community. Towards this end, we present a prototype computational model in which event structures are represented as hierarchical plans and spatial structures are represented as relational scene graphs. Domain knowledge about narrative events and how they relate to underlying scene structure is encoded in a standard Hierarchical Task Network (HTN) planning domain representation. Using a standard implementation of HTN planning, we demonstrate how to search the space of possible HTN solutions for those that match comic panel sequences.
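To make the idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation or their planning domain. It assumes a toy domain invented for this example: scene graphs are sets of (subject, predicate, object) triples, primitive events have preconditions and effects over those triples, and compound events decompose into sub-events through hand-written methods. All task names (`play_catch`, `acquire`, `approach`, `grab`, `throw`) and the helper functions `primitive`, `plans`, and `matching_plans` are hypothetical.

```python
# Toy HTN-style search: enumerate decompositions of a narrative event and keep
# those whose intermediate scene graphs agree with an observed panel sequence.
# Everything here (domain, names, matching rule) is an assumption for illustration.

def primitive(task):
    """Return (preconditions, deletes, adds) for a primitive event, else None."""
    name, *args = task
    if name == "approach":
        a, b = args
        return {(a, "far_from", b)}, {(a, "far_from", b)}, {(a, "near", b)}
    if name == "grab":
        a, b = args
        return {(a, "near", b)}, set(), {(a, "holds", b)}
    if name == "throw":
        a, b = args
        return {(a, "holds", b)}, {(a, "holds", b)}, {(b, "in", "air")}
    return None

# Compound events and their alternative decompositions (hypothetical methods).
METHODS = {
    "acquire": [
        lambda a, b: [("approach", a, b), ("grab", a, b)],
        lambda a, b: [("grab", a, b)],
    ],
    "play_catch": [
        lambda a, b: [("acquire", a, b), ("throw", a, b)],
    ],
}

def plans(tasks, scene):
    """Yield (actions, scenes): each primitive plan and the scene after each step."""
    if not tasks:
        yield [], []
        return
    head, rest = tasks[0], tasks[1:]
    prim = primitive(head)
    if prim is not None:
        pre, deletes, adds = prim
        if pre <= scene:  # event is applicable only if its preconditions hold
            nxt = frozenset((scene - deletes) | adds)
            for acts, scenes in plans(rest, nxt):
                yield [head] + acts, [nxt] + scenes
        return
    # Compound event: try every method and continue with the remaining tasks.
    for method in METHODS.get(head[0], []):
        yield from plans(method(*head[1:]) + rest, scene)

def matching_plans(root, initial, panels):
    """Keep plans with one step per panel whose scenes contain each panel's relations."""
    return [
        acts
        for acts, scenes in plans([root], frozenset(initial))
        if len(scenes) == len(panels)
        and all(panel <= s for panel, s in zip(panels, scenes))
    ]

initial = {("kid", "far_from", "ball")}
panels = [
    {("kid", "near", "ball")},
    {("kid", "holds", "ball")},
    {("ball", "in", "air")},
]
result = matching_plans(("play_catch", "kid", "ball"), initial, panels)
print(result)
```

In this sketch a panel is treated as a partial observation: it matches a plan step when every relation shown in the panel is present in the scene graph after that step. The one-panel-per-primitive-event alignment is a simplifying assumption of the example and should not be read as a claim about how the paper aligns panels with events.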
Original language: English
Publication status: Published - 2020
Event: 8th Annual Conference on Advances in Cognitive Systems - Palo Alto, United States
Duration: 10 Aug 2020 to 12 Aug 2020


Conference: 8th Annual Conference on Advances in Cognitive Systems
Country/Territory: United States
City: Palo Alto


  • narrative
  • computational narrative
  • computational linguistics
  • storytelling
  • visual narrative
  • comics
  • parallel architecture

