The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue

Janosch Haber*, Tim Baumgärtner, Ece Takmaz, Lieke Gelderloos, Elia Bruni, Raquel Fernández

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review

Abstract

This paper introduces the PhotoBook dataset, a large-scale collection of visually-grounded, task-oriented dialogues in English designed to investigate shared dialogue history accumulating during conversation. Taking inspiration from seminal work on dialogue analysis, we propose a data-collection task formulated as a collaborative game prompting two online participants to refer to images utilising both their visual context as well as previously established referring expressions. We provide a detailed description of the task setup and a thorough analysis of the 2,500 dialogues collected. To further illustrate the novel features of the dataset, we propose a baseline model for reference resolution which uses a simple method to take into account shared information accumulated in a reference chain. Our results show that this information is particularly important to resolve later descriptions and underline the need to develop more sophisticated models of common ground in dialogue interaction.
Original language: English
Title of host publication: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Publisher: Association for Computational Linguistics
Pages: 1895-1910
DOI: https://doi.org/10.18653/v1/P19-1184
Publication status: Published - Jul 2019
Event: Annual Meeting of the Association for Computational Linguistics 2019 - Florence, Italy
Duration: 28 Jul 2019 - 2 Aug 2019
Conference number: 57
http://www.acl2019.org/EN/index.xhtml

Conference

Conference: Annual Meeting of the Association for Computational Linguistics 2019
Abbreviated title: ACL 2019
Country: Italy
City: Florence
Period: 28/07/19 - 2/08/19

Cite this

Haber, J., Baumgärtner, T., Takmaz, E., Gelderloos, L., Bruni, E., & Fernández, R. (2019). The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 1895-1910). Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1184
@inproceedings{65fefa53445b4e389bb4481c2a8e7205,
title = "The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue",
author = "Janosch Haber and Tim Baumg{\"a}rtner and Ece Takmaz and Lieke Gelderloos and Elia Bruni and Raquel Fern{\'a}ndez",
year = "2019",
month = "7",
doi = "10.18653/v1/P19-1184",
language = "English",
pages = "1895--1910",
booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
publisher = "Association for Computational Linguistics",

}
