On the use of human reference data for evaluating automatic image descriptions

    Research output: Contribution to conference › Poster › Other research output


    Abstract

    Automatic image description systems are commonly trained and evaluated using crowdsourced, human-generated image descriptions. The best-performing system is then determined using some measure of similarity to the reference data (BLEU, Meteor, CIDEr, etc.). Thus, both the quality of the systems and the quality of the evaluation depend on the quality of the descriptions. As Section 2 will show, the quality of current image description datasets is insufficient. I argue that there is a need for more detailed guidelines that take into account the needs of visually impaired users, as well as the feasibility of generating suitable descriptions. With high-quality data, evaluation of image description systems could use reference descriptions, but we should also look for alternatives.
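
    To make the reference-based evaluation concrete, the sketch below (an illustration only, not part of the poster) scores a single candidate description against two hypothetical crowdsourced reference descriptions with sentence-level BLEU via NLTK; Meteor and CIDEr are applied analogously, typically aggregated over a full test set.

```python
# Illustrative sketch only (not from the poster): scoring one generated
# description against crowdsourced human references with sentence-level BLEU.
# The captions below are invented examples.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Two hypothetical human reference descriptions for the same image.
references = [
    "a man is riding a bicycle down a city street".split(),
    "a cyclist rides along the road past parked cars".split(),
]

# Hypothetical system output to be evaluated.
candidate = "a man rides a bike on the road".split()

# Smoothing avoids zero scores when higher-order n-grams have no overlap.
score = sentence_bleu(
    references, candidate,
    smoothing_function=SmoothingFunction().method1,
)
print(f"BLEU: {score:.3f}")  # higher overlap with the references -> higher score
```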
    Original language: English
    Number of pages: 2
    Publication status: Published - 14 Jun 2020
    Event: 2020 VizWiz Grand Challenge Workshop
    Duration: 14 Jun 2020 - 14 Jun 2020
    https://vizwiz.org/workshops/2020-workshop/

    Conference

    Conference: 2020 VizWiz Grand Challenge Workshop
    Abbreviated title: VizWiz 2020
    Period: 14/06/20 - 14/06/20
    Internet address: https://vizwiz.org/workshops/2020-workshop/

