Evaluation rules! On the use of grammars and rule-based systems for NLG evaluation

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

    Abstract

    NLG researchers often use uncontrolled corpora to train and evaluate their systems, relying on textual similarity metrics such as BLEU. This position paper argues in favour of two alternative evaluation strategies, using grammars or rule-based systems. These strategies are particularly useful for identifying the strengths and weaknesses of different systems. We contrast our proposals with the (extended) WebNLG dataset, which turns out to have a skewed distribution of predicates. We predict that this skew affects the quality of predictions for systems trained on this data. However, this hypothesis can only be thoroughly tested (without any confounds) once we are able to systematically manipulate the skewness of the data, using a rule-based approach.
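
    The abstract's claim about predicate skew lends itself to a quick diagnostic. Below is a minimal sketch (not from the paper; the triple format, function names, and toy corpus are illustrative assumptions) of how one might quantify the skewness of a WebNLG-style predicate distribution using normalized Shannon entropy, which is 1.0 for a perfectly uniform distribution and approaches 0 as a few predicates dominate.

    ```python
    from collections import Counter
    import math

    def predicate_counts(triples):
        """Count how often each predicate occurs in a corpus of RDF triples.

        `triples` is assumed to be an iterable of (subject, predicate, object)
        tuples, as one might extract from the WebNLG XML files.
        """
        return Counter(pred for _, pred, _ in triples)

    def normalized_entropy(counts):
        """Normalized Shannon entropy of the predicate distribution.

        Returns 1.0 for a uniform distribution (no skew); values near 0
        indicate that a few predicates dominate the corpus.
        """
        total = sum(counts.values())
        probs = [c / total for c in counts.values()]
        entropy = -sum(p * math.log2(p) for p in probs)
        max_entropy = math.log2(len(counts)) if len(counts) > 1 else 1.0
        return entropy / max_entropy

    # Hypothetical usage on a toy corpus with a heavily skewed predicate mix:
    corpus = ([("A", "birthPlace", "B")] * 80
              + [("C", "leaderName", "D")] * 15
              + [("E", "elevation", "F")] * 5)
    counts = predicate_counts(corpus)
    print(counts.most_common(3))                 # [('birthPlace', 80), ...]
    print(f"{normalized_entropy(counts):.3f}")   # well below 1.0 -> skewed
    ```

    A rule-based generator of the kind the abstract proposes could then target a chosen entropy value by construction, rather than subsampling an existing corpus after the fact.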
    Original language: English
    Title of host publication: Proceedings of the 1st Workshop on Evaluating NLG Evaluation
    Place of publication: Dublin, Ireland
    Publisher: Association for Computational Linguistics
    Pages: 17-27
    Number of pages: 11
    Publication status: Published - 1 Dec 2020
    Event: Workshop on Evaluating NLG Evaluation - Online, Dublin, Ireland
    Duration: 18 Dec 2020 → …
    https://evalnlg-workshop.github.io/

    Conference

    Conference: Workshop on Evaluating NLG Evaluation
    Country/Territory: Ireland
    City: Dublin
    Period: 18/12/20 → …
    Internet address: https://evalnlg-workshop.github.io/
