Belz, A, Thomson, C, Reiter, E, Abercrombie, G, Alonso-Moral, JM, Arvan, M, Braggaar, A, Cieliebak, M, Clark, E, van Deemter, K, Dinkar, T, Dušek, O, Eger, S, Fang, Q, Gao, M, Gatt, A, Gkatzia, D, González-Corbelle, J, Hovy, D, Hürlimann, M, Ito, T, Kelleher, JD, Klubicka, F, Krahmer, E, Lai, H, van der Lee, C, Li, Y, Mahamood, S, Mieskes, M, van Miltenburg, E, Mosteiro, P, Nissim, M, Parde, N, Plátek, O, Rieser, V, Ruan, J, Tetreault, J, Toral, A, Wan, X, Wanner, L, Watson, L & Yang, D 2023,
Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP. in S Tafreshi, A Akula, J Sedoc, A Drozd, A Rogers & A Rumshisky (eds),
The Fourth Workshop on Insights from Negative Results in NLP. Association for Computational Linguistics, Dubrovnik, Croatia, pp. 1-10.
https://doi.org/10.18653/v1/2023.insights-1.1