TY - JOUR
T1 - Taming our wild data
T2 - On intercoder reliability in discourse research
AU - van Enschot, Renske
AU - Spooren, Wilbert
AU - van den Bosch, Antal
AU - Burgers, Christian
AU - Degand, Liesbeth
AU - Evers-Vermeul, Jacqueline
AU - Kunneman, Florian
AU - Liebrecht, Christine
AU - Linders, Yvette
AU - Maes, Alfons
PY - 2024
Y1 - 2024
N2 - Many research questions in the field of applied linguistics are answered by manually analyzing data collections or corpora: collections of spoken, written and/or visual communicative messages. In this kind of quantitative content analysis, the coding of subjective language data often leads to disagreement among raters. In this paper, we discuss causes of and solutions to disagreement problems in the analysis of discourse. We discuss crucial factors determining the quality and outcome of corpus analyses, and focus on the sometimes tense relation between reliability and validity. We evaluate formal assessments of intercoder reliability. We suggest a number of ways to improve the intercoder reliability, such as the precise specification of the variables and their coding categories and carving up the coding process into smaller substeps. The paper ends with a reflection on challenges for future work in discourse analysis, with special attention to big data and multimodal discourse.
KW - intercoder reliability
KW - discourse
KW - quantitative content analysis
KW - complex discourse data
KW - hands-on procedures
U2 - 10.51751/dujal16248
DO - 10.51751/dujal16248
M3 - Article
SN - 2211-7245
VL - 13
SP - 1
EP - 24
JO - Dutch Journal of Applied Linguistics
JF - Dutch Journal of Applied Linguistics
ER -