Typoo or orthographic error? Automatic classification of typographic versus orthographic errors using keystroke log

Rianne Conijn, Luuk van Waes, Menno van Zaanen

    Research output: Contribution to conference > Abstract > Other research output

    Abstract

    The automatic classification and correction of typing errors in texts have been well studied, e.g., [1], [2]. Yet relatively little work can be found on distinguishing typographic errors (slips of the finger) from orthographic errors. In writing research, these two error types should be treated separately, as they are cognitively different actions and can have a large influence on, for example, fluency analysis and revision counts [3], [4]. This distinction is hard to make using the final writing product only. By analyzing typing errors during the writing process, using keystroke logging, we gain information on both the production and the correction of typing errors, including their timing [5], [6]. Several studies have used such keystroke logs to manually code typographic and orthographic errors, e.g., [7], [8]. In this project, we aim to distinguish between typographic and orthographic errors automatically. This presentation shows our first step: the characterization of typographic errors using keystroke logs from a transcription task. In a transcription task, the final text is given to the writer; hence, we assume that every revision corrects a typographic error. Data from 2,103 Dutch transcription tasks (1,717 unique participants) were collected using Inputlog [9]. Character-level confusion matrices (as in [10]) are constructed and patterns of timings are reported. In total, 5,030 corrections were made, of which 59% were single substitutions, 5% single transpositions, 4% single insertions, and 1% single deletions. In 27% of the revisions, more than one mutation was involved, and in 4% nothing changed. We invite attendees to discuss our future steps.
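The revision categories and the character-level confusion counts mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example word pairs, and the convention that an "insertion" means the writer typed one extra character (and a "deletion" that one character was left out) are all assumptions made for illustration. The input is assumed to be pairs of (text as first typed, text after correction); extracting such pairs from an Inputlog log is not shown.

```python
from collections import Counter


def classify_revision(typed: str, corrected: str) -> str:
    """Label the mutation that separates the typed form from the corrected form.

    Assumed convention (not taken from the abstract): 'insertion' means one
    extra character was typed, 'deletion' means one character was omitted.
    """
    if typed == corrected:
        return "no change"

    if len(typed) == len(corrected):
        # Positions at which the two strings disagree.
        diffs = [i for i, (a, b) in enumerate(zip(typed, corrected)) if a != b]
        if len(diffs) == 1:
            return "substitution"
        if (
            len(diffs) == 2
            and diffs[1] == diffs[0] + 1
            and typed[diffs[0]] == corrected[diffs[1]]
            and typed[diffs[1]] == corrected[diffs[0]]
        ):
            return "transposition"
        return "multiple"

    typed_is_longer = len(typed) > len(corrected)
    longer, shorter = (typed, corrected) if typed_is_longer else (corrected, typed)
    if len(longer) - len(shorter) == 1:
        # Removing exactly one character from the longer string must give the shorter one.
        for i in range(len(longer)):
            if longer[:i] + longer[i + 1:] == shorter:
                return "insertion" if typed_is_longer else "deletion"
    return "multiple"


def substitution_confusions(pairs):
    """Count (intended character, typed character) pairs over single substitutions."""
    counts = Counter()
    for typed, corrected in pairs:
        if classify_revision(typed, corrected) == "substitution":
            i = next(k for k, (a, b) in enumerate(zip(typed, corrected)) if a != b)
            counts[(corrected[i], typed[i])] += 1
    return counts


# Hypothetical examples, not data from the study.
example_pairs = [
    ("huos", "huis"),   # one wrong character
    ("hius", "huis"),   # two adjacent characters swapped
    ("huiis", "huis"),  # one extra character
    ("hus", "huis"),    # one missing character
    ("hxiz", "huis"),   # more than one mutation
    ("huis", "huis"),   # nothing changed
]
labels = Counter(classify_revision(t, c) for t, c in example_pairs)
confusions = substitution_confusions(example_pairs)
```

Here `labels` tallies the revision types and `confusions` holds the sparse entries of a confusion matrix keyed by (intended, typed) character. Timing information, which the abstract also reports, would need the keystroke timestamps and is left out of this sketch.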
    Original language: English
    Publication status: Published - 2019
    Event: The 29th Computational Linguistics in the Netherlands conference (CLIN29) - Groningen, Netherlands
    Duration: 31 Jan 2019 → …

    Conference

    Conference: The 29th Computational Linguistics in the Netherlands conference (CLIN29)
    Country/Territory: Netherlands
    City: Groningen
    Period: 31/01/19 → …
