OCR Post-Correction Evaluation of Early Dutch Books Online - Revisited

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

    Abstract

    We present further work on evaluation of the fully automatic post-correction of Early Dutch Books Online, a collection of 10,333 18th-century books. In prior work we evaluated the new implementation of Text-Induced Corpus Clean-up (TICCL) on the basis of a single-book Gold Standard derived from this collection. In the current paper we revisit the same collection on the basis of a sizeable 1,020-item random sample of OCR post-corrected strings from the full collection. Both evaluations have their own stories to tell and lessons to teach.
    Original language: English
    Title of host publication: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
    Editors: Calzolari
    Publisher: ELRA
    Pages: 967-974
    Number of pages: 8
    Publication status: Published - 2016
    Event: International Conference on Language Resources and Evaluation 2016: 10th edition - Grand Hotel Bernardin Conference Center, Portoroz, Slovenia
    Duration: 23 May 2016 to 28 May 2016
    Conference number: 10
    http://lrec2016.lrec-conf.org/en/

    Conference

    Conference: International Conference on Language Resources and Evaluation 2016
    Abbreviated title: LREC 2016
    Country/Territory: Slovenia
    City: Portoroz
    Period: 23/05/16 to 28/05/16

    Keywords

    • TICCL
    • OCR post-correction
    • evaluation
    • EDBO
    • Nederlab
    • CLARIAH
