OCR Post-Correction Evaluation of Early Dutch Books Online - Revisited

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

    3 Citations (Scopus)
    165 Downloads (Pure)

    Abstract

    We present further work on the evaluation of the fully automatic post-correction of Early Dutch Books Online, a collection of 10,333 18th-century books. In prior work we evaluated the new implementation of Text-Induced Corpus Clean-up (TICCL) on the basis of a single-book Gold Standard derived from this collection. In the current paper we revisit the same collection on the basis of a sizeable 1,020-item random sample of OCR post-corrected strings from the full collection. Both evaluations have their own stories to tell and lessons to teach.
    Original language: English
    Title of host publication: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
    Editors: Calzolari
    Publisher: ELRA
    Pages: 967-974
    Number of pages: 8
    Publication status: Published - 2016
    Event: International Conference on Language Resources and Evaluation 2016: 10th edition - Grand Hotel Bernardin Conference Center, Portoroz, Slovenia
    Duration: 23 May 2016 to 28 May 2016
    Conference number: 10
    http://lrec2016.lrec-conf.org/en/

    Conference

    Conference: International Conference on Language Resources and Evaluation 2016
    Abbreviated title: LREC 2016
    Country: Slovenia
    City: Portoroz
    Period: 23/05/16 to 28/05/16

    Keywords

    • TICCL
    • OCR post-correction
    • evaluation
    • EDBO
    • Nederlab
    • CLARIAH

  • Research Output

    Synergy of Nederlab and @PhilosTEI: diachronic and multilingual Text-Induced Corpus Clean-up

    Reynaert, M. W. C., 1 May 2014, Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14). Calzolari, N., et al. (eds.). Reykjavik, Iceland: European Language Resources Association (ELRA), pp. 1224-1230 (7 pages).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

    Cite this

    Reynaert, M. (2016). OCR Post-Correction Evaluation of Early Dutch Books Online - Revisited. In Calzolari (Ed.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp. 967-974). ELRA.