WhiteLab 2.0: A web interface for corpus exploitation

Matje van de Camp, Martin Reynaert, Nelleke Oostdijk

    Research output: Chapter in Book/Report/Conference proceeding › Chapter › Scientific › peer-review

    Abstract

    The OpenSoNaR-CGN project set out to develop WhiteLab 2.0 for the online exploitation of the SoNaR-500 and CGN corpora. Important changes in comparison to the first version of WhiteLab are the addition of audio support and support for multiple corpora. The web interface has been redeveloped and adapted to accommodate these changes. At the backend, WhiteLab 2.0 comes with a new data importer and plugin for Neo4j, while also remaining compatible with BlackLab. Although performance of the new backend is not yet up to par with BlackLab, the investment in new technology that will likely be further developed is expected to make the application more future-proof and a great addition to the set of tools available to the humanities.
    Original language: English
    Title of host publication: CLARIN-NL in the Low Countries
    Editors: Jan Odijk, Arjan van Hessen
    Place of Publication: London
    Publisher: Ubiquity Press, London
    Chapter: 19
    Pages: 231-243
    Number of pages: 12
    ISBN (Electronic): 9781911529255
    ISBN (Print): 9781911529248
    DOIs: https://doi.org/10.5334/bbi.19
    Publication status: Published - 28 Dec 2017

    Keywords

    • Computer Sciences
    • Computers and the Humanities
    • Language and literature
    • Linguistics
    • Online corpora
    • Dutch
    • written language
    • BlackLab

    Cite this

    van de Camp, M., Reynaert, M., & Oostdijk, N. (2017). WhiteLab 2.0: A web interface for corpus exploitation. In J. Odijk, & A. van Hessen (Eds.), CLARIN-NL in the Low Countries (pp. 231-243). Ubiquity Press, London. https://doi.org/10.5334/bbi.19