WhiteLab 2.0

A web interface for corpus exploitation

Matje van de Camp, Martin Reynaert, Nelleke Oostdijk

    Research output: Chapter in Book/Report/Conference proceeding › Chapter › Scientific › peer-review

    Abstract

    The OpenSoNaR-CGN project set out to develop WhiteLab 2.0 for the online exploitation of the SoNaR-500 and CGN corpora. Important changes in comparison to the first version of WhiteLab are the addition of audio support and support for multiple corpora. The web interface has been redeveloped and adapted to accommodate these changes. At the backend, WhiteLab 2.0 comes with a new data importer and plugin for Neo4j, while also remaining compatible with BlackLab. Although performance of the new backend is not yet up to par with BlackLab, the investment in new technology that will likely be further developed is expected to make the application more future-proof and a great addition to the set of tools available to the humanities.
    Original language: English
    Title of host publication: CLARIN-NL in the Low Countries
    Editors: Jan Odijk, Arjan van Hessen
    Place of Publication: London
    Publisher: Ubiquity Press, London
    Chapter: 19
    Pages: 231-243
    Number of pages: 12
    ISBN (Electronic): 9781911529255
    ISBN (Print): 9781911529248
    DOI: https://doi.org/10.5334/bbi.19
    Publication status: Published - 28 Dec 2017

    Keywords

    • Computer Sciences
    • Computers and the Humanities
    • Language and literature
    • Linguistics
    • Online corpora
    • Dutch
    • written language
    • BlackLab

    Cite this

    van de Camp, M., Reynaert, M., & Oostdijk, N. (2017). WhiteLab 2.0: A web interface for corpus exploitation. In J. Odijk, & A. van Hessen (Eds.), CLARIN-NL in the Low Countries (pp. 231-243). London: Ubiquity Press, London. https://doi.org/10.5334/bbi.19