Abstract
One of the major challenges hampering the development of language technology that targets sign languages is the extremely limited availability of good-quality data geared towards machine learning and deep learning approaches. In this paper we introduce the NGT-Dutch Hotel Review Corpus (NGT-HoReCo), which addresses this issue by providing multi-modal parallel data in English, Dutch and Sign Language of the Netherlands (NGT). The corpus contains 297 hotel reviews in written English (21,464 words), translated into written Dutch (22,274 words) and into NGT videos (230.54 minutes). It is publicly available through the ELG and the CLARIN platforms.
Original language | English |
---|---|
Title of host publication | Proceedings of the Second International Workshop on Automatic Translation for Signed and Spoken Languages |
Publisher | ACL Anthology |
Pages | 39-42 |
Number of pages | 4 |
ISBN (Print) | 978-989-33-0589-8 |
Publication status | Accepted/In press - 2023 |
Event | 2nd International Workshop on Automatic Translation for Signed and Spoken Languages - Tampere, Finland. Duration: 15 Jun 2023 → 15 Jun 2023. https://sites.google.com/tilburguniversity.edu/at4ssl2023/ |
Workshop
Workshop | 2nd International Workshop on Automatic Translation for Signed and Spoken Languages |
---|---|
Country/Territory | Finland |
City | Tampere |
Period | 15/06/23 → 15/06/23 |
Internet address |
Keywords
- Sign languages
- Machine learning
- NGT-Dutch Hotel Review Corpus
- Dutch language