Abstract
This paper describes the collection of a Dutch parallel corpus and its use in a sentence compression tool intended to automatically generate subtitles for the deaf from television program transcripts. First, we describe how the corpus was collected, together with the manipulations and transformations performed on it. Second, we describe a hybrid sentence compression tool and its evaluation.
Original language | English |
---|---|
Title of host publication | Proceedings of the 4th International Language Resources and Evaluation Conference (LREC 2004) |
Place of Publication | Lisbon |
Publisher | Unknown Publisher |
Pages | 231-234 |
Number of pages | 4 |
Publication status | Published - 2004 |