TY - JOUR
T1 - PoeTree: Poetry Treebanks in Czech, English, French, German, Hungarian, Italian, Portuguese, Russian, Slovenian and Spanish
AU - Plecháč, Petr
AU - Cinková, Silvie
AU - Kolár, Robert
AU - Šeļa, Artjoms
AU - De Sisto, Mirella
AU - Nugues, Lara
AU - Haider, Thomas
AU - Kočnik, Neža
N1 - Publisher Copyright: © 2024 Petr Plecháč et al.
PY - 2024/9
Y1 - 2024/9
N2 - This article presents a set of standardised corpora of poetry comprising over 330,000 poems in ten languages (Czech, English, French, German, Hungarian, Italian, Portuguese, Russian, Slovenian, and Spanish). Each corpus has been deduplicated, enriched with Universal Dependencies, provided with additional metadata, and converted into a unified JSON structure.
AB - This article presents a set of standardised corpora of poetry comprising over 330,000 poems in ten languages (Czech, English, French, German, Hungarian, Italian, Portuguese, Russian, Slovenian, and Spanish). Each corpus has been deduplicated, enriched with Universal Dependencies, provided with additional metadata, and converted into a unified JSON structure.
KW - poetry
KW - computational poetry
KW - corpus linguistics
KW - digital humanities
UR - http://www.scopus.com/inward/record.url?scp=85205121011&partnerID=8YFLogxK
U2 - 10.1163/24523666-bja10044
DO - 10.1163/24523666-bja10044
M3 - Article
SN - 2452-3666
JO - Research Data Journal for the Humanities and Social Sciences
JF - Research Data Journal for the Humanities and Social Sciences
ER -