Evaluating the usage of Text-To-Speech in K12 education

Laduona Dai, Veronika Kritskaia, Evelien van der Velden, Merel M. Jung, Marie Postma, Max M. Louwerse

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Scientific > peer-review

Abstract

With increased interest in the use of virtual avatars for educational purposes, there is a growing need for high-quality text-to-speech (TTS) solutions. However, the effects of using synthesized speech in educational applications for younger listeners are still unclear, as past findings have been inconsistent and most of them were obtained in a lab setting with adult assessors. In addition, it is unclear how much training material is needed for high-quality speech synthesis. Particularly for low-resource languages, the assumption that good-quality synthesized speech requires substantial amounts of vocal recordings for training may be hindering the development of TTS-based solutions. In this study, we created four Dutch TTS models from different amounts of training material and evaluated the models in terms of voice perception and recall with K12 students in a classroom environment. Results showed that while the original human voice outperformed the synthesized voices in terms of listening experience and knowledge test score, more hours of training material did not necessarily result in better outcomes, suggesting that 10-15 hours of speech material might be sufficient for training a Dutch TTS model. A weak positive correlation was found between listening experience and knowledge test performance, with low listening effort being the most important factor. This outcome suggests that comprehensibility is likely the most important TTS feature for educational applications.
Original language: English
Title of host publication: ICEEL '22: Proceedings of the 2022 6th International Conference on Education and E-Learning
Publisher: Association for Computing Machinery (ACM)
Pages: 182-188
Number of pages: 7
ISBN (Print): 978-1-4503-9842-8
Publication status: Published - 21 Nov 2022
Event
ICEEL 2022: 2022 6th International Conference on Education and E-Learning
- Yamanashi, Japan
Duration: 21 Nov 2022 to 23 Nov 2022

Conference

Conference: ICEEL 2022: 2022 6th International Conference on Education and E-Learning
Country/Territory: Japan
City: Yamanashi
Period: 21/11/22 to 23/11/22

Keywords

  • Text-to-speech
  • K12 education
