TY - JOUR
T1 - You Sound Like an Evil Young Man
T2 - A Distributional Semantic Analysis of Systematic Form-meaning Associations for Polarity, Gender, and Age in Fictional Characters' Names
AU - Joosse, Aäron Yme
AU - Kuşcu, Gökçe
AU - Cassani, Giovanni
PY - 2024/2/4
Y1 - 2024/2/4
N2 - We detail a successful attempt at modeling associations about the age, gender, and polarity of fictional characters based on their names alone. We started by collecting ratings through an online survey on a sample of annotated names from young adult, children's, and fan fiction stories. We collected ratings over three semantic differentials (gender: male - female; age: old - young; polarity: evil - good) using a slider bar. First, we show that participants tend to agree with authors: names judged to better suit female/young/evil characters tend to be assigned to female/young/evil characters in the original stories. We then show, in a series of computational studies, that we can predict participants' ratings on the three attributes using a distributional semantic model which derives representations for both lexical and sub-lexical patterns. This attempt was successful for all names, including made-up ones, with both a supervised and an unsupervised approach. The prediction supported by distributed representations is much better than that afforded by symbolic features such as letters and phonological features, even when accounting for the complexity of the feature spaces. Our results show that people interpret both known and novel names by relying on lexical and sub-lexical patterns, which suggests the availability of systematic form-meaning mappings in everyday language use. This further lends credit to the hypothesis that language-internal statistics can support systematic form-meaning associations which apply to both known and novel lexical items.
AB - We detail a successful attempt at modeling associations about the age, gender, and polarity of fictional characters based on their names alone. We started by collecting ratings through an online survey on a sample of annotated names from young adult, children's, and fan fiction stories. We collected ratings over three semantic differentials (gender: male - female; age: old - young; polarity: evil - good) using a slider bar. First, we show that participants tend to agree with authors: names judged to better suit female/young/evil characters tend to be assigned to female/young/evil characters in the original stories. We then show, in a series of computational studies, that we can predict participants' ratings on the three attributes using a distributional semantic model which derives representations for both lexical and sub-lexical patterns. This attempt was successful for all names, including made-up ones, with both a supervised and an unsupervised approach. The prediction supported by distributed representations is much better than that afforded by symbolic features such as letters and phonological features, even when accounting for the complexity of the feature spaces. Our results show that people interpret both known and novel names by relying on lexical and sub-lexical patterns, which suggests the availability of systematic form-meaning mappings in everyday language use. This further lends credit to the hypothesis that language-internal statistics can support systematic form-meaning associations which apply to both known and novel lexical items.
KW - Form-meaning mappings
KW - Arbitrariness of the linguistic sign
KW - Computational Psycholinguistics
KW - Digital Humanities
KW - Distributional semantics
KW - Names
U2 - 10.31234/osf.io/wdsur
DO - 10.31234/osf.io/wdsur
M3 - Article
SN - 0278-7393
JO - Journal of Experimental Psychology: Learning, Memory, and Cognition
JF - Journal of Experimental Psychology: Learning, Memory, and Cognition
ER -