Predicting author profiles from online abuse directed at public figures

I. van der Vegt, B. Kleinberg, P. Gill

Research output: Contribution to journal > Article > Scientific > peer-review

Abstract

The problem of online threats and abuse directed at public figures could be mitigated with a computational approach, where sources of abusive language are better understood or identified through author profiling. However, abusive language constitutes a specific domain of language for which it has not yet been tested whether differences emerge based on the personality, age, or gender of text authors. This study presents a unique data set of 789 abusive messages directed at politicians. It examines statistical relationships between the demographics of text authors and (abusive) language, then uses a machine learning approach to predict personality, age, and gender based on language in the texts. Results showed that (a) personality traits could be determined within 10% of their actual value, (b) age was determined with an error margin of 10 years, and (c) gender was classified correctly in 70% of the cases. Even though we found statistically significant relationships between language use and demographics, prediction performance was poor when compared to previous research on author profiling. Therefore, we suggest that further research is needed before author profiling systems can be of significant value within the context of abusive language and threat assessment.
Original language: English
Pages (from-to): 17-32
Journal: Journal of Threat Assessment and Management
Volume: 9
Issue number: 1
Publication status: Published - 1 Jan 2022
