Signaling sarcasm

From hyperbole to hashtag

Florian Kunneman, Christine Liebrecht, Margot van Mulken, Antal van den Bosch*

*Corresponding author for this work

Research output: Contribution to journal > Article > Scientific > peer-review

Abstract

To avoid a sarcastic message being understood in its unintended literal meaning, in micro-texts such as messages on Twitter.com, sarcasm is often explicitly marked with a hashtag such as '#sarcasm'. We collected a training corpus of about 406 thousand Dutch tweets with hashtag synonyms denoting sarcasm. Assuming that the human labeling is correct (annotation of a sample indicates that about 90% of these tweets are indeed sarcastic), we train a machine learning classifier on the harvested examples, and apply it to a sample of a day's stream of 2.25 million Dutch tweets. Of the 353 explicitly marked tweets on this day, we detect 309 (87%) with the hashtag removed. We annotate the top of the ranked list of tweets most likely to be sarcastic that do not have the explicit hashtag. 35% of the top-250 ranked tweets are indeed sarcastic. Analysis indicates that the use of hashtags reduces the further use of linguistic markers for signaling sarcasm, such as exclamations and intensifiers. We hypothesize that explicit markers such as hashtags are the digital extralinguistic equivalent of non-verbal expressions that people employ in live interaction when conveying sarcasm. Checking the consistency of our finding in a language from another language family, we observe that in French the hashtag '#sarcasme' has a similar polarity switching function, albeit to a lesser extent. © 2014 Elsevier Ltd. All rights reserved.
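The data preparation step described in the abstract (harvest tweets carrying a sarcasm hashtag, then remove the hashtag before classification) can be illustrated with a minimal Python sketch. This is not the authors' code: the marker list contains only the two hashtags named in the abstract, the function names `is_marked`, `strip_marker` and `recall` are hypothetical, and the classifier itself is left out because the abstract does not specify it.

```python
import re

# Assumption: only the two hashtags named in the abstract are listed here.
# The full set of Dutch hashtag synonyms used for the corpus is not given.
SARCASM_TAGS = ("#sarcasm", "#sarcasme")

# Match a listed hashtag as a whole token, case-insensitively.
_TAG_RE = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(t) for t in SARCASM_TAGS) + r")\b",
    re.IGNORECASE,
)


def is_marked(tweet: str) -> bool:
    """Return True if the tweet carries an explicit sarcasm hashtag."""
    return _TAG_RE.search(tweet) is not None


def strip_marker(tweet: str) -> str:
    """Remove the sarcasm hashtag so a classifier cannot rely on it."""
    without_tag = _TAG_RE.sub(" ", tweet)
    return " ".join(without_tag.split())  # normalize leftover whitespace


def recall(detected: int, total_marked: int) -> float:
    """Fraction of explicitly marked tweets that were detected."""
    return detected / total_marked


if __name__ == "__main__":
    print(strip_marker("Great, another Monday #sarcasm"))
    print(round(recall(309, 353), 3))
```

With the figures from the abstract, `recall(309, 353)` evaluates to roughly 0.875, which the abstract reports as 87%.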

Original language: English
Pages (from-to): 500-509
Number of pages: 10
Journal: Information Processing & Management
Volume: 51
Issue number: 4
DOI: 10.1016/j.ipm.2014.07.006
Publication status: Published - Jul 2015
Externally published: Yes

Keywords

  • Social media
  • Automatic sentiment analysis
  • Opinion mining
  • Sarcasm
  • Verbal irony
  • Figurative language
  • Markers
  • Voice
  • Tone

Cite this

Kunneman, Florian; Liebrecht, Christine; van Mulken, Margot; van den Bosch, Antal. Signaling sarcasm: From hyperbole to hashtag. In: Information Processing & Management. 2015; Vol. 51, No. 4. pp. 500-509.
@article{d49ffc87d2a8413cb07c119639980bbf,
title = "Signaling sarcasm: From hyperbole to hashtag",
keywords = "Social media, Automatic sentiment analysis, Opinion mining, Sarcasm, Verbal irony, Figurative language, Markers, Voice, Tone",
author = "Florian Kunneman and Christine Liebrecht and {van Mulken}, Margot and {van den Bosch}, Antal",
year = "2015",
month = "7",
doi = "10.1016/j.ipm.2014.07.006",
language = "English",
volume = "51",
pages = "500--509",
journal = "Information Processing \& Management",
issn = "0306-4573",
publisher = "ELSEVIER SCI LTD",
number = "4",

}

