Examining embedded lies through computational text analysis

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Verbal deception detection research relies on narratives and commonly assumes statements as truthful or deceptive. A more realistic perspective acknowledges that the veracity of statements exists on a continuum, with truthful and deceptive parts being embedded within the same statement. However, research on embedded lies has been lagging behind. We collected a novel dataset of 2,088 truthful and deceptive statements with annotated embedded lies. Using a counterbalanced within-subjects design, participants provided two versions of an autobiographical event. One was described truthfully, and the other one deceptively by including embedded lies. Participants later highlighted those embedded lies and judged them on lie centrality, deceptiveness, and source. We show that a fine-tuned language model (Llama-3-8B) can classify truthful statements and those containing embedded lies significantly above the chance level (64% accuracy). Individual differences, linguistic properties, and explainability analysis suggest that the challenge of moving the dial towards embedded lies stems from their resemblance to truthful statements. Typical deceptive statements consisted of 2/3 truthful information and 1/3 embedded lies, largely derived from past personal experiences and with minimal linguistic differences from their truthful counterparts. We present this dataset as a novel resource to address this challenge and foster research on embedded lies in verbal deception detection.
Original language: English
Article number: 26482
Number of pages: 16
Journal: Scientific Reports
Volume: 15
Issue number: 1
Publication status: Published - 1 Jul 2025

Keywords

  • Deception
  • Embedded lies
  • Lying profile
  • Natural language processing
  • Individual differences
