Adversarial Stylometry in the Wild: Transferable Lexical Substitution Attacks on Author Profiling

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

Written language contains stylistic cues that can be exploited to automatically infer a variety of potentially sensitive author information. Adversarial stylometry aims to attack such models by rewriting an author’s text. Our research proposes several components to facilitate deployment of these adversarial attacks in the wild, where neither data nor target models are accessible. We introduce a transformer-based extension of a lexical replacement attack, and show it achieves high transferability when trained on a weakly labeled corpus, decreasing target model performance below chance. While not completely inconspicuous, our more successful attacks also prove notably less detectable by humans. Our framework therefore provides a promising direction for future privacy-preserving adversarial attacks.
Original language: English
Title of host publication: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
Place of Publication: Kyiv, Ukraine
Publisher: Association for Computational Linguistics
Pages: 2388-2402
Number of pages: 14
Publication status: Published - 15 Apr 2021
Event: The 16th Conference of the European Chapter of the Association for Computational Linguistics - Kyiv, Ukraine
Duration: 19 Apr 2021 to 23 Apr 2021
Conference number: 16
https://2021.eacl.org/

Conference

Conference: The 16th Conference of the European Chapter of the Association for Computational Linguistics
Abbreviated title: EACL 2021
Country/Territory: Ukraine
City: Kyiv
Period: 19/04/21 to 23/04/21