Abstract
Written language contains stylistic cues that can be exploited to automatically infer a variety of potentially sensitive author information. Adversarial stylometry aims to defeat such models by rewriting an author’s text. Our research proposes several components that facilitate deploying these adversarial attacks in the wild, where neither data nor target models are accessible. We introduce a transformer-based extension of a lexical replacement attack and show that it achieves high transferability when trained on a weakly labeled corpus, decreasing target model performance below chance. While not completely inconspicuous, our more successful attacks also prove notably less detectable by humans. Our framework therefore provides a promising direction for future privacy-preserving adversarial attacks.
Original language | English |
---|---|
Title of host publication | Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers |
Place of Publication | Kyiv, Ukraine |
Publisher | Association for Computational Linguistics |
Pages | 2388-2402 |
Number of pages | 14 |
DOIs | |
Publication status | Published - 15 Apr 2021 |
Event | The 16th Conference of the European Chapter of the Association for Computational Linguistics, Kyiv, Ukraine; Duration: 19 Apr 2021 → 23 Apr 2021; Conference number: 16; https://2021.eacl.org/ |
Conference
Conference | The 16th Conference of the European Chapter of the Association for Computational Linguistics |
---|---|
Abbreviated title | EACL 2021 |
Country/Territory | Ukraine |
City | Kyiv |
Period | 19/04/21 → 23/04/21 |
Internet address | https://2021.eacl.org/ |