Synthetic Disinformation Attacks on Automated Fact Verification Systems

Cited by: 0
Authors
Du, Yibing [1 ]
Bosselut, Antoine [2 ]
Manning, Christopher D. [1 ]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Automated fact-checking is a needed technology to curtail the spread of online misinformation. One current framework for such solutions proposes to verify claims by retrieving supporting or refuting evidence from related textual sources. However, the realistic use cases for fact-checkers will require verifying claims against evidence sources that could be affected by the same misinformation. Furthermore, the development of modern NLP tools that can produce coherent, fabricated content would allow malicious actors to systematically generate adversarial disinformation for fact-checkers. In this work, we explore the sensitivity of automated fact-checkers to synthetic adversarial evidence in two simulated settings: ADVERSARIAL ADDITION, where we fabricate documents and add them to the evidence repository available to the fact-checking system, and ADVERSARIAL MODIFICATION, where existing evidence source documents in the repository are automatically altered. Our study across multiple models on three benchmarks demonstrates that these systems suffer significant performance drops against these attacks. Finally, we discuss the growing threat of modern NLG systems as generators of disinformation in the context of the challenges they pose to automated fact-checkers.
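The abstract describes two attack settings against the evidence repository that a fact-checker retrieves from. As a rough illustration only (a minimal sketch, not the authors' implementation; the claim, the toy corpus, and the word-overlap retriever are assumptions made for this example), the snippet below shows how ADVERSARIAL ADDITION and ADVERSARIAL MODIFICATION can change which evidence a naive retrieval step surfaces:

    # Toy illustration of the two simulated attack settings from the abstract.
    # Everything here (corpus, claim, overlap-based retriever) is a made-up
    # example, not the paper's data or code.

    def retrieve(claim, corpus, k=1):
        """Rank evidence documents by naive word overlap with the claim."""
        claim_tokens = set(claim.lower().split())
        return sorted(
            corpus,
            key=lambda doc: len(claim_tokens & set(doc.lower().split())),
            reverse=True,
        )[:k]

    claim = "The Eiffel Tower is located in Paris"
    corpus = [
        "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris",
        "The Louvre is the largest art museum in the world",
    ]
    # Clean setting: the genuine supporting document is retrieved.
    print(retrieve(claim, corpus))

    # ADVERSARIAL ADDITION: a fabricated, refuting document is appended to the
    # evidence repository; because it closely mirrors the claim's wording, it
    # outranks the genuine document.
    poisoned = corpus + ["The Eiffel Tower is not located in Paris it was moved to Berlin"]
    print(retrieve(claim, poisoned))

    # ADVERSARIAL MODIFICATION: an existing evidence document is altered in
    # place, so the top retrieved evidence now contradicts the claim.
    modified = [corpus[0].replace("in Paris", "in Rome")] + corpus[1:]
    print(retrieve(claim, modified))

In the paper itself, the fabricated and altered documents are produced automatically by modern natural language generation systems rather than hand-written as above.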
Pages: 10581-10589
Number of pages: 9
Related Papers
(50 records in total)
  • [1] Explainability of Automated Fact Verification Systems: A Comprehensive Review
    Vallayil, Manju
    Nand, Parma
    Yan, Wei Qi
    Allende-Cid, Hector
    APPLIED SCIENCES-BASEL, 2023, 13 (23):
  • [2] Evaluating adversarial attacks against multiple fact verification systems
    Thorne, James
    Vlachos, Andreas
    Christodoulopoulos, Christos
    Mittal, Arpit
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 2944 - 2953
  • [3] Fact-Saboteurs: A Taxonomy of Evidence Manipulation Attacks against Fact-Verification Systems
    Abdelnabi, Sahar
    Fritz, Mario
    PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023, : 6719 - 6736
  • [4] Hierarchical Evidence Set Modeling for Automated Fact Extraction and Verification
    Subramanian, Shyam
    Lee, Kyumin
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 7798 - 7809
  • [5] Development of Disinformation Verification System with Criminal Record Based on Previous Systems
    Chen, Mu-Chuan
Lin, I.-Long
    Yang, Hung-Cheng
    SENSORS AND MATERIALS, 2024, 36 (06) : 2265 - 2274
  • [6] Preventing Replay Attacks on Speaker Verification Systems
    Villalba, Jesus
    Lleida, Eduardo
    2011 IEEE INTERNATIONAL CARNAHAN CONFERENCE ON SECURITY TECHNOLOGY (ICCST), 2011,
  • [7] Evaluation of direct attacks to fingerprint verification systems
    Galbally, J.
    Fierrez, J.
    Alonso-Fernandez, F.
    Martinez-Diaz, M.
    TELECOMMUNICATION SYSTEMS, 2011, 47 (3-4) : 243 - 254
  • [8] Traffic networks are vulnerable to disinformation attacks
    Waniek, Marcin
    Raman, Gururaghav
    AlShebli, Bedoor
    Peng, Jimmy Chih-Hsien
    Rahwan, Talal
    SCIENTIFIC REPORTS, 2021, 11 (01)