SeqXGPT: Sentence-Level AI-Generated Text Detection

Cited by: 0
Authors
Wang, Pengyu
Li, Linyang
Ren, Ke
Jiang, Botian
Zhang, Dong
Qiu, Xipeng [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
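Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract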
Widely applied large language models (LLMs) can generate human-like content, raising concerns about the abuse of LLMs. Therefore, it is important to build strong AI-generated text (AIGT) detectors. Current works only consider document-level AIGT detection. In this paper, we therefore first introduce a sentence-level detection challenge by synthesizing a dataset of documents that are polished with LLMs, that is, documents containing both sentences written by humans and sentences modified by LLMs. We then propose Sequence X (Check) GPT (SeqXGPT), a novel method that uses log probability lists from white-box LLMs as features for sentence-level AIGT detection. These features are composed like waves in speech processing and cannot be studied by LLMs. Therefore, we build SeqXGPT on convolution and self-attention networks. We test it in both sentence-level and document-level detection challenges. Experimental results show that previous methods struggle with sentence-level AIGT detection, while our method not only significantly surpasses baseline methods in both sentence-level and document-level detection challenges but also exhibits strong generalization capabilities.
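The abstract does not say how the log probability lists are obtained. As a minimal sketch, assuming a white-box model exposes one logit vector per token position, the log probability of each observed token is the log-softmax of that position's logits at the observed token id. The function names, the toy logits, and the three-token vocabulary below are hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def token_log_probs(step_logits, token_ids):
    """Return the log probability assigned to each observed token.

    step_logits[i] is the (hypothetical) model's logit vector for position i;
    token_ids[i] is the token id actually observed at that position.
    """
    if len(step_logits) != len(token_ids):
        raise ValueError("need one logit vector per observed token")
    return [
        log_softmax(logits)[tok]
        for logits, tok in zip(step_logits, token_ids)
    ]


# Toy input (assumption): a vocabulary of 3 tokens and a 2-token sequence.
toy_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
toy_tokens = [0, 2]
features = token_log_probs(toy_logits, toy_tokens)
```

A list like `features`, one value per token, is the kind of sequence that a convolution and self-attention model could consume. How SeqXGPT aligns tokens across models and maps them to sentence labels is not described in this record.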
Pages: 1144-1156
Page count: 13