SeqXGPT: Sentence-Level AI-Generated Text Detection

Cited by: 0
Authors
Wang, Pengyu
Li, Linyang
Ren, Ke
Jiang, Botian
Zhang, Dong
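Qiu, Xipeng [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract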
Widely applied large language models (LLMs) can generate human-like content, raising concerns about their abuse. It is therefore important to build strong AI-generated text (AIGT) detectors. Existing work considers only document-level AIGT detection. In this paper, we first introduce a sentence-level detection challenge by synthesizing a dataset of documents polished with LLMs, that is, documents containing both sentences written by humans and sentences modified by LLMs. We then propose Sequence X (Check) GPT (SeqXGPT), a novel method that uses log probability lists from white-box LLMs as features for sentence-level AIGT detection. These features are composed like waves in speech processing and cannot be studied by LLMs, so we build SeqXGPT on convolution and self-attention networks. We test it on both sentence-level and document-level detection challenges. Experimental results show that previous methods struggle with sentence-level AIGT detection, while our method not only significantly surpasses baseline methods on both challenges but also exhibits strong generalization capabilities.
Pages: 1144-1156
Page count: 13
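
The abstract describes treating a per-token log probability list as a wave-like signal and processing it with convolution. The sketch below illustrates only that idea and is not the authors' implementation. It makes three assumptions: an add-one-smoothed unigram model stands in for a white-box LLM, a fixed moving-average kernel stands in for learned convolution filters, and the self-attention stage is omitted. The names `token_log_probs`, `conv1d`, and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def token_log_probs(tokens, counts, total, vocab_size):
    """Log probability of each token under an add-one-smoothed unigram model.

    Assumption: the unigram model is only a stand-in for a white-box LLM,
    which would instead score each token conditioned on its prefix.
    """
    return [math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens]


def conv1d(signal, kernel):
    """'Valid' 1-D convolution (cross-correlation, as used in neural networks)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Hypothetical reference corpus used to fit the stand-in model.
corpus = "the cat sat on the mat the dog sat on the rug".split()
counts = Counter(corpus)
vocab_size = len(counts) + 1  # one extra slot for unseen tokens

sentence = "the cat sat on the sofa".split()
log_probs = token_log_probs(sentence, counts, len(corpus), vocab_size)

# Fixed moving-average kernel; a trained detector would learn its filters.
smoothed = conv1d(log_probs, [1 / 3, 1 / 3, 1 / 3])

for token, lp in zip(sentence, log_probs):
    print(f"{token:>5s} {lp:8.3f}")
```

In this toy setting the unseen token `sofa` receives the lowest log probability, which shows how a per-token signal can expose local irregularities that a sentence-level classifier could then pick up.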