MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

Cited by: 0
Authors
Macko, Dominik [1]
Moro, Robert [1]
Uchendu, Adaku [2, 3]
Lucas, Jason Samuel [3]
Yamashita, Michiharu [3]
Pikuliak, Matus [1]
Srba, Ivan [1]
Le, Thai [4]
Lee, Dongwon [3]
Simko, Jakub [1]
Bielikova, Maria [1]
Affiliations
[1] Kempelen Inst Intelligent Technol, Bratislava, Slovakia
[2] MIT, Lincoln Lab, Lexington, MA USA
[3] Penn State Univ, University Pk, PA 16802 USA
[4] Univ Mississippi, University, MS USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
There is a lack of research into the capabilities of recent LLMs to generate convincing text in languages other than English, and into the performance of machine-generated text detectors in multilingual settings. This is also reflected in the available benchmarks, which lack authentic texts in languages other than English and predominantly cover older generators. To fill this gap, we introduce MULTITuDE, a novel benchmarking dataset for multilingual machine-generated text detection comprising 74,081 authentic and machine-generated texts in 11 languages (ar, ca, cs, de, en, es, nl, pt, ru, uk, and zh), with the machine-generated texts produced by 8 multilingual LLMs. Using this benchmark, we compare the performance of zero-shot (statistical and black-box) and fine-tuned detectors. Considering the multilinguality, we evaluate 1) how these detectors generalize to unseen languages (linguistically similar as well as dissimilar) and unseen LLMs and 2) whether the detectors improve their performance when trained on multiple languages.
Pages: 9960-9987
Page count: 28