A Benchmark Dataset to Distinguish Human-Written and Machine-Generated Scientific Papers

Cited by: 8

Authors:

Abdalla, Mohamed Hesham Ibrahim [1]

Malberg, Simon [1]

Dementieva, Daryna [1]

Mosca, Edoardo [1]

Groh, Georg [1]

Affiliations:

[1] Tech Univ Munich, Sch Computat Informat & Technol, D-80333 Munich, Germany

Source:

INFORMATION | 2023 / Vol. 14 / No. 10

Keywords:

text generation; large language models; machine-generated text detection

DOI:

10.3390/info14100522

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

As generative NLP can now produce content nearly indistinguishable from human writing, it is becoming difficult to identify genuine research contributions in academic writing and scientific publications. Moreover, information in machine-generated text can be factually wrong or even entirely fabricated. In this work, we introduce a novel benchmark dataset containing human-written and machine-generated scientific papers from SCIgen, GPT-2, GPT-3, ChatGPT, and Galactica, as well as papers co-created by humans and ChatGPT. We also experiment with several types of classifiers, both linguistic-based and transformer-based, for detecting the authorship of scientific text. A strong focus is put on generalization capabilities and explainability to highlight the strengths and weaknesses of these detectors. Our work makes an important step towards creating more robust methods for distinguishing between human-written and machine-generated scientific papers, ultimately ensuring the integrity of scientific literature.

Pages: 33
