Identification of ChatGPT-Generated Abstracts Within Shoulder and Elbow Surgery Poses a Challenge for Reviewers

Citations: 0
Authors
Stadler, Ryan D. [1 ]
Sudah, Suleiman Y. [2 ]
Moverman, Michael A. [3 ]
Denard, Patrick J. [4 ]
Duralde, Xavier A. [5 ]
Garrigues, Grant E. [6 ]
Klifto, Christopher S. [7 ]
Levy, Jonathan C. [8 ]
Namdari, Surena [9 ]
Sanchez-Sotelo, Joaquin [10 ]
Menendez, Mariano E. [11 ]
Affiliations
[1] Rutgers Robert Wood Johnson Med Sch, 125 Paterson St, New Brunswick, NJ 08901 USA
[2] Monmouth Med Ctr, Dept Orthopaed Surg, Monmouth Jct, NJ USA
[3] Univ Utah, Sch Med, Dept Orthopaed, Salt Lake City, UT USA
[4] Oregon Shoulder Inst, Medford, OR USA
[5] Peachtree Orthoped, Atlanta, GA USA
[6] Rush Univ, Med Ctr, Midwest Orthopaeat, Chicago, IL USA
[7] Duke Univ, Sch Med, Dept Orthopaed Surg, Durham, NC USA
[8] Paley Orthoped & Spine Inst, Levy Shoulder Ctr, Boca Raton, FL USA
[9] Thomas Jefferson Univ Hosp, Rothman Orthopaed Inst, Philadelphia, PA USA
[10] Mayo Clin, Dept Orthoped Surg, Rochester, MN USA
[11] Univ Calif Davis, Dept Orthopaed, Sacramento, CA USA
DOI
10.1016/j.arthro.2024.06.045
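Chinese Library Classification number
R826.8 [Plastic surgery]; R782.2 [Oral and maxillofacial plastic surgery]; R726.2 [Pediatric plastic surgery]; R62 [Plastic surgery (reconstructive surgery)]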
Abstract
Purpose: To evaluate the extent to which experienced reviewers can accurately discern between artificial intelligence (AI)-generated and original research abstracts published in the field of shoulder and elbow surgery and compare this with the performance of an AI detection tool.
Methods: Twenty-five shoulder- and elbow-related articles published in high-impact journals in 2023 were randomly selected. ChatGPT was prompted with only the abstract title to create an AI-generated version of each abstract. The resulting 50 abstracts were randomly distributed to and evaluated by 8 blinded peer reviewers with at least 5 years of experience. Reviewers were tasked with distinguishing between original and AI-generated text. A Likert scale assessed reviewer confidence for each interpretation, and the primary reason guiding assessment of generated text was collected. AI output detector (0%-100%) and plagiarism (0%-100%) scores were evaluated using GPTZero.
Results: Reviewers correctly identified 62% of AI-generated abstracts and misclassified 38% of original abstracts as being AI generated. GPTZero reported a significantly higher probability of AI output among generated abstracts (median, 56%; interquartile range [IQR], 51%-77%) compared with original abstracts (median, 10%; IQR, 4%-37%; P < .01). Generated abstracts scored significantly lower on the plagiarism detector (median, 7%; IQR, 5%-14%) relative to original abstracts (median, 82%; IQR, 72%-92%; P < .01). Correct identification of AI-generated abstracts was predominantly attributed to the presence of unrealistic data/values. The primary reason for misidentifying original abstracts as AI was attributed to writing style.
Conclusions: Experienced reviewers faced difficulties in distinguishing between human and AI-generated research content within shoulder and elbow surgery. The presence of unrealistic data facilitated correct identification of AI abstracts, whereas misidentification of original abstracts was often ascribed to writing style.
Clinical Relevance: With rapidly increasing AI advancements, it is paramount that ethical standards of scientific reporting are upheld. It is therefore helpful to understand the ability of reviewers to identify AI-generated content.
Pages: 11
Related papers
21 in total
  • [1] A Study on Distinguishing ChatGPT-Generated and Human-Written Orthopaedic Abstracts by Reviewers: Decoding the Discrepancies
    Makiev, Konstantinos G.
    Asimakidou, Maria
    Vasios, Ioannis S.
    Keskinis, Anthimos
    Petkidis, Georgios
    Tilkeridis, Konstantinos
    Ververidis, Athanasios
    Iliopoulos, Efthymios
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2023, 15 (11)
  • [2] Human vs machine: identifying ChatGPT-generated abstracts in Gynecology and Urogynecology
    Pan, Evelyn T.
    Florian-Rodriguez, Maria
    AMERICAN JOURNAL OF OBSTETRICS AND GYNECOLOGY, 2024, 231 (02) : 276e1 - 276e10
  • [3] Assessing Variability in the Readability of ChatGPT-Generated Education Material in Surgery
    Abdullah, Abiha
    Maze, Karleigh J.
    Brock, Bethany A.
    Smith, Burkely
    Chu, Daniel I.
    Jones, Bayley
    Wood, Lauren
    Giri, Oviya A.
    Rubyan, Michael
    Morris, Melanie
    JOURNAL OF THE AMERICAN COLLEGE OF SURGEONS, 2023, 237 (05) : S97 - S97
  • [4] Evaluating human ability to distinguish between ChatGPT-generated and original scientific abstracts
    Nabata, Kylie J.
    Alshehri, Yasir
    Mashat, Abdullah
    Wiseman, Sam M.
    UPDATES IN SURGERY, 2025,
  • [5] Man vs. machine: identifying chatGPT-generated abstracts in gynecology and urogynecology
    Pan, E.
    Florian-Rodriguez, M.
    AMERICAN JOURNAL OF OBSTETRICS AND GYNECOLOGY, 2024, 230 (04) : S1151 - S1152
  • [6] Comparing scientific abstracts generated by ChatGPT to real abstracts with detectors and blinded human reviewers
    Catherine A. Gao
    Frederick M. Howard
    Nikolay S. Markov
    Emma C. Dyer
    Siddhi Ramesh
    Yuan Luo
    Alexander T. Pearson
    npj Digital Medicine, 6
  • [7] Comparing scientific abstracts generated by ChatGPT to real abstracts with detectors and blinded human reviewers
    Gao, Catherine A.
    Howard, Frederick M.
    Markov, Nikolay S.
    Dyer, Emma C.
    Ramesh, Siddhi
    Luo, Yuan
    Pearson, Alexander T.
    NPJ DIGITAL MEDICINE, 2023, 6 (01)
  • [8] Humans-written versus ChatGPT-generated abstracts: beyond the discussion on "who wrote it"
    Matsubara, Shigeki
    UPDATES IN SURGERY, 2025,
  • [9] Evaluating the Evolution of ChatGPT as an Information Resource in Shoulder and Elbow Surgery
    Nieves-Lopez, Benjamin
    Bechtle, Alexandra R.
    Traverse, Jennifer
    Klifto, Christopher
    Schoch, Bradley S.
    Aziz, Keith T.
    ORTHOPEDICS, 2025, 48 (02) : e69 - e74
  • [10] Accuracy of ChatGPT-Generated Information on Head and Neck and Oromaxillofacial Surgery: A Multicenter Collaborative Analysis
    Vaira, Luigi Angelo
    Lechien, Jerome R.
    Abbate, Vincenzo
    Allevi, Fabiana
    Audino, Giovanni
    Beltramini, Giada Anna
    Bergonzani, Michela
    Bolzoni, Alessandro
    Committeri, Umberto
    Crimi, Salvatore
    Gabriele, Guido
    Lonardi, Fabio
    Maglitto, Fabio
    Petrocelli, Marzia
    Pucci, Resi
    Saponaro, Gianmarco
    Tel, Alessandro
    Vellone, Valentino
    Chiesa-Estomba, Carlos Miguel
    Boscolo-Rizzo, Paolo
    Salzano, Giovanni
    De Riu, Giacomo
    OTOLARYNGOLOGY-HEAD AND NECK SURGERY, 2024, 170 (06) : 1492 - 1503