The Use of Generative AI for Scientific Literature Searches for Systematic Reviews: ChatGPT and Microsoft Bing AI Performance Evaluation

Cited by: 0
Authors
Gwon, Yong Nam [1]
Kim, Jae Heon [1]
Chung, Hyun Soo [2]
Jung, Eun Jee [2]
Chun, Joey [1,3]
Lee, Serin [1,4]
Shim, Sung Ryul [5,6]
Affiliations
[1] Soonchunhyang Univ, Seoul Hosp, Coll Med, Dept Urol, Seoul, South Korea
[2] Soonchunhyang Univ, Coll Med, Cheonan, South Korea
[3] Cranbrook Kingswood Upper Sch, Bloomfield Hills, MI USA
[4] Case Western Reserve Univ, Dept Biochem, Cleveland, OH USA
[5] Konyang Univ, Coll Med, Dept Biomed Informat, 158 Gwanjeodong Ro, Daejeon 35365, South Korea
[6] Konyang Univ Hosp, Konyang Med Data Res Grp, KYMERA, Daejeon, South Korea
Keywords
artificial intelligence; search engine; systematic review; evidence-based medicine; ChatGPT; language model; education; tool; clinical decision support system; decision support; support; treatment
DOI
10.2024/1/e51187
Chinese Library Classification number
R-058
Discipline classification code
Abstract
Background: A large language model is a type of artificial intelligence (AI) model that opens up great possibilities for health care practice, research, and education, although scholars have emphasized the need to proactively address the issue of unvalidated and inaccurate information regarding its use. One of the best-known large language models is ChatGPT (OpenAI). It is believed to be of great help to medical research, as it facilitates more efficient data set analysis, code generation, and literature review, allowing researchers to focus on experimental design as well as drug discovery and development.
Objective: This study aims to explore the potential of ChatGPT as a real-time literature search tool for systematic reviews and clinical decision support systems, to enhance their efficiency and accuracy in health care settings.
Methods: The search results of a published systematic review by human experts on the treatment of Peyronie disease were selected as a benchmark, and the literature search formula of the study was applied to ChatGPT and Microsoft Bing AI as a comparison to human researchers. Peyronie disease typically presents with discomfort, curvature, or deformity of the penis in association with palpable plaques and erectile dysfunction. To evaluate the quality of individual studies derived from AI answers, we created a structured rating system based on bibliographic information related to the publications. If the title existed, we classified each answer into 1 of 4 grades: A, B, C, or F. No grade was given for a fake title or no answer.
Results: From ChatGPT, 7 (0.5%) out of 1287 identified studies were directly relevant, whereas Bing AI resulted in 19 (40%) relevant studies out of 48, compared to the human benchmark of 24 studies. In the qualitative evaluation, ChatGPT had 7 grade A, 18 grade B, 167 grade C, and 211 grade F studies, and Bing AI had 19 grade A and 28 grade C studies.
Conclusions: This is the first study to compare AI and conventional human systematic review methods as a real-time literature collection tool for evidence-based medicine. The results suggest that the use of ChatGPT as a tool for real-time evidence generation is not yet accurate or feasible. Therefore, researchers should be cautious about using such AI. The limitations of this study using the generative pre-trained transformer model are that the search for research topics was not diverse and that it did not prevent the hallucination of generative AI. However, this study will serve as a standard for future studies by providing an index to verify the reliability and consistency of generative AI from a user's point of view. If the reliability and consistency of AI literature search services are verified, then the use of these technologies will help medical research greatly.
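The proportions quoted in the Results can be reproduced from the reported counts. The following is a minimal sketch, not the authors' analysis code: the names `REPORTED`, `precision`, and `benchmark_coverage` are hypothetical, and the coverage figure rests on the assumption that every "directly relevant" study is also one of the 24 benchmark studies, which the abstract does not state explicitly.

```python
# Counts taken from the Results section of the abstract.
REPORTED = {
    "ChatGPT": {"identified": 1287, "relevant": 7},
    "Bing AI": {"identified": 48, "relevant": 19},
}
HUMAN_BENCHMARK = 24  # studies found by the human systematic review


def precision(relevant: int, identified: int) -> float:
    """Share of the identified studies that were directly relevant."""
    return relevant / identified


def benchmark_coverage(relevant: int, benchmark: int = HUMAN_BENCHMARK) -> float:
    """Share of the human benchmark recovered.

    Assumption (not stated in the abstract): each relevant study
    returned by the tool is a member of the benchmark set.
    """
    return relevant / benchmark


if __name__ == "__main__":
    for tool, counts in REPORTED.items():
        p = precision(counts["relevant"], counts["identified"])
        c = benchmark_coverage(counts["relevant"])
        print(f"{tool}: precision {p:.1%}, benchmark coverage {c:.1%}")
```

Under that assumption, the sketch makes the gap between the two tools visible on both axes: how much of each tool's output was usable, and how much of the human search result it recovered.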
Pages: 12