Hallucination in AI-generated financial literature reviews: evaluating bibliographic accuracy

Cited by: 0
Authors
Erdem, Orhan [1]
Hassett, Kristi [1]
Egriboyun, Feyzullah [2]
Affiliations
[1] Univ North Texas, Adv Data Analyt Dept, 1155 Union Circle 310830, Denton, TX 76203 USA
[2] HULT Int Business Sch, Hult House East,35 Commercial Rd, London E1 1LD, England
Keywords
Artificial intelligence; Chatbots; ChatGPT; Gemini; Hallucination; Large language model
DOI
10.1007/s41060-025-00731-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We evaluate the reliability of three chatbots (ChatGPT-4o, o1-preview, and Gemini Advanced) in providing references on financial literature, employing novel methodologies. Alongside the conventional binary approach common in the literature, we develop a non-binary method that incorporates the degree of hallucination, and we introduce an age index to assess how hallucination rates vary with the recency of a topic. The study analyzes 150 citations for each chatbot across 15 financial topics. The results reveal significant differences in performance among the chatbots. ChatGPT-4o has a hallucination rate of 20.0%, while o1-preview has a hallucination rate of 21.3%. In contrast, Gemini Advanced exhibits a significantly higher hallucination rate of 76.7%. While hallucination rates increase for more recent topics, this trend is not statistically significant for Gemini Advanced. These findings emphasize the importance of verifying chatbot-provided references, particularly in rapidly evolving fields.
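As a rough illustration of the conventional binary approach mentioned in the abstract, the sketch below computes a hallucination rate as the share of citations flagged as hallucinated. The function name `hallucination_rate` and the per-chatbot counts (30, 32, and 115 out of 150) are assumptions: the counts are back-computed from the reported percentages and are not stated in this record. The non-binary method and the age index are not described here in enough detail to sketch.

```python
def hallucination_rate(labels):
    """Return the fraction of citations flagged as hallucinated.

    `labels` is a list of booleans, one per citation, where True means
    the citation could not be verified (binary approach).
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    return sum(labels) / len(labels)


# The abstract reports 150 citations per chatbot.
TOTAL = 150

# ASSUMPTION: counts back-computed from the reported rates
# (20.0%, 21.3%, 76.7%); they do not appear in the record itself.
assumed_counts = {
    "ChatGPT-4o": 30,
    "o1-preview": 32,
    "Gemini Advanced": 115,
}

rates = {}
for name, count in assumed_counts.items():
    # Rebuild a label list with `count` hallucinated citations.
    labels = [True] * count + [False] * (TOTAL - count)
    rates[name] = round(100 * hallucination_rate(labels), 1)
    print(f"{name}: {rates[name]:.1f}%")
```

Under these assumed counts, the rounded rates match the percentages reported in the abstract.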
Pages: 10