Hallucination in AI-generated financial literature reviews: evaluating bibliographic accuracy

Cited by: 0

Authors:

Erdem, Orhan [1]

Hassett, Kristi [1]

Egriboyun, Feyzullah [2]

Affiliations:

[1] Univ North Texas, Adv Data Analyt Dept, 1155 Union Circle 310830, Denton, TX 76203 USA

[2] HULT Int Business Sch, Hult House East, 35 Commercial Rd, London E1 1LD, England

Source:

INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS | 2025

Keywords:

Artificial intelligence; Chatbots; ChatGPT; Gemini; Hallucination; Large language model

DOI:

10.1007/s41060-025-00731-0

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We evaluate the reliability of three chatbots (ChatGPT-4o, o1-preview, and Gemini Advanced) in providing references on financial literature, employing novel methodologies. Alongside the conventional binary approach common in the literature, we develop a non-binary method that incorporates the degree of hallucination, and we introduce an age index to assess how hallucination rates vary with the recency of a topic. The study analyzes 150 citations for each chatbot across 15 financial topics. The results reveal significant differences in performance among the chatbots. ChatGPT-4o has a hallucination rate of 20.0%, while o1-preview has a hallucination rate of 21.3%. In contrast, Gemini Advanced exhibits a significantly higher hallucination rate of 76.7%. While hallucination rates increase for more recent topics, this trend is not statistically significant for Gemini Advanced. These findings emphasize the importance of verifying chatbot-provided references, particularly in rapidly evolving fields.
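The reported percentages can be related back to the 150 citations per chatbot with a little arithmetic. The sketch below is illustrative only: it assumes the binary hallucination rate is simply the number of hallucinated citations divided by the total, which the abstract does not state explicitly, and the counts it prints are back-computed from the rounded percentages rather than taken from the paper. The name `implied_count` is hypothetical. The non-binary method and the age index are not reproduced here, since the abstract does not define them.

```python
# Assumption: binary rate = hallucinated citations / total citations.
TOTAL_CITATIONS = 150  # citations per chatbot, as stated in the abstract

# Hallucination rates (percent) as reported in the abstract.
reported_rates = {
    "ChatGPT-4o": 20.0,
    "o1-preview": 21.3,
    "Gemini Advanced": 76.7,
}


def implied_count(rate_percent, total=TOTAL_CITATIONS):
    """Nearest whole number of citations consistent with a reported percentage."""
    return round(rate_percent / 100 * total)


# Back-computed counts (an inference, not figures from the paper).
implied = {name: implied_count(rate) for name, rate in reported_rates.items()}

for name, count in implied.items():
    # Recompute the percentage from the implied count as a consistency check.
    recomputed = round(count / TOTAL_CITATIONS * 100, 1)
    print(f"{name}: ~{count} of {TOTAL_CITATIONS} citations ({recomputed}%)")
```

Under that assumption, each recomputed percentage matches the reported one to one decimal place.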


Pages: 10

