Drowzee: Metamorphic Testing for Fact-Conflicting Hallucination Detection in Large Language Models

Cited by: 0
Authors
Li, Ningke [1 ]
Li, Yuekang [2 ]
Liu, Yi [3 ]
Shi, Ling [3 ]
Wang, Kailong [1 ]
Wang, Haoyu [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
[2] Univ New South Wales, Kensington, Australia
[3] Nanyang Technol Univ, Singapore, Singapore
Source
Funding
National Key Research and Development Program of China;
Keywords
Large Language Model; Hallucination; Software Testing;
DOI
10.1145/3689776
Chinese Library Classification (CLC)
TP31 [Computer Software];
Discipline Classification Codes
081202; 0835;
Abstract
Large language models (LLMs) have revolutionized language processing, but they face critical challenges with security, privacy, and hallucinations, i.e., coherent but factually inaccurate outputs. A major issue is fact-conflicting hallucination (FCH), where LLMs produce content that contradicts ground-truth facts. Addressing FCH is difficult due to two key challenges: (1) automatically constructing and updating benchmark datasets is hard, as existing methods rely on manually curated static benchmarks that cannot cover the broad, evolving spectrum of FCH cases; and (2) validating the reasoning behind LLM outputs is inherently difficult, especially for complex logical relations. To tackle these challenges, we introduce a novel logic-programming-aided metamorphic testing technique for FCH detection. We develop an extensive and extensible framework that constructs a comprehensive factual knowledge base by crawling sources such as Wikipedia, seamlessly integrated into DROWZEE. Using logical reasoning rules, we transform and augment this knowledge into a large set of test cases with ground-truth answers. We test LLMs on these cases through template-based prompts, requiring them to provide reasoned answers. To validate their reasoning, we propose two semantic-aware oracles that assess the similarity between the semantic structures of the LLM answers and the ground truth. Our approach automatically generates useful test cases and identifies hallucinations across six LLMs within nine domains, with hallucination rates ranging from 24.7% to 59.8%. Key findings include that LLMs struggle with temporal concepts and out-of-distribution knowledge and lack logical reasoning capabilities. The results show that logic-based test cases generated by DROWZEE effectively trigger and detect hallucinations. To further mitigate the identified FCHs, we explored model editing techniques, which proved effective on a small scale (with edits to fewer than 1000 knowledge pieces). Our findings emphasize the need for continued community efforts to detect and mitigate model hallucinations.
Pages: 30
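
Illustrative sketch. The abstract describes deriving new test cases from a factual knowledge base via logical reasoning rules and checking LLM answers against ground truth with semantic-aware oracles. The minimal Python sketch below is not DROWZEE's implementation; it only illustrates the general flavor under simplifying assumptions: a toy transitive rule derives a new fact from seed triples, a template turns it into a prompt with a known answer, and a naive keyword oracle stands in for the paper's semantic-structure comparison. All names (Triple, transitive_rule, build_prompt, simple_oracle) are hypothetical.

# Toy illustration of logic-rule-based test-case derivation and oracle checking.
# This is a sketch of the idea only, not the DROWZEE tool.
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str

# Seed facts, e.g. crawled from a source such as Wikipedia.
SEED_FACTS = [
    Triple("Alan Turing", "born_in", "London"),
    Triple("London", "located_in", "England"),
]

def transitive_rule(facts):
    """Toy rule: born_in(x, y) & located_in(y, z) => born_in_region(x, z)."""
    derived = set()
    for a, b in product(facts, facts):
        if a.relation == "born_in" and b.relation == "located_in" and a.obj == b.subject:
            derived.add(Triple(a.subject, "born_in_region", b.obj))
    return derived

def build_prompt(triple):
    """Turn a derived fact into a question with a known ground-truth answer."""
    prompt = (f"Was {triple.subject} born in {triple.obj}? "
              "Answer yes or no and explain your reasoning.")
    return prompt, "yes"

def simple_oracle(llm_answer, ground_truth):
    """Naive stand-in for a semantic-aware oracle: check the expected polarity
    appears in the answer; the paper's oracles compare semantic structures."""
    return ground_truth in llm_answer.lower()

if __name__ == "__main__":
    for fact in transitive_rule(SEED_FACTS):
        prompt, truth = build_prompt(fact)
        print("PROMPT:", prompt)
        # A real harness would query an LLM here; we fake a response.
        fake_llm_answer = "Yes, Alan Turing was born in London, which is in England."
        print("HALLUCINATION DETECTED:", not simple_oracle(fake_llm_answer, truth))

In the actual approach described by the abstract, the derivation step is driven by logic programming over a large crawled knowledge base, and the oracle compares the semantic structure of the LLM's reasoned answer against the ground truth rather than matching keywords.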