HILL: A Hallucination Identifier for Large Language Models

Cited by: 2
Authors:
Leiser, Florian [1 ]
Eckhardt, Sven [2 ]
Leuthe, Valentin [1 ]
Knaeble, Merlin [3 ]
Maedche, Alexander [3 ]
Schwabe, Gerhard [2 ]
Sunyaev, Ali [1 ]
Affiliations:
[1] Karlsruhe Inst Technol, Inst Appl Informat & Formal Descript Methods, Karlsruhe, Germany
[2] Univ Zurich, Dept Informat, Zurich, Switzerland
[3] Karlsruhe Inst Technol, Human Ctr Syst Lab, Karlsruhe, Germany
Keywords:
ChatGPT; Large Language Models; Artificial Hallucinations; Wizard of Oz; Artifact Development; AUTOMATION; WIZARD OF OZ
DOI:
10.1145/3613904.3642428
Chinese Library Classification:
TP18 [Artificial Intelligence Theory]
Subject Classification Codes:
081104; 0812; 0835; 1405
Abstract:
Large language models (LLMs) are prone to hallucinations, i.e., nonsensical, unfaithful, and undesirable text. Users tend to over-rely on LLMs, including their hallucinations, which can lead to misinterpretations and errors. To tackle the problem of overreliance, we propose HILL, the "Hallucination Identifier for Large Language Models". First, we identified design features for HILL in a Wizard of Oz study with nine participants. Subsequently, we implemented HILL based on the identified design features and evaluated its interface design by surveying 17 participants. Further, we investigated HILL's ability to identify hallucinations using an existing question-answering dataset and five user interviews. We find that HILL correctly identifies and highlights hallucinations in LLM responses, which enables users to handle those responses with more caution. With this, we propose an easy-to-implement adaptation to existing LLMs and demonstrate the relevance of user-centered design of AI artifacts.
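To make the abstract's idea concrete, the following is a minimal, hypothetical Python sketch of the kind of response-level adaptation described: an existing LLM's answer is split into sentences, each sentence receives a confidence score, and low-confidence sentences are flagged for highlighting. The paper does not publish HILL's implementation; the function names (split_sentences, score_sentence, annotate_response), the 0-1 confidence scale, and the 0.5 threshold are illustrative assumptions, and the placeholder scorer stands in for whatever verifier a real deployment would use (e.g., a second model call or retrieval against a reference corpus).

import re

def split_sentences(response: str) -> list[str]:
    # Naive splitter on sentence-final punctuation; a real system would use
    # a proper NLP tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]

def score_sentence(sentence: str) -> float:
    # Placeholder confidence scorer (0.0 = likely hallucinated, 1.0 = likely
    # grounded). Purely for demonstration, it treats sentences containing
    # numerals as riskier; HILL's actual scoring is not specified here.
    return 0.4 if re.search(r"\d", sentence) else 0.9

def annotate_response(response: str, threshold: float = 0.5) -> str:
    # Flag low-confidence sentences so a UI layer can highlight them and the
    # user can treat them with caution.
    annotated = []
    for sentence in split_sentences(response):
        score = score_sentence(sentence)
        if score < threshold:
            annotated.append(f"[CHECK: {score:.2f}] {sentence}")
        else:
            annotated.append(sentence)
    return " ".join(annotated)

if __name__ == "__main__":
    demo = "The Eiffel Tower is in Paris. It was completed in 1820."
    print(annotate_response(demo))

In a deployment resembling the paper's description, the flagged spans would drive interface highlighting rather than inline text markers; the wrapper leaves the underlying LLM unchanged, which is what makes the adaptation easy to retrofit.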
Pages: 13
Related papers (showing 10 of 50):
  • [1] Sources of Hallucination by Large Language Models on Inference Tasks
    McKenna, Nick
    Li, Tianyi
    Cheng, Liang
    Hosseini, Mohammad Javad
    Johnson, Mark
    Steedman, Mark
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 2758 - 2774
  • [2] Woodpecker: hallucination correction for multimodal large language models
    Yin, Shukang
    Fu, Chaoyou
    Zhao, Sirui
    Xu, Tong
    Wang, Hao
    Sui, Dianbo
    Shen, Yunhang
    Li, Ke
    Sun, Xing
    Chen, Enhong
SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (12): 52 - 64
  • [3] Mitigating Factual Inconsistency and Hallucination in Large Language Models
    Muneeswaran, I
    Shankar, Advaith
    Varun, V.
    Gopalakrishnan, Saisubramaniam
    Vaddina, Vishal
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024, : 1169 - 1170
  • [4] Chain-of-Verification Reduces Hallucination in Large Language Models
    Dhuliawala, Shehzaad
    Komeili, Mojtaba
    Xu, Jing
    Raileanu, Roberta
    Li, Xian
    Celikyilmaz, Asli
    Weston, Jason
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 3563 - 3578
  • [5] Untangling Emotional Threads: Hallucination Networks of Large Language Models
    Goodarzi, Mahsa
    Venkatakrishnan, Radhakrishnan
    Canbaz, M. Abdullah
    COMPLEX NETWORKS & THEIR APPLICATIONS XII, VOL 1, COMPLEX NETWORKS 2023, 2024, 1141 : 202 - 214
  • [6] Evaluating Object Hallucination in Large Vision-Language Models
    Li, Yifan
    Du, Yifan
    Zhou, Kun
    Wang, Jinpeng
    Zhao, Wayne Xin
    Wen, Ji-Rong
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 292 - 305
  • [7] Investigating Hallucination Tendencies of Large Language Models in Japanese and English
    Tsuruta, Hiromi
    Sakaguchi, Rio
Research Square (preprint)
  • [8] HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models
    Li, Junyi
    Cheng, Xiaoxue
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 6449 - 6464
  • [9] Hallucination Detection for Generative Large Language Models by Bayesian Sequential Estimation
    Wang, Xiaohua
    Yan, Yuliang
    Huang, Longtao
    Zheng, Xiaoqing
    Huang, Xuanjing
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 15361 - 15371