Mitigating spatial hallucination in large language models for path planning via prompt engineering

Cited: 0
|
Authors
Zhang, Hongjie [1 ]
Deng, Hourui [1 ]
Ou, Jie [2 ]
Feng, Chaosheng [1 ]
Affiliations
[1] Sichuan Normal Univ, Coll Comp Sci, Chengdu 610101, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 611731, Peoples R China
Source
SCIENTIFIC REPORTS | 2025, Vol. 15, Issue 01
Keywords
DOI
10.1038/s41598-025-93601-5
Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject Classification Codes
07 ; 0710 ; 09 ;
Abstract
Spatial reasoning in Large Language Models (LLMs) serves as a foundation for embodied intelligence. However, even in simple maze environments, LLMs often struggle to plan correct paths due to hallucination issues. To address this, we propose S2ERS, an LLM-based technique that integrates entity and relation extraction with the on-policy reinforcement learning algorithm Sarsa for optimal path planning. We introduce three key improvements: (1) To tackle spatial hallucination, we extract a graph structure of entities and relations from the text-based maze description, aiding LLMs in accurately comprehending spatial relationships. (2) To prevent LLMs from getting trapped in dead ends due to the context-inconsistency hallucination caused by long-term reasoning, we insert the state-action value function Q into the prompts, guiding the LLM's path planning. (3) To reduce the token consumption of LLMs, we employ multi-step reasoning, dynamically inserting local Q-tables into the prompt so that the LLM outputs multiple actions at once. Our comprehensive experimental evaluation, conducted with the closed-source LLMs ChatGPT 3.5 and ERNIE-Bot 4.0 and the open-source LLM ChatGLM-6B, demonstrates that S2ERS significantly mitigates spatial hallucination in LLMs and improves the success rate and optimal rate by approximately 29% and 19%, respectively, compared with SOTA CoT methods.
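The abstract's second and third improvements can be sketched in code: an on-policy Sarsa update over a tiny text maze, whose local Q-values are then serialized into a prompt fragment for the LLM. This is a minimal illustrative sketch, not the authors' implementation; the maze layout, reward values, and prompt format are assumptions.

```python
import random

ACTIONS = ["up", "down", "left", "right"]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

MAZE = [  # '#' = wall, '.' = free cell, 'G' = goal (illustrative layout)
    "....",
    ".##.",
    "...G",
]

def step(state, action):
    """Apply an action; bumping a wall keeps the agent in place."""
    r, c = state
    dr, dc = MOVES[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])) or MAZE[nr][nc] == "#":
        return state, -1.0, False       # wall bump: stay, small penalty
    if MAZE[nr][nc] == "G":
        return (nr, nc), 10.0, True     # goal reached
    return (nr, nc), -0.1, False        # ordinary move: step cost

def sarsa(episodes=2000, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """On-policy Sarsa: Q(s,a) += alpha * (r + gamma*Q(s',a') - Q(s,a))."""
    rng = random.Random(seed)
    Q = {}
    def pick(s):  # epsilon-greedy behaviour policy
        if rng.random() < eps:
            return rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q.get((s, a), 0.0))
    for _ in range(episodes):
        s, a, done = (0, 0), None, False
        a = pick(s)
        while not done:
            s2, r, done = step(s, a)
            a2 = pick(s2)
            target = r + (0.0 if done else gamma * Q.get((s2, a2), 0.0))
            Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
            s, a = s2, a2
    return Q

def local_q_prompt(Q, state):
    """Serialize the local Q-table for one cell into a prompt fragment,
    as the abstract describes inserting Q-values into the LLM's prompt."""
    lines = [f"Current cell: {state}. Local action values:"]
    for a in ACTIONS:
        lines.append(f"  {a}: {Q.get((state, a), 0.0):.2f}")
    return "\n".join(lines)
```

Following the greedy policy induced by the learned Q-table from the start cell reaches the goal, and `local_q_prompt` produces the kind of per-cell value summary that could be dynamically inserted into a multi-step planning prompt.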
Pages: 13
Related Papers
50 records
  • [21] Knowledge graph construction for heart failure using large language models with prompt engineering
    Xu, Tianhan
    Gu, Yixun
    Xue, Mantian
    Gu, Renjie
    Li, Bin
    Gu, Xiang
    FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2024, 18
  • [22] How to use large language models in ophthalmology: from prompt engineering to protecting confidentiality
    Kleinig, Oliver
    Gao, Christina
    Kovoor, Joshua G.
    Gupta, Aashray K.
    Bacchi, Stephen
    Chan, Weng Onn
    EYE, 2024, 38 (04) : 649 - 653
  • [24] Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models
    Ma, Chengcheng
    Liu, Yang
    Deng, Jiankang
    Xie, Lingxi
    Dong, Weiming
    Xu, Changsheng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 4616 - 4629
  • [25] Chain-of-Verification Reduces Hallucination in Large Language Models
    Dhuliawala, Shehzaad
    Komeili, Mojtaba
    Xu, Jing
    Raileanu, Roberta
    Li, Xian
    Celikyilmaz, Asli
    Weston, Jason
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 3563 - 3578
  • [26] Untangling Emotional Threads: Hallucination Networks of Large Language Models
    Goodarzi, Mahsa
    Venkatakrishnan, Radhakrishnan
    Canbaz, M. Abdullah
    COMPLEX NETWORKS & THEIR APPLICATIONS XII, VOL 1, COMPLEX NETWORKS 2023, 2024, 1141 : 202 - 214
  • [27] Evaluating Object Hallucination in Large Vision-Language Models
    Li, Yifan
    Du, Yifan
    Zhou, Kun
    Wang, Jinpeng
    Zhao, Wayne Xin
    Wen, Ji-Rong
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 292 - 305
  • [28] Mitigating Exaggerated Safety in Large Language Models
    Ray, Ruchira
    Bhalani, Ruchi
    arXiv,
  • [29] Investigating Hallucination Tendencies of Large Language Models in Japanese and English
    Tsuruta, Hiromi
    Sakaguchi, Rio
    Research Square,
  • [30] Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models
    Strobelt, H.
    Webson, A.
    Sanh, V.
    Hoover, B.
    Beyer, J.
    Pfister, H.
    Rush, A. M.
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2023, 29 (01) : 1146 - 1156