Mitigating spatial hallucination in large language models for path planning via prompt engineering

Times Cited: 0
Authors
Zhang, Hongjie [1 ]
Deng, Hourui [1 ]
Ou, Jie [2 ]
Feng, Chaosheng [1 ]
Affiliations
[1] Sichuan Normal Univ, Coll Comp Sci, Chengdu 610101, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 611731, Peoples R China
Source
SCIENTIFIC REPORTS | 2025, Vol. 15, No. 01
DOI
10.1038/s41598-025-93601-5
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject Classification Codes
07; 0710; 09;
Abstract
Spatial reasoning in Large Language Models (LLMs) serves as a foundation for embodied intelligence. However, even in simple maze environments, LLMs often struggle to plan correct paths due to hallucination issues. To address this, we propose S2ERS, an LLM-based technique that integrates entity and relation extraction with the on-policy reinforcement learning algorithm Sarsa for optimal path planning. We introduce three key improvements: (1) To tackle spatial hallucination, we extract a graph structure of entities and relations from the text-based maze description, helping LLMs accurately comprehend spatial relationships. (2) To prevent LLMs from getting trapped in dead ends through context-inconsistency hallucination during long-horizon reasoning, we insert the state-action value function Q into the prompts to guide the LLM's path planning. (3) To reduce the token consumption of LLMs, we use multi-step reasoning, dynamically inserting local Q-tables into the prompt so the LLM can output multiple actions at once. Our comprehensive experimental evaluation, conducted with the closed-source LLMs ChatGPT 3.5 and ERNIE-Bot 4.0 and the open-source LLM ChatGLM-6B, demonstrates that S2ERS significantly mitigates spatial hallucination in LLMs and improves the success rate and optimal rate by approximately 29% and 19%, respectively, compared with SOTA CoT methods.
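The mechanism outlined in the abstract (learn a Sarsa Q-table over the maze, then splice the local Q-values for the current cell into the prompt so the LLM can emit several moves at once) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the toy 4x4 maze, reward shaping, hyperparameters, and prompt wording are placeholders, not the authors' implementation.

```python
import random

# Hypothetical 4-connected grid maze: 0 = free cell, 1 = wall (illustrative, not from the paper).
MAZE = [
    [0, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
START, GOAL = (0, 0), (3, 3)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Apply an action; bumping into a wall or the border leaves the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] == 0:
        return (nr, nc)
    return state


def epsilon_greedy(q, state, eps):
    """Pick a random action with probability eps, otherwise the greedy one."""
    if random.random() < eps:
        return random.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: q[(state, a)])


def sarsa(episodes=500, max_steps=200, alpha=0.5, gamma=0.9, eps=0.1):
    """On-policy Sarsa over the toy maze; returns the learned Q-table."""
    cells = [(r, c) for r in range(len(MAZE)) for c in range(len(MAZE[0]))]
    q = {(s, a): 0.0 for s in cells for a in ACTIONS}
    for _ in range(episodes):
        s, a = START, epsilon_greedy(q, START, eps)
        for _ in range(max_steps):
            s2 = step(s, a)
            reward = 1.0 if s2 == GOAL else -0.01  # small step cost, bonus at the goal
            a2 = epsilon_greedy(q, s2, eps)
            q[(s, a)] += alpha * (reward + gamma * q[(s2, a2)] - q[(s, a)])  # Sarsa update
            s, a = s2, a2
            if s == GOAL:
                break
    return q


def build_prompt(q, state, horizon=3):
    """Splice the local Q-values for the current cell into the prompt so the LLM
    can propose several moves at once instead of one step per call."""
    local_q = {a: round(q[(state, a)], 3) for a in ACTIONS}
    return (
        f"You are navigating a maze. Current cell: {state}. Goal cell: {GOAL}.\n"
        f"Learned action values at this cell: {local_q}.\n"
        f"Output the next {horizon} moves as a comma-separated list."
    )


if __name__ == "__main__":
    q_table = sarsa()
    print(build_prompt(q_table, START))
```

The entity-relation graph extraction described in point (1) of the abstract is omitted here; only the Q-value prompt splicing of points (2) and (3) is sketched.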
Pages: 13
Related Papers
50 records in total
  • [41] BadCodePrompt: backdoor attacks against prompt engineering of large language models for code generation
    Qu, Yubin
    Huang, Song
    Li, Yanzhou
    Bai, Tongtong
    Chen, Xiang
    Wang, Xingya
    Li, Long
    Yao, Yongming
    AUTOMATED SOFTWARE ENGINEERING, 2025, 32 (01)
  • [42] Fine-tuning and prompt engineering for large language models-based code review automation
    Pornprasit, Chanathip
    Tantithamthavorn, Chakkrit
    INFORMATION AND SOFTWARE TECHNOLOGY, 2024, 175
  • [43] Optimized interaction with Large Language Models: A practical guide to Prompt Engineering and Retrieval-Augmented Generation
    Fink, Anna
    Rau, Alexander
    Kotter, Elmar
    Bamberg, Fabian
    Russe, Maximilian Frederik
    RADIOLOGIE, 2025,
  • [44] To prompt or not to prompt: Navigating the use of Large Language Models for integrating and modeling heterogeneous data
    Remadi, Adel
    El Hage, Karim
    Hobeika, Yasmina
    Bugiotti, Francesca
    DATA & KNOWLEDGE ENGINEERING, 2024, 152
  • [45] Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training
    Guo, Qingyan
    Wang, Rui
    Guo, Junliang
    Tan, Xu
    Bian, Jiang
    Yang, Yujiu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 11453 - 11464
  • [46] Ontology engineering with Large Language Models
    Mateiu, Patricia
    Groza, Adrian
    2023 25TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING, SYNASC 2023, 2023, : 226 - 229
  • [47] PromptMaker: Prompt-based Prototyping with Large Language Models
    Jiang, Ellen
    Olson, Kristen
    Toh, Edwin
    Molina, Alejandra
    Donsbach, Aaron
    Terry, Michael
    Cai, Carrie J.
    EXTENDED ABSTRACTS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2022, 2022,
  • [48] Balancing Privacy and Robustness in Prompt Learning for Large Language Models
    Shi, Chiyu
    Su, Junyu
    Chu, Chiawei
    Wang, Baoping
    Feng, Duanyang
    MATHEMATICS, 2024, 12 (21)
  • [49] Response Generated by Large Language Models Depends on the Structure of the Prompt
    Sarangi, Pradosh Kumar
    Mondal, Himel
    INDIAN JOURNAL OF RADIOLOGY AND IMAGING, 2024, 34 (03): 574 - 575
  • [50] Multimodal Emotion Captioning Using Large Language Model with Prompt Engineering
    Xu, Yaoxun
    Zhou, Yixuan
    Cai, Yunrui
    Xie, Jingran
    Ye, Runchuan
    Wu, Zhiyong
    PROCEEDINGS OF THE 2ND INTERNATIONAL WORKSHOP ON MULTIMODAL AND RESPONSIBLE AFFECTIVE COMPUTING, MRAC 2024, 2024, : 104 - 109