Mitigating spatial hallucination in large language models for path planning via prompt engineering

Times Cited: 0
Authors
Zhang, Hongjie [1 ]
Deng, Hourui [1 ]
Ou, Jie [2 ]
Feng, Chaosheng [1 ]
Affiliations
[1] Sichuan Normal Univ, Coll Comp Sci, Chengdu 610101, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 611731, Peoples R China
Source
SCIENTIFIC REPORTS | 2025, Vol. 15, No. 01
DOI
10.1038/s41598-025-93601-5
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject Classification Codes
07; 0710; 09;
Abstract
Spatial reasoning in Large Language Models (LLMs) serves as a foundation for embodied intelligence. However, even in simple maze environments, LLMs often struggle to plan correct paths due to hallucination issues. To address this, we propose S2ERS, an LLM-based technique that integrates entity and relation extraction with the on-policy reinforcement learning algorithm Sarsa for optimal path planning. We introduce three key improvements: (1) To tackle spatial hallucination, we extract a graph structure of entities and relations from the text-based maze description, helping LLMs accurately comprehend spatial relationships. (2) To prevent LLMs from getting trapped in dead ends through context-inconsistency hallucination during long-horizon reasoning, we insert the state-action value function Q into the prompts to guide the LLM's path planning. (3) To reduce the token consumption of LLMs, we use multi-step reasoning, dynamically inserting local Q-tables into the prompt so the LLM can output multiple actions at once. Our comprehensive experimental evaluation, conducted with the closed-source LLMs ChatGPT 3.5 and ERNIE-Bot 4.0 and the open-source LLM ChatGLM-6B, demonstrates that S2ERS significantly mitigates spatial hallucination in LLMs and improves the success rate and optimal rate by approximately 29% and 19%, respectively, compared with SOTA CoT methods.
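The mechanism outlined in the abstract (learn a Sarsa Q-table over the maze, then splice the local Q-values for the current cell into the prompt so the LLM can emit several moves at once) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the toy 4x4 maze, reward shaping, hyperparameters, and prompt wording are placeholders, not the authors' implementation.

```python
import random

# Hypothetical 4-connected grid maze: 0 = free cell, 1 = wall (illustrative, not from the paper).
MAZE = [
    [0, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
START, GOAL = (0, 0), (3, 3)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Apply an action; bumping into a wall or the border leaves the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] == 0:
        return (nr, nc)
    return state


def epsilon_greedy(q, state, eps):
    """Pick a random action with probability eps, otherwise the greedy one."""
    if random.random() < eps:
        return random.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: q[(state, a)])


def sarsa(episodes=500, max_steps=200, alpha=0.5, gamma=0.9, eps=0.1):
    """On-policy Sarsa over the toy maze; returns the learned Q-table."""
    cells = [(r, c) for r in range(len(MAZE)) for c in range(len(MAZE[0]))]
    q = {(s, a): 0.0 for s in cells for a in ACTIONS}
    for _ in range(episodes):
        s, a = START, epsilon_greedy(q, START, eps)
        for _ in range(max_steps):
            s2 = step(s, a)
            reward = 1.0 if s2 == GOAL else -0.01  # small step cost, bonus at the goal
            a2 = epsilon_greedy(q, s2, eps)
            q[(s, a)] += alpha * (reward + gamma * q[(s2, a2)] - q[(s, a)])  # Sarsa update
            s, a = s2, a2
            if s == GOAL:
                break
    return q


def build_prompt(q, state, horizon=3):
    """Splice the local Q-values for the current cell into the prompt so the LLM
    can propose several moves at once instead of one step per call."""
    local_q = {a: round(q[(state, a)], 3) for a in ACTIONS}
    return (
        f"You are navigating a maze. Current cell: {state}. Goal cell: {GOAL}.\n"
        f"Learned action values at this cell: {local_q}.\n"
        f"Output the next {horizon} moves as a comma-separated list."
    )


if __name__ == "__main__":
    q_table = sarsa()
    print(build_prompt(q_table, START))
```

The entity-relation graph extraction described in point (1) of the abstract is omitted here; only the Q-value prompt splicing of points (2) and (3) is sketched.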
Pages: 13
Related Papers
50 records in total
  • [41] BadCodePrompt: backdoor attacks against prompt engineering of large language models for code generation
    Qu, Yubin
    Huang, Song
    Li, Yanzhou
    Bai, Tongtong
    Chen, Xiang
    Wang, Xingya
    Li, Long
    Yao, Yongming
    AUTOMATED SOFTWARE ENGINEERING, 2025, 32 (01)
  • [42] Fine-tuning and prompt engineering for large language models-based code review automation
    Pornprasit, Chanathip
    Tantithamthavorn, Chakkrit
    INFORMATION AND SOFTWARE TECHNOLOGY, 2024, 175
  • [43] Optimized interaction with Large Language Models: A practical guide to Prompt Engineering and Retrieval-Augmented Generation
    Fink, Anna
    Rau, Alexander
    Kotter, Elmar
    Bamberg, Fabian
    Russe, Maximilian Frederik
    RADIOLOGIE, 2025,
  • [44] To prompt or not to prompt: Navigating the use of Large Language Models for integrating and modeling heterogeneous data
    Remadi, Adel
    El Hage, Karim
    Hobeika, Yasmina
    Bugiotti, Francesca
    DATA & KNOWLEDGE ENGINEERING, 2024, 152
  • [45] Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training
    Guo, Qingyan
    Wang, Rui
    Guo, Junliang
    Tan, Xu
    Bian, Jiang
    Yang, Yujiu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 11453 - 11464
  • [46] Ontology engineering with Large Language Models
    Mateiu, Patricia
    Groza, Adrian
    2023 25TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING, SYNASC 2023, 2023, : 226 - 229
  • [47] PromptMaker: Prompt-based Prototyping with Large Language Models
    Jiang, Ellen
    Olson, Kristen
    Toh, Edwin
    Molina, Alejandra
    Donsbach, Aaron
    Terry, Michael
    Cai, Carrie J.
    EXTENDED ABSTRACTS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2022, 2022,
  • [48] Balancing Privacy and Robustness in Prompt Learning for Large Language Models
    Shi, Chiyu
    Su, Junyu
    Chu, Chiawei
    Wang, Baoping
    Feng, Duanyang
    MATHEMATICS, 2024, 12 (21)
  • [49] Response Generated by Large Language Models Depends on the Structure of the Prompt
    Sarangi, Pradosh Kumar
    Mondal, Himel
    INDIAN JOURNAL OF RADIOLOGY AND IMAGING, 2024, 34 (03): 574 - 575
  • [50] Multimodal Emotion Captioning Using Large Language Model with Prompt Engineering
    Xu, Yaoxun
    Zhou, Yixuan
    Cai, Yunrui
    Xie, Jingran
    Ye, Runchuan
    Wu, Zhiyong
    PROCEEDINGS OF THE 2ND INTERNATIONAL WORKSHOP ON MULTIMODAL AND RESPONSIBLE AFFECTIVE COMPUTING, MRAC 2024, 2024, : 104 - 109