Reasoning with Language Model is Planning with World Model

Authors
Hao, Shibo [1 ]
Gu, Yi [1 ]
Ma, Haodi [2 ]
Hong, Joshua Jiahua [1 ]
Wang, Zhen [1 ,3 ]
Wang, Daisy Zhe [2 ]
Hu, Zhiting [1 ]
Affiliations
[1] Univ Calif San Diego, La Jolla, CA 92093 USA
[2] Univ Florida, Gainesville, FL 32611 USA
[3] Mohamed bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large language models (LLMs) have shown remarkable reasoning capabilities, particularly with chain-of-thought (CoT) prompting. However, LLMs sometimes still struggle with problems that are easy for humans, such as generating action plans to achieve given goals in an environment, or performing complex math or logical reasoning. The deficiency stems from the key fact that LLMs lack an internal world model to predict the world state (e.g., environment status, intermediate variable values) and simulate long-term outcomes of actions. This prevents LLMs from performing deliberate planning akin to human brains, which involves exploring alternative reasoning paths, anticipating future states and rewards, and iteratively refining existing reasoning steps. To overcome the limitations, we propose a new LLM reasoning framework, Reasoning via Planning (RAP). RAP repurposes the LLM as both a world model and a reasoning agent, and incorporates a principled planning algorithm based on Monte Carlo Tree Search for strategic exploration in the vast reasoning space. During reasoning, the LLM (as agent) incrementally builds a reasoning tree under the guidance of the LLM (as world model) and rewards, and efficiently obtains a high-reward reasoning path with a proper balance between exploration vs. exploitation. We apply RAP to various challenging reasoning problems including plan generation, math reasoning, and logical inference, and demonstrate its superiority over strong baselines. RAP with LLaMA-33B even surpasses CoT with GPT-4, achieving 33% relative improvement in a plan generation setting.
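The planning loop described in the abstract can be illustrated with a minimal Monte Carlo Tree Search sketch. This is not the authors' implementation: `world_model` and `reward` are hypothetical toy stand-ins for the LLM calls (a deterministic arithmetic task replaces LLM state prediction), and the action set, exploration constant `c`, and iteration budget are assumptions chosen for illustration only.

```python
import math
import random

ACTIONS = ("+1", "*2")  # hypothetical action set for a toy arithmetic task


def world_model(state, action):
    """Toy stand-in for the LLM-as-world-model: predict the next state."""
    return state + 1 if action == "+1" else state * 2


def reward(state, target):
    """Toy stand-in reward: 1.0 at the target, smaller the further away."""
    return 1.0 / (1.0 + abs(target - state))


class Node:
    def __init__(self, state, depth, parent=None, action=None):
        self.state, self.depth = state, depth
        self.parent, self.action = parent, action
        self.children = []
        self.visits = 0
        self.value = 0.0  # sum of rewards backed up through this node


def uct(child, parent_visits, c):
    """Upper-confidence score: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return float("inf")  # always try unvisited children first
    mean = child.value / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def mcts(start, target, max_depth=6, iterations=200, c=1.0, seed=0):
    rng = random.Random(seed)
    root = Node(start, 0)
    for _ in range(iterations):
        node = root
        # Selection: descend through expanded nodes by UCT score.
        while node.children:
            node = max(node.children, key=lambda ch: uct(ch, node.visits, c))
        # Expansion: use the world model to predict each child state.
        if node.depth < max_depth and node.state != target:
            node.children = [
                Node(world_model(node.state, a), node.depth + 1, node, a)
                for a in ACTIONS
            ]
            node = rng.choice(node.children)
        # Simulation: random rollout to the depth limit or the target.
        state, depth = node.state, node.depth
        while depth < max_depth and state != target:
            state = world_model(state, rng.choice(ACTIONS))
            depth += 1
        r = reward(state, target)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    # Read out a plan by following the most-visited child at each level.
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        path.append(node.action)
    return path, node.state


if __name__ == "__main__":
    plan, final_state = mcts(start=1, target=10)
    print(plan, final_state)
```

In this sketch the exploration constant `c` controls the balance the abstract refers to: larger values spend more of the budget on rarely visited branches, smaller values concentrate on branches with high average reward so far.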
Pages: 8154-8173
Page count: 20