Reasoning with Language Model is Planning with World Model

Authors
Hao, Shibo [1]
Gu, Yi [1]
Ma, Haodi [2]
Hong, Joshua Jiahua [1]
Wang, Zhen [1, 3]
Wang, Daisy Zhe [2]
Hu, Zhiting [1]
Affiliations
[1] University of California San Diego, La Jolla, CA 92093, USA
[2] University of Florida, Gainesville, FL 32611, USA
[3] Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large language models (LLMs) have shown remarkable reasoning capabilities, particularly with chain-of-thought (CoT) prompting. However, LLMs sometimes still struggle with problems that are easy for humans, such as generating action plans to achieve given goals in an environment, or performing complex math or logical reasoning. The deficiency stems from the key fact that LLMs lack an internal world model to predict the world state (e.g., environment status, intermediate variable values) and simulate long-term outcomes of actions. This prevents LLMs from performing deliberate planning akin to human brains, which involves exploring alternative reasoning paths, anticipating future states and rewards, and iteratively refining existing reasoning steps. To overcome these limitations, we propose a new LLM reasoning framework, Reasoning via Planning (RAP). RAP repurposes the LLM as both a world model and a reasoning agent, and incorporates a principled planning algorithm based on Monte Carlo Tree Search for strategic exploration in the vast reasoning space. During reasoning, the LLM (as agent) incrementally builds a reasoning tree under the guidance of the LLM (as world model) and rewards, and efficiently obtains a high-reward reasoning path with a proper balance between exploration and exploitation. We apply RAP to various challenging reasoning problems including plan generation, math reasoning, and logical inference, and demonstrate its superiority over strong baselines. RAP with LLaMA-33B even surpasses CoT with GPT-4, achieving 33% relative improvement in a plan generation setting.
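The planning loop described in the abstract can be illustrated with a minimal Monte Carlo Tree Search sketch. This is not the authors' implementation. The LLM is replaced by toy deterministic stand-ins, and every name here (`ACTIONS`, `world_model`, `reward`, `plan`, `replay`) is a hypothetical choice for illustration. The toy task is to reach a target integer from a start integer using "+1" and "*2" steps. The sketch returns the highest-reward path seen during search, which is one possible selection rule among several.

```python
import math
import random

ACTIONS = ("+1", "*2")  # toy action space (assumption, not from the paper)


def world_model(state, action):
    """Toy stand-in for the LLM as world model: predict the next state."""
    return state + 1 if action == "+1" else state * 2


def reward(state, target):
    """Toy sparse reward: 1.0 only when the final state equals the target."""
    return 1.0 if state == target else 0.0


class Node:
    def __init__(self, state, depth, parent=None, action=None):
        self.state, self.depth = state, depth
        self.parent, self.action = parent, action
        self.children = []
        self.untried = list(ACTIONS)  # actions not yet expanded from this node
        self.visits = 0
        self.value = 0.0  # sum of rewards backed up through this node


def uct(child, parent_visits, c=1.4):
    """Mean reward (exploitation) plus a visit-count bonus (exploration)."""
    bonus = c * math.sqrt(math.log(parent_visits) / child.visits)
    return child.value / child.visits + bonus


def plan(start, target, max_depth=4, iterations=200, seed=0):
    rng = random.Random(seed)

    def terminal(state, depth):
        return state == target or depth >= max_depth

    root = Node(start, 0)
    best_path, best_reward = [], float("-inf")
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes by UCT score.
        while not node.untried and node.children:
            parent = node
            node = max(parent.children, key=lambda ch: uct(ch, parent.visits))
        # 2. Expansion: add one child whose state the world model predicts.
        if node.untried and not terminal(node.state, node.depth):
            action = node.untried.pop(0)
            child = Node(world_model(node.state, action), node.depth + 1, node, action)
            node.children.append(child)
            node = child
        # 3. Simulation: random rollout to a terminal state.
        state, depth, rollout = node.state, node.depth, []
        while not terminal(state, depth):
            action = rng.choice(ACTIONS)
            state, depth = world_model(state, action), depth + 1
            rollout.append(action)
        r = reward(state, target)
        # Record the full action path (tree prefix plus rollout) if it is the best so far.
        prefix, cursor = [], node
        while cursor.parent is not None:
            prefix.append(cursor.action)
            cursor = cursor.parent
        if r > best_reward:
            best_path, best_reward = prefix[::-1] + rollout, r
        # 4. Backpropagation: update statistics from the leaf up to the root.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    return best_path, best_reward


def replay(start, path):
    """Apply a sequence of actions through the world model."""
    state = start
    for action in path:
        state = world_model(state, action)
    return state


if __name__ == "__main__":
    path, r = plan(1, 10)
    print(path, r, replay(1, path))
```

In this sketch the UCT score is what balances exploration against exploitation: the first term favors children with high average reward, and the second favors children that have been visited rarely. In a RAP-style system, the agent's action proposals, the state predictions, and the rewards would come from an LLM rather than from these fixed functions.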
Pages: 8154-8173
Page count: 20