Navigation with Large Language Models: Semantic Guesswork as a Heuristic for Planning

Cited by: 0
Authors
Shah, Dhruv [1]
Equi, Michael [1]
Osinski, Blazej [3]
Xia, Fei [2]
Ichter, Brian [2]
Levine, Sergey [1, 2]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Google DeepMind, London, England
[3] Univ Warsaw, Warsaw, Poland
Source
Keywords
navigation; language models; planning; semantic scene understanding; vision
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Navigation in unfamiliar environments presents a major challenge for robots: while mapping and planning techniques can be used to build up a representation of the world, quickly discovering a path to a desired goal in unfamiliar settings with such methods often requires lengthy mapping and exploration. Humans can rapidly navigate new environments, particularly indoor environments that are laid out logically, by leveraging semantics: for example, a kitchen often adjoins a living room, an exit sign indicates the way out, and so forth. Language models can provide robots with such knowledge, but directly using language models to instruct a robot how to reach some destination can also be impractical: while language models might produce a narrative about how to reach some goal, because they are not grounded in real-world observations, this narrative might be arbitrarily wrong. Therefore, in this paper we study how the "semantic guesswork" produced by language models can be utilized as a guiding heuristic for planning algorithms. Our method, Language Frontier Guide (LFG), uses the language model to bias exploration of novel real-world environments by incorporating the semantic knowledge stored in language models as a search heuristic for planning with either topological or metric maps. We evaluate LFG in challenging real-world environments and simulated benchmarks, where it outperforms uninformed exploration and other ways of using language models.
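The idea of using language-model output as a search heuristic, rather than as a direct instruction, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Frontier` class, the `semantic_score` field, the `weight` parameter, and the linear scoring rule are all hypothetical, and the scores are hard-coded here where a real system would derive them from language-model queries.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frontier:
    """A candidate exploration target on the edge of the known map (hypothetical)."""
    name: str
    position: Tuple[float, float]
    # Assumption: a value in [0, 1] standing in for a language model's guess
    # that exploring this frontier leads toward the goal.
    semantic_score: float


def select_frontier(frontiers: List[Frontier],
                    robot_xy: Tuple[float, float],
                    weight: float = 5.0) -> Frontier:
    """Pick the frontier with the lowest combined cost.

    The cost is travel distance minus a weighted semantic bonus, so the
    semantic guess biases the choice but does not replace the geometric planner.
    """
    def cost(frontier: Frontier) -> float:
        distance = math.dist(robot_xy, frontier.position)
        return distance - weight * frontier.semantic_score

    return min(frontiers, key=cost)


# Illustrative data: searching for a kitchen appliance from the origin.
frontiers = [
    Frontier("hallway", (2.0, 0.0), 0.1),
    Frontier("kitchen door", (4.0, 3.0), 0.9),
]
best = select_frontier(frontiers, (0.0, 0.0))
print(best.name)
```

With `weight=0.0` the selection reduces to nearest-frontier (uninformed) exploration, which is one way to see the semantic score acting only as a bias on an otherwise standard planner.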
Pages: 17