ChatNav: Leveraging LLM to Zero-Shot Semantic Reasoning in Object Navigation

Cited by: 0
Authors
Zhu, Yong [1, 2]
Wen, Zhenyu [1, 2]
Li, Xiong [1, 2]
Shi, Xiufang [1, 2]
Wu, Xiang [1, 2]
Dong, Hui [1, 2]
Chen, Jiming [3, 4]
Affiliations
[1] Zhejiang University of Technology, Institute of Cyberspace Security, Hangzhou 310023, People's Republic of China
[2] Zhejiang University of Technology, College of Information Engineering, Hangzhou 310023, People's Republic of China
[3] Zhejiang University, College of Control Science and Engineering, Hangzhou 310027, People's Republic of China
[4] Hangzhou Dianzi University, Hangzhou 310018, People's Republic of China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Semantics; Navigation; Robots; Cognition; TV; Accuracy; Chatbots; Large language models; Decision making; Pipelines; Object goal navigation; LLM; object clustering; prompt; gravity-repulsion model;
DOI
10.1109/TCSVT.2024.3485907
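Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract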
In object goal navigation tasks, the robot's understanding of semantic relationships in the environment is a key factor in its ability to localize target objects. Previously, learning-based methods trained robots using 3D scene datasets to learn semantic relationships. However, these approaches perform poorly in new environments with unfamiliar semantic contexts. In this paper, we propose ChatNav, which leverages the powerful knowledge summarizing and reasoning capabilities of a Large Language Model (LLM) for zero-shot inference of explicit semantic relationships. These relationships are further integrated into the navigation system for efficient localization of target objects. ChatNav employs a spatial object clustering algorithm to collect semantic clues and designs common-sense-based prompts for interacting with the LLM. It then uses a gravity-repulsion model to convert inference results into heuristic factors for robust navigation decision-making. Our approach requires no additional training and can consistently obtain accurate semantic relationships from the LLM, making it well-suited for navigating unknown environments. Experimental results demonstrate the outstanding navigation performance of our proposed method on the Gibson and HM3D datasets, surpassing the current state-of-the-art object goal navigation methods.
Pages: 2369-2381
Number of pages: 13
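
The abstract mentions a gravity-repulsion model that turns LLM inference results into heuristic factors for navigation decisions, but this record does not give its formula. The sketch below is only an illustration of the general idea under stated assumptions: each object cluster carries a hypothetical signed weight (positive if the LLM judges it related to the target, negative if unrelated), and each candidate frontier is scored by an assumed inverse-square sum. The names `frontier_score` and `pick_frontier`, the weights, and the `eps` smoothing term are all invented for illustration and are not taken from the paper.

```python
# Illustrative sketch only: the scoring rule and all names are assumptions,
# not the formulation used in ChatNav.

def frontier_score(frontier, clusters, eps=1.0):
    """Score one candidate frontier point against all object clusters.

    frontier: (x, y) position of the candidate.
    clusters: list of ((x, y), weight) pairs; weight > 0 attracts,
              weight < 0 repels (hypothetical LLM-derived relevance).
    eps:      smoothing term that avoids division by zero.
    """
    fx, fy = frontier
    score = 0.0
    for (cx, cy), weight in clusters:
        squared_distance = (fx - cx) ** 2 + (fy - cy) ** 2
        # Assumed inverse-square falloff: nearby clusters dominate.
        score += weight / (squared_distance + eps)
    return score


def pick_frontier(frontiers, clusters):
    """Return the candidate frontier with the highest heuristic score."""
    return max(frontiers, key=lambda f: frontier_score(f, clusters))


# Toy scene: one cluster judged related to the target, one judged unrelated.
clusters = [((0.0, 0.0), 1.0), ((10.0, 0.0), -1.0)]
frontiers = [(1.0, 0.0), (9.0, 0.0)]
best = pick_frontier(frontiers, clusters)
```

In this toy scene the frontier next to the attracting cluster receives a positive score and the one next to the repelling cluster a negative score, so the former is selected. A real system would derive the weights from the LLM's answers and combine the score with path cost.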