L3MVN: Leveraging Large Language Models for Visual Target Navigation

Cited by: 10
Authors
Yu, Bangguo [1]
Kasaei, Hamidreza [1]
Cao, Ming [1]
Affiliations
[1] Univ Groningen, Fac Sci & Engn, Ne, NL-9747 AG Groningen, Netherlands
DOI
10.1109/IROS55552.2023.10342512
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual target navigation in unknown environments is a crucial problem in robotics. Despite extensive investigation of classical and learning-based approaches, robots still lack common-sense knowledge about household objects and layouts. Prior state-of-the-art approaches to this task rely on learning priors during training and typically require significant resources and time to learn them. To address this, we propose a new framework for visual target navigation that leverages Large Language Models (LLMs) to impart common sense for object search. Specifically, we introduce two paradigms, (i) zero-shot and (ii) feed-forward approaches, that use language to find the relevant frontier from the semantic map as a long-term goal and explore the environment efficiently. Our analysis demonstrates notable zero-shot generalization and transfer capabilities from the use of language. Experiments on Gibson and Habitat-Matterport 3D (HM3D) demonstrate that the proposed framework significantly outperforms existing map-based methods in terms of success rate and generalization. Ablation analysis also indicates that the common-sense knowledge from the language model leads to more efficient semantic exploration. Finally, we provide a real robot experiment to verify the applicability of our framework in real-world scenarios. The supplementary video and code can be accessed via the following link: https://sites.google.com/view/l3mvn.
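The abstract's central idea, choosing the frontier most relevant to the target as the long-term goal, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Frontier` class, the `select_long_term_goal` function, and the hand-written `TOY_RELEVANCE` table are all hypothetical, and the table merely stands in for the language-model scoring that the paper describes.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Frontier:
    """A boundary between explored and unexplored space (hypothetical layout)."""
    position: Tuple[int, int]          # map cell as (row, col)
    nearby_objects: Tuple[str, ...]    # object labels observed near the frontier


# ASSUMPTION: hand-written scores standing in for a language model's judgement
# of how strongly a nearby object suggests that the target is close by.
TOY_RELEVANCE = {
    ("bed", "nightstand"): 0.9,
    ("bed", "lamp"): 0.6,
    ("bed", "sink"): 0.1,
    ("bed", "oven"): 0.05,
}


def toy_score(target: str, nearby_objects: Sequence[str]) -> float:
    """Best relevance of any nearby object to the target; 0.0 if none is known."""
    return max(
        (TOY_RELEVANCE.get((target, obj), 0.0) for obj in nearby_objects),
        default=0.0,
    )


def select_long_term_goal(
    target: str,
    frontiers: Sequence[Frontier],
    score: Callable[[str, Sequence[str]], float] = toy_score,
) -> Optional[Frontier]:
    """Return the frontier with the highest relevance score, or None if there are none."""
    if not frontiers:
        return None
    return max(frontiers, key=lambda f: score(target, f.nearby_objects))


if __name__ == "__main__":
    candidates = [
        Frontier((3, 4), ("oven", "sink")),
        Frontier((10, 2), ("nightstand", "lamp")),
    ]
    goal = select_long_term_goal("bed", candidates)
    print(goal.position if goal else None)
```

Passing the scorer as a parameter keeps the selection logic independent of how relevance is obtained, so a real language-model query could replace `toy_score` without changing `select_long_term_goal`.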
Pages: 3554-3560
Number of pages: 7