L3MVN: Leveraging Large Language Models for Visual Target Navigation

Cited by: 10
Authors
Yu, Bangguo [1]
Kasaei, Hamidreza [1]
Cao, Ming [1]
Affiliations
[1] Univ Groningen, Fac Sci & Engn, Ne, NL-9747 AG Groningen, Netherlands
Keywords
DOI
10.1109/IROS55552.2023.10342512
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Visual target navigation in unknown environments is a crucial problem in robotics. Despite extensive investigation of classical and learning-based approaches, robots still lack common-sense knowledge about household objects and layouts. Prior state-of-the-art approaches to this task learn these priors during training and typically require substantial resources and time to do so. To address this, we propose a new framework for visual target navigation that leverages Large Language Models (LLMs) to impart common sense for object searching. Specifically, we introduce two paradigms, (i) zero-shot and (ii) feed-forward approaches, that use language to find the relevant frontier from the semantic map as a long-term goal and explore the environment efficiently. Our analysis demonstrates the notable zero-shot generalization and transfer capabilities gained from the use of language. Experiments on Gibson and Habitat-Matterport 3D (HM3D) demonstrate that the proposed framework significantly outperforms existing map-based methods in terms of success rate and generalization. Ablation analysis also indicates that the common-sense knowledge from the language model leads to more efficient semantic exploration. Finally, we provide a real robot experiment to verify the applicability of our framework in real-world scenarios. The supplementary video and code can be accessed via the following link: https://sites.google.com/view/l3mvn.
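The frontier-selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Frontier` class, the `TOY_RELATEDNESS` table, and the averaging rule are all hypothetical, and the table stands in for whatever score a language model would return for a target/object pair. The sketch only shows the general shape of picking the frontier whose nearby objects are most related to the target as the long-term goal.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical stand-in for a language-model query: hand-written relatedness
# scores between a target object and objects observed near a frontier.
TOY_RELATEDNESS = {
    ("bed", "nightstand"): 0.9,
    ("bed", "wardrobe"): 0.7,
    ("bed", "oven"): 0.05,
    ("bed", "sink"): 0.1,
}


@dataclass
class Frontier:
    """A frontier cell on the map plus the object labels seen near it (assumed layout)."""
    position: Tuple[int, int]
    nearby_objects: List[str] = field(default_factory=list)


def relatedness(target: str, obj: str) -> float:
    """Look up the toy score; unknown pairs count as unrelated."""
    return TOY_RELATEDNESS.get((target, obj), 0.0)


def score_frontier(target: str, frontier: Frontier) -> float:
    """Average relatedness of the objects around a frontier to the target."""
    if not frontier.nearby_objects:
        return 0.0
    total = sum(relatedness(target, obj) for obj in frontier.nearby_objects)
    return total / len(frontier.nearby_objects)


def select_long_term_goal(target: str, frontiers: List[Frontier]) -> Frontier:
    """Return the highest-scoring frontier as the long-term goal."""
    if not frontiers:
        raise ValueError("no frontiers to choose from")
    return max(frontiers, key=lambda f: score_frontier(target, f))


frontiers = [
    Frontier(position=(2, 10), nearby_objects=["oven", "sink"]),
    Frontier(position=(8, 3), nearby_objects=["nightstand", "wardrobe"]),
]
goal = select_long_term_goal("bed", frontiers)
print(goal.position)  # (8, 3)
```

In this toy setup the frontier next to bedroom furniture wins when searching for a bed. A real system would replace the lookup table with model-derived scores and would likely also weigh travel distance.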
Pages: 3554 - 3560
Page count: 7
Related papers
50 in total
  • [31] Leveraging Large Language Models to Detect npm Malicious Packages
    Zahan, Nusrat
    Burckhardt, Philipp
    Lysenko, Mikola
    Aboukhadijeh, Feross
    Williams, Laurie
    arXiv,
  • [32] Leveraging Large Language Models for Efficient Alert Aggregation in AIOPs
    Zha, Junjie
    Shan, Xinwen
    Lu, Jiaxin
    Zhu, Jiajia
    Liu, Zihan
    ELECTRONICS, 2024, 13 (22)
  • [33] Leveraging Large Language Models for Decision Support in Personalized Oncology
    Benary, Manuela
    Wang, Xing David
    Schmidt, Max
    Soll, Dominik
    Hilfenhaus, Georg
    Nassir, Mani
    Sigler, Christian
    Knoedler, Maren
    Keller, Ulrich
    Beule, Dieter
    Keilholz, Ulrich
    Leser, Ulf
    Rieke, Damian T.
    JAMA NETWORK OPEN, 2023, 6 (11) : E2343689
  • [34] Leveraging large language models: transforming scholarly publishing for the better
    Fortier, Lisa A.
    JAVMA-JOURNAL OF THE AMERICAN VETERINARY MEDICAL ASSOCIATION, 2023, 261 (08) : 1106 - 1107
  • [35] Leveraging Large Language Models for Enhancing Safety in Maritime Operations
    Miller, Tymoteusz
    Durlik, Irmina
    Kostecka, Ewelina
    Lobodzinska, Adrianna
    Lazuga, Kinga
    Kozlovska, Polina
    APPLIED SCIENCES-BASEL, 2025, 15 (03):
  • [36] ERASMO: Leveraging Large Language Models for Enhanced Clustering Segmentation
    Silva, Fillipe dos Santos
    Kakimoto, Gabriel Kenzo
    dos Reis, Julio Cesar
    Reis, Marcelo S.
    INTELLIGENT SYSTEMS, BRACIS 2024, PT I, 2025, 15412 : 414 - 429
  • [37] Leveraging Large Language Models for Python Unit Test
    Jiri, Medlen
    Emese, Bari
    Medlen, Patrick
    2024 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE TESTING, AITEST, 2024 : 95 - 100
  • [38] Leveraging Large Language Models for Automated Chinese Essay Scoring
    Feng, Haiyue
    Du, Sixuan
    Zhu, Gaoxia
    Zou, Yan
    Poh Boon Phua
    Feng, Yuhong
    Zhong, Haoming
    Shen, Zhiqi
    Liu, Siyuan
    ARTIFICIAL INTELLIGENCE IN EDUCATION, PT I, AIED 2024, 2024, 14829 : 454 - 467
  • [39] Leveraging large language models for daily tourist demand forecasting
    He, Kaijian
    Zheng, Linyuan
    Wu, Don
    Zou, Yingchao
    CURRENT ISSUES IN TOURISM, 2024,
  • [40] Leveraging Large Language Models for Automatic Smart Contract Generation
    Napoli, Emanuele Antonio
    Barbara, Fadi
    Gatteschi, Valentina
    Schifanella, Claudio
    2024 IEEE 48TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC 2024, 2024 : 701 - 710