Semantic Policy Network for Zero-Shot Object Goal Visual Navigation

Cited by: 3
Authors
Zhao, Qianfan [1 ,2 ]
Zhang, Lu [1 ,2 ]
He, Bin [3 ]
Liu, Zhiyong [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab of Multimodal Artificial Intelligence Systems, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100190, Peoples R China
[3] Tongji Univ, Coll Elect & Informat Engn, Shanghai 200070, Peoples R China
Keywords
Deep learning; path planning; reinforcement learning; vision-based navigation
DOI
10.1109/LRA.2023.3320014
CLC number
TP24 [Robotics]
Discipline codes
080202; 1405
Abstract
Zero-shot object goal visual navigation (ZSON) aims to enable robots to locate previously "unseen" objects from visual observations. The task is challenging because the robot must transfer the navigation policy learned on "seen" objects to "unseen" objects through auxiliary semantic information, without any training samples for the unseen classes, a setting known as zero-shot learning. To address this challenge, we propose a novel approach termed the Semantic Policy Network (SPNet). SPNet consists of two modules that are deeply integrated with semantic embeddings: the Semantic Actor Policy (SAP) module and the Semantic Trajectory (ST) module. The SAP module generates an actor-network weight bias from semantic embeddings, yielding a distinct navigation policy for each target class. The ST module records the robot's actions, visual features, and semantic embeddings at each step, and aggregates this information along both the spatial and temporal dimensions. To evaluate our approach, we conducted extensive experiments on the MP3D, HM3D, and RoboTHOR datasets. The results show that the proposed method outperforms other ZSON methods on both seen and unseen target classes.
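The SAP idea described above, an actor network whose weights are biased by the target's semantic embedding so that each target class induces its own policy, can be sketched as follows. This is a minimal illustrative sketch only: the dimensions, the random-weight setup, and the linear hypernetwork form are assumptions for exposition, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 16   # visual/state feature size (assumed)
EMB_DIM = 8     # semantic embedding size, e.g. a word vector (assumed)
N_ACTIONS = 4   # e.g. move forward, turn left, turn right, stop

# Shared base actor weights, conceptually learned over "seen" classes.
W_base = rng.normal(scale=0.1, size=(N_ACTIONS, FEAT_DIM))

# Hypothetical hypernetwork mapping a semantic embedding to a
# per-class bias on the actor weights.
H = rng.normal(scale=0.1, size=(N_ACTIONS * FEAT_DIM, EMB_DIM))

def actor_logits(state_feat, sem_emb):
    """Action logits from a semantically biased actor network."""
    delta_w = (H @ sem_emb).reshape(N_ACTIONS, FEAT_DIM)
    return (W_base + delta_w) @ state_feat

state = rng.normal(size=FEAT_DIM)
emb_seen = rng.normal(size=EMB_DIM)     # embedding of a trained class
emb_unseen = rng.normal(size=EMB_DIM)   # embedding of an "unseen" class

# The same observation yields different action preferences for different
# target embeddings; transfer to unseen classes then rests on the
# embedding space, not on class-specific training samples.
logits_seen = actor_logits(state, emb_seen)
logits_unseen = actor_logits(state, emb_unseen)
print(logits_seen.shape)
```

Because the unseen class enters only through its embedding, the sketch needs no retraining to produce a policy for it, which is the zero-shot transfer mechanism the abstract attributes to the SAP module.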
Pages: 7655-7662
Page count: 8