Semantic Policy Network for Zero-Shot Object Goal Visual Navigation

Cited by: 3
Authors
Zhao, Qianfan [1 ,2 ]
Zhang, Lu [1 ,2 ]
He, Bin [3 ]
Liu, Zhiyong [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab of Multimodal Artificial Intelligence Systems, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100190, Peoples R China
[3] Tongji Univ, Coll Elect & Informat Engn, Shanghai 200070, Peoples R China
Keywords
Deep learning; path planning; reinforcement learning; vision-based navigation
DOI
10.1109/LRA.2023.3320014
CLC number
TP24 [Robotics]
Discipline codes
080202; 1405
Abstract
Zero-shot object goal visual navigation (ZSON) aims to enable robots to locate previously "unseen" objects from visual observations. The task is challenging because the robot must transfer the navigation policy learned on "seen" objects to "unseen" objects through auxiliary semantic information, without any training samples for the unseen classes, a setting known as zero-shot learning. To address this challenge, we propose a novel approach termed the Semantic Policy Network (SPNet). SPNet consists of two modules that are deeply integrated with semantic embeddings: the Semantic Actor Policy (SAP) module and the Semantic Trajectory (ST) module. The SAP module generates an actor-network weight bias from semantic embeddings, yielding a distinct navigation policy for each target class. The ST module records the robot's actions, visual features, and semantic embeddings at each step, and aggregates this information along both the spatial and temporal dimensions. To evaluate our approach, we conducted extensive experiments on the MP3D, HM3D, and RoboTHOR datasets. The results show that the proposed method outperforms other ZSON methods on both seen and unseen target classes.
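The SAP idea described above, an actor network whose weights are biased by the target's semantic embedding so that each target class induces its own policy, can be sketched as follows. This is a minimal illustrative sketch only: the dimensions, the random-weight setup, and the linear hypernetwork form are assumptions for exposition, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 16   # visual/state feature size (assumed)
EMB_DIM = 8     # semantic embedding size, e.g. a word vector (assumed)
N_ACTIONS = 4   # e.g. move forward, turn left, turn right, stop

# Shared base actor weights, conceptually learned over "seen" classes.
W_base = rng.normal(scale=0.1, size=(N_ACTIONS, FEAT_DIM))

# Hypothetical hypernetwork mapping a semantic embedding to a
# per-class bias on the actor weights.
H = rng.normal(scale=0.1, size=(N_ACTIONS * FEAT_DIM, EMB_DIM))

def actor_logits(state_feat, sem_emb):
    """Action logits from a semantically biased actor network."""
    delta_w = (H @ sem_emb).reshape(N_ACTIONS, FEAT_DIM)
    return (W_base + delta_w) @ state_feat

state = rng.normal(size=FEAT_DIM)
emb_seen = rng.normal(size=EMB_DIM)     # embedding of a trained class
emb_unseen = rng.normal(size=EMB_DIM)   # embedding of an "unseen" class

# The same observation yields different action preferences for different
# target embeddings; transfer to unseen classes then rests on the
# embedding space, not on class-specific training samples.
logits_seen = actor_logits(state, emb_seen)
logits_unseen = actor_logits(state, emb_unseen)
print(logits_seen.shape)
```

Because the unseen class enters only through its embedding, the sketch needs no retraining to produce a policy for it, which is the zero-shot transfer mechanism the abstract attributes to the SAP module.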
Pages: 7655-7662
Page count: 8