Object-aware navigation for remote embodied visual referring expression

Cited by: 2

Authors:

Zhan, Zhaohuan [1]

Lin, Liang [2]

Tan, Guang [1]

Affiliations:

[1] Sun Yat Sen Univ, Shenzhen Campus, Shenzhen, Guangdong, Peoples R China

[2] Sun Yat sen Univ, Guangzhou, Guangdong, Peoples R China

Source:

NEUROCOMPUTING | 2023 / Vol. 515

Keywords:

Vision-language navigation; Referring expression; Multimodal processing

DOI:

10.1016/j.neucom.2022.10.026

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In the Remote Embodied Visual Referring Expression (REVERIE) task, an agent needs to navigate through an unseen environment to identify a referred object following high-level instructions. Despite recent efforts of vision-and-language navigation (VLN), previous methods commonly rely on detailed navigational instructions, which might not be available in practice. To address this issue, we present a method that strengthens vision-and-language (V&L) navigators with object-awareness. By combining object-aware textual grounding and visual grounding operations, our technique helps the navigator recognize the relationship between instructions and the contents of captured images. As a generic method, the proposed solution can be seamlessly integrated into other V&L navigators with different frameworks (for example, Seq2Seq or BERT). In order to alleviate the problem of data scarcity, we synthesize augmented data based on a simple yet effective prompt template that retains object information and destination information. Experimental results on REVERIE and R2R datasets demonstrate the proposed methods' applicability and performance improvement across different domains. © 2022 Elsevier B.V. All rights reserved.


Pages: 68-78

Page count: 11

