Object-aware navigation for remote embodied visual referring expression

Cited by: 2
Authors
Zhan, Zhaohuan [1]
Lin, Liang [2]
Tan, Guang [1]
Affiliations
[1] Sun Yat-sen University, Shenzhen Campus, Shenzhen, Guangdong, People's Republic of China
[2] Sun Yat-sen University, Guangzhou, Guangdong, People's Republic of China
Keywords
Vision-language navigation; Referring expression; Multimodal processing
DOI
10.1016/j.neucom.2022.10.026
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In the Remote Embodied Visual Referring Expression (REVERIE) task, an agent needs to navigate through an unseen environment to identify a referred object following high-level instructions. Despite recent progress in vision-and-language navigation (VLN), previous methods commonly rely on detailed navigational instructions, which might not be available in practice. To address this issue, we present a method that strengthens vision-and-language (V&L) navigators with object-awareness. By combining object-aware textual grounding and visual grounding operations, our technique helps the navigator recognize the relationship between instructions and the contents of captured images. As a generic method, the proposed solution can be seamlessly integrated into V&L navigators built on different frameworks (for example, Seq2Seq or BERT). To alleviate the problem of data scarcity, we synthesize augmented data based on a simple yet effective prompt template that retains object information and destination information. Experimental results on the REVERIE and R2R datasets demonstrate the proposed method's applicability and performance improvement across different domains. (c) 2022 Elsevier B.V. All rights reserved.
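Note: the abstract names two concrete mechanisms, object-aware cross-modal grounding and prompt-template-based instruction synthesis, but gives no implementation details. The PyTorch sketch below is a hypothetical illustration of both ideas under stated assumptions, not the authors' released code; all module names, dimensions, and the template wording are assumptions made here for clarity.

# Minimal sketch of (1) object-aware grounding between instruction tokens and
# detected-object features, and (2) instruction synthesis from a simple prompt
# template that keeps object and destination words. Illustrative only.
import torch
import torch.nn as nn


class ObjectAwareGrounding(nn.Module):
    """Cross-attend instruction word features with object features from the current view."""

    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        # textual grounding: detected objects attend to instruction words
        self.text_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # visual grounding: instruction words attend to detected objects
        self.vis_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, word_feats: torch.Tensor, obj_feats: torch.Tensor) -> torch.Tensor:
        # word_feats: (B, L, D) instruction token features
        # obj_feats:  (B, K, D) features of objects detected in the panorama
        grounded_objs, _ = self.text_attn(obj_feats, word_feats, word_feats)
        grounded_words, _ = self.vis_attn(word_feats, obj_feats, obj_feats)
        # pool both directions and fuse into one object-aware state for the policy
        state = torch.cat([grounded_objs.mean(1), grounded_words.mean(1)], dim=-1)
        return self.fuse(state)  # (B, D)


def synthesize_instruction(obj_label: str, room_label: str) -> str:
    """Hypothetical prompt template retaining object and destination information."""
    return f"Go to the {room_label} and find the {obj_label}."


if __name__ == "__main__":
    model = ObjectAwareGrounding()
    words = torch.randn(2, 20, 512)    # 20 instruction tokens
    objects = torch.randn(2, 16, 512)  # 16 detected objects in the view
    print(model(words, objects).shape)             # torch.Size([2, 512])
    print(synthesize_instruction("towel", "bathroom"))

The fused state would typically be passed to the navigator's action predictor, which is how such a grounding block could be dropped into either a Seq2Seq- or BERT-style V&L navigator; the template function would be used to generate augmented instruction-trajectory pairs.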
Pages: 68-78
Number of pages: 11
Related Papers
50 records in total (entries [41]-[50] shown)
• [41] Khan, Behram; Horsnell, Matthew; Rogers, Ian; Lujan, Mikel; Dinn, Andrew; Watson, Ian. An Object-Aware Hardware Transactional Memory System. HPCC 2008: 10th IEEE International Conference on High Performance Computing and Communications, Proceedings, 2008: 93-102.
• [42] Khan, Behram; Horsnell, Matthew; Lujan, Mikel; Watson, Ian. Scalable Object-Aware Hardware Transactional Memory. Euro-Par 2010 Parallel Processing, Part I, 2010, 6271: 268-279.
• [43] Tang, Hao. Object-Aware Saliency Detection for Consumer Images. 2012 IEEE International Conference on Image Processing (ICIP 2012), 2012: 1097-1100.
• [44] Zhou, Guoqiang; Xu, Zhangxian; Lin, Jiayin; Bao, Shudi; Zhou, Liliang; Shen, Jun. Object-aware Policy Network in Deep Recommender Systems. Journal of Signal Processing Systems for Signal, Image and Video Technology, 2023, 95(2-3): 271-280.
• [45] Wang, Luting; Liu, Yi; Du, Penghui; Ding, Zihan; Liao, Yue; Qi, Qiaosong; Chen, Biaolong; Liu, Si. Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 11186-11196.
• [46] Khan, Behram; Horsnell, Matthew; Rogers, Ian; Lujan, Mikel; Dinn, Andrew; Watson, Ian. A First Insight into Object-Aware Hardware Transactional Memory. SPAA'08: Proceedings of the Twentieth Annual Symposium on Parallelism in Algorithms and Architectures, 2008: 107-109.
• [47] Zang, Xianghao; Li, Ge; Li, Zhihao; Li, Nannan; Wang, Wenmin. An Object-aware Anomaly Detection and Localization in Surveillance Videos. 2016 IEEE Second International Conference on Multimedia Big Data (BigMM), 2016: 113-116.
• [48] Kuenzle, Vera; Reichert, Manfred. PHILharmonicFlows: towards a framework for object-aware process management. Journal of Software Maintenance and Evolution: Research and Practice, 2011, 23(4): 205-244.
• [49] Koc, Cagatay; Sariel, Sanem. Object-aware interactive perception for tabletop scene exploration. Robotics and Autonomous Systems, 2024, 175.
• [50] Li, Zuoyong; Wang, Weice; Lai, Taotao; Xu, Haiping; Keikhosrokiani, Pantea. Object-aware deep feature extraction for feature matching. Concurrency and Computation: Practice & Experience, 2024, 36(5).