Object-aware navigation for remote embodied visual referring expression

Cited by: 2
Authors
Zhan, Zhaohuan [1]
Lin, Liang [2]
Tan, Guang [1]
Affiliations
[1] Sun Yat-sen University, Shenzhen Campus, Shenzhen, Guangdong, People's Republic of China
[2] Sun Yat-sen University, Guangzhou, Guangdong, People's Republic of China
Keywords
Vision-language navigation; Referring expression; Multimodal processing
DOI
10.1016/j.neucom.2022.10.026
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In the Remote Embodied Visual Referring Expression (REVERIE) task, an agent needs to navigate through an unseen environment to identify a referred object following high-level instructions. Despite recent progress in vision-and-language navigation (VLN), previous methods commonly rely on detailed navigational instructions, which might not be available in practice. To address this issue, we present a method that strengthens vision-and-language (V&L) navigators with object-awareness. By combining object-aware textual grounding and visual grounding operations, our technique helps the navigator recognize the relationship between instructions and the contents of captured images. As a generic method, the proposed solution can be seamlessly integrated into V&L navigators built on different frameworks (for example, Seq2Seq or BERT). To alleviate the problem of data scarcity, we synthesize augmented data based on a simple yet effective prompt template that retains object information and destination information. Experimental results on the REVERIE and R2R datasets demonstrate the proposed method's applicability and performance improvement across different domains. (c) 2022 Elsevier B.V. All rights reserved.
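Note: the abstract names two concrete mechanisms, object-aware cross-modal grounding and prompt-template-based instruction synthesis, but gives no implementation details. The PyTorch sketch below is a hypothetical illustration of both ideas under stated assumptions, not the authors' released code; all module names, dimensions, and the template wording are assumptions made here for clarity.

# Minimal sketch of (1) object-aware grounding between instruction tokens and
# detected-object features, and (2) instruction synthesis from a simple prompt
# template that keeps object and destination words. Illustrative only.
import torch
import torch.nn as nn


class ObjectAwareGrounding(nn.Module):
    """Cross-attend instruction word features with object features from the current view."""

    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        # textual grounding: detected objects attend to instruction words
        self.text_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # visual grounding: instruction words attend to detected objects
        self.vis_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, word_feats: torch.Tensor, obj_feats: torch.Tensor) -> torch.Tensor:
        # word_feats: (B, L, D) instruction token features
        # obj_feats:  (B, K, D) features of objects detected in the panorama
        grounded_objs, _ = self.text_attn(obj_feats, word_feats, word_feats)
        grounded_words, _ = self.vis_attn(word_feats, obj_feats, obj_feats)
        # pool both directions and fuse into one object-aware state for the policy
        state = torch.cat([grounded_objs.mean(1), grounded_words.mean(1)], dim=-1)
        return self.fuse(state)  # (B, D)


def synthesize_instruction(obj_label: str, room_label: str) -> str:
    """Hypothetical prompt template retaining object and destination information."""
    return f"Go to the {room_label} and find the {obj_label}."


if __name__ == "__main__":
    model = ObjectAwareGrounding()
    words = torch.randn(2, 20, 512)    # 20 instruction tokens
    objects = torch.randn(2, 16, 512)  # 16 detected objects in the view
    print(model(words, objects).shape)             # torch.Size([2, 512])
    print(synthesize_instruction("towel", "bathroom"))

The fused state would typically be passed to the navigator's action predictor, which is how such a grounding block could be dropped into either a Seq2Seq- or BERT-style V&L navigator; the template function would be used to generate augmented instruction-trajectory pairs.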
Pages: 68-78
Number of pages: 11
Related Papers
50 records in total (entries [41]-[50] shown)
• [41] Khan, Behram; Horsnell, Matthew; Rogers, Ian; Lujan, Mikel; Dinn, Andrew; Watson, Ian. An Object-Aware Hardware Transactional Memory System. HPCC 2008: 10th IEEE International Conference on High Performance Computing and Communications, Proceedings, 2008: 93-102.
• [42] Khan, Behram; Horsnell, Matthew; Lujan, Mikel; Watson, Ian. Scalable Object-Aware Hardware Transactional Memory. Euro-Par 2010 Parallel Processing, Part I, 2010, 6271: 268-279.
• [43] Tang, Hao. Object-Aware Saliency Detection for Consumer Images. 2012 IEEE International Conference on Image Processing (ICIP 2012), 2012: 1097-1100.
• [44] Zhou, Guoqiang; Xu, Zhangxian; Lin, Jiayin; Bao, Shudi; Zhou, Liliang; Shen, Jun. Object-aware Policy Network in Deep Recommender Systems. Journal of Signal Processing Systems for Signal, Image and Video Technology, 2023, 95(2-3): 271-280.
• [45] Wang, Luting; Liu, Yi; Du, Penghui; Ding, Zihan; Liao, Yue; Qi, Qiaosong; Chen, Biaolong; Liu, Si. Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 11186-11196.
• [46] Khan, Behram; Horsnell, Matthew; Rogers, Ian; Lujan, Mikel; Dinn, Andrew; Watson, Ian. A First Insight into Object-Aware Hardware Transactional Memory. SPAA'08: Proceedings of the Twentieth Annual Symposium on Parallelism in Algorithms and Architectures, 2008: 107-109.
• [47] Zang, Xianghao; Li, Ge; Li, Zhihao; Li, Nannan; Wang, Wenmin. An Object-aware Anomaly Detection and Localization in Surveillance Videos. 2016 IEEE Second International Conference on Multimedia Big Data (BigMM), 2016: 113-116.
• [48] Kuenzle, Vera; Reichert, Manfred. PHILharmonicFlows: towards a framework for object-aware process management. Journal of Software Maintenance and Evolution: Research and Practice, 2011, 23(4): 205-244.
• [49] Koc, Cagatay; Sariel, Sanem. Object-aware interactive perception for tabletop scene exploration. Robotics and Autonomous Systems, 2024, 175.
• [50] Li, Zuoyong; Wang, Weice; Lai, Taotao; Xu, Haiping; Keikhosrokiani, Pantea. Object-aware deep feature extraction for feature matching. Concurrency and Computation: Practice & Experience, 2024, 36(5).