Layout-Aware Dreamer for Embodied Referring Expression Grounding

Citations: 0
Authors
Li, Mingxiao [1]
Wang, Zehao [2]
Tuytelaars, Tinne [2]
Moens, Marie-Francine [1]
Affiliations
[1] Katholieke Univ Leuven, Comp Sci Dept, Leuven, Belgium
[2] Katholieke Univ Leuven, Elect Engn Dept ESAT PSI, Leuven, Belgium
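Funding
European Research Council
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract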
In this work, we study the problem of Embodied Referring Expression Grounding, where an agent needs to navigate a previously unseen environment and localize a remote object described by a concise, high-level natural language instruction. When facing such a situation, a human tends to imagine what the destination may look like and to explore the environment based on prior knowledge of the environmental layout, such as the fact that a bathroom is more likely to be found near a bedroom than near a kitchen. To mimic this cognitive decision process, we have designed an autonomous agent called Layout-aware Dreamer (LAD), which includes two novel modules: the Layout Learner and the Goal Dreamer. The Layout Learner learns to infer the room category distribution of neighboring unexplored areas along the path for coarse layout estimation, which effectively gives our agent layout common sense about room-to-room transitions. To learn an effective exploration of the environment, the Goal Dreamer imagines the destination beforehand. Our agent achieves new state-of-the-art performance on the public leaderboard of the REVERIE dataset in challenging unseen test environments, improving navigation success (SR) by 4.02% and remote grounding success (RGS) by 3.43% over the previous state of the art. The code is released at https://github.com/zehao-wang/LAD
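The layout intuition in the abstract (a bathroom is more likely near a bedroom than near a kitchen) can be illustrated with a small sketch. This is not the authors' Layout Learner, which is a learned module that predicts room category distributions; here a fixed lookup table stands in for it. The table `TRANSITION_PRIOR`, its probabilities, and the function `rank_frontiers` are all hypothetical names and values chosen only for illustration.

```python
# Hypothetical room-to-room transition prior. The probabilities are made up
# for illustration; they are not taken from the paper or from REVERIE.
TRANSITION_PRIOR = {
    "bedroom": {"bathroom": 0.5, "hallway": 0.4, "kitchen": 0.1},
    "kitchen": {"dining room": 0.5, "hallway": 0.4, "bathroom": 0.1},
}


def rank_frontiers(frontiers, target_room):
    """Order unexplored frontiers by how likely they lead to target_room.

    frontiers maps a frontier id to the room category it was observed from.
    Rooms missing from the prior get a score of 0.0.
    """
    def score(item):
        _, current_room = item
        return TRANSITION_PRIOR.get(current_room, {}).get(target_room, 0.0)

    ranked = sorted(frontiers.items(), key=score, reverse=True)
    return [frontier_id for frontier_id, _ in ranked]


if __name__ == "__main__":
    frontiers = {"f1": "kitchen", "f2": "bedroom"}
    print(rank_frontiers(frontiers, "bathroom"))  # prints ['f2', 'f1']
```

With this toy prior, an agent looking for a bathroom would explore the frontier next to the bedroom before the one next to the kitchen, which is the kind of preference the abstract attributes to layout common sense.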
Pages: 1386-1395
Page count: 10