Goal Object Grounding and Multimodal Mapping for Multi-object Visual Navigation

Cited: 0
Authors
Choi J. [1 ]
Kim I. [1 ]
Affiliations
[1] Department of Computer Science, Kyonggi University
Keywords
deep reinforcement learning; global mapping; goal grounding; multi-object visual navigation; reward function
DOI
10.5302/J.ICROS.2024.23.0217
Abstract
Multi-object visual navigation (MultiON) is a special type of visual navigation task that requires an embodied agent to visit multiple goal objects distributed over an unseen three-dimensional (3D) environment in a predefined order. To succeed at MultiON, an agent must accurately ground each goal object from language descriptions of its color and shape attributes and build a semantically rich map that effectively covers the entire environment. In this paper, we propose a novel deep neural network-based agent model for performing MultiON tasks. The proposed model provides unique solutions to three issues in MultiON agent design. First, the model adopts the pre-trained Grounding DINO module to ground the language descriptions of goal objects to visual objects in input images in a zero-shot manner. Second, the model uses Bayesian posterior probabilities to register the uncertain local contexts extracted from input images onto the global map. Finally, the model applies a novel reward function that motivates the agent to explore unvisited areas of the environment for rapid and accurate map expansion. We demonstrate the superiority of the proposed model through quantitative and qualitative experiments on the AI-Habitat 3D simulation platform with the Matterport3D benchmark scene dataset. © ICROS 2024.
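The abstract's first component is zero-shot goal grounding with a pre-trained Grounding DINO module. As a rough illustration of how such grounding can be invoked, the sketch below uses the Hugging Face transformers port of Grounding DINO; the checkpoint name, thresholds, and goal phrases (colored cylinders, the usual MultiON goal objects) are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch: zero-shot grounding of MultiON goal phrases with the
# Hugging Face port of Grounding DINO. Checkpoint, thresholds, and goal
# phrases are illustrative assumptions, not the paper's configuration.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForZeroShotObjectDetection

MODEL_ID = "IDEA-Research/grounding-dino-tiny"  # assumed checkpoint
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForZeroShotObjectDetection.from_pretrained(MODEL_ID)

def ground_goals(image: Image.Image, goal_phrases: list[str]) -> dict:
    """Return boxes/scores/labels for each goal phrase in one RGB frame."""
    # Grounding DINO expects lower-case phrases separated by periods.
    text = ". ".join(p.lower() for p in goal_phrases) + "."
    inputs = processor(images=image, text=text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return processor.post_process_grounded_object_detection(
        outputs,
        inputs.input_ids,
        box_threshold=0.35,   # assumed detection threshold
        text_threshold=0.25,  # assumed phrase-matching threshold
        target_sizes=[image.size[::-1]],  # PIL (w, h) -> (h, w)
    )[0]

# Example (hypothetical goal descriptions):
# result = ground_goals(frame, ["red cylinder", "white cylinder"])
```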
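The second component registers uncertain local observations onto the global map via Bayesian posterior probabilities. The abstract does not spell out the exact formulation; a common Bayesian realization of this idea is the log-odds update from occupancy grid mapping, sketched below with an assumed grid size, class count, and sensor model.

```python
# Minimal sketch of Bayesian fusion of uncertain per-frame semantic
# observations into a global grid map via log-odds updates. This generic
# scheme stands in for the paper's posterior-based registration; grid
# size and class count are illustrative assumptions.
import numpy as np

GRID, CLASSES = 256, 8                      # assumed map size / class count
log_odds = np.zeros((GRID, GRID, CLASSES))  # uniform prior p = 0.5

def logit(p: np.ndarray) -> np.ndarray:
    return np.log(p / (1.0 - p))

def register(cells: np.ndarray, probs: np.ndarray) -> None:
    """Fuse per-cell class probabilities from one egocentric frame.

    cells: (N, 2) integer map coordinates hit by the current observation.
    probs: (N, CLASSES) detector confidences in (0, 1) for those cells.
    """
    probs = np.clip(probs, 1e-4, 1.0 - 1e-4)  # keep logits finite
    # np.add.at accumulates correctly even when a cell appears twice.
    np.add.at(log_odds, (cells[:, 0], cells[:, 1]), logit(probs))

def posterior() -> np.ndarray:
    """Recover posterior class probabilities from accumulated log-odds."""
    return 1.0 / (1.0 + np.exp(-log_odds))
```

Under the standard independence assumption, summing per-observation logits is equivalent to multiplying likelihood ratios, so the sigmoid of the accumulated log-odds is exactly the Bayesian posterior for each cell and class.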
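The third component is a reward function that pushes the agent toward unvisited areas for rapid map expansion. Below is a generic sketch of such coverage-based shaping; the coverage weight, success bonus, and slack penalty are illustrative assumptions rather than the paper's actual reward function.

```python
# Minimal sketch of an exploration-shaped reward: the agent is paid for
# newly observed map cells (rapid map expansion) plus a success bonus
# when the current goal is reached. All coefficients are assumptions.
import numpy as np

LAMBDA_COVER = 0.01   # assumed weight per newly observed cell
R_SUCCESS = 2.5       # assumed bonus for reaching the current goal
SLACK = -0.01         # assumed per-step penalty encouraging speed

def step_reward(seen_before: np.ndarray,
                seen_now: np.ndarray,
                goal_reached: bool) -> float:
    """seen_*: boolean (H, W) masks of map cells observed so far."""
    newly_seen = np.logical_and(seen_now, ~seen_before).sum()
    reward = SLACK + LAMBDA_COVER * float(newly_seen)
    if goal_reached:
        reward += R_SUCCESS
    return reward
```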
Pages: 596-606
Page count: 10