Visual Object Search by Learning Spatial Context

Cited: 52
Authors
Druon, Raphael [1 ,2 ]
Yoshiyasu, Yusuke [2 ,3 ]
Kanezaki, Asako [3 ]
Watt, Alassane [2 ,4 ]
Affiliations
[1] Paul Sabatier Univ, F-31330 Toulouse, France
[2] CNRS AIST Joint Robot Lab, Tsukuba, Ibaraki 3058560, Japan
[3] Natl Inst Adv Ind Sci & Technol, Tokyo 1350064, Japan
[4] CentraleSupelec, Rennes, France
Keywords
Deep learning in robotics and automation; visual-based navigation; autonomous agents; obstacle avoidance
DOI
10.1109/LRA.2020.2967677
CLC Number
TP24 [Robotics]
Discipline Codes
080202; 1405
Abstract
We present a visual navigation approach that uses context information to navigate an agent to find and reach a target object. To learn context from the objects present in the scene, we transform visual information into an intermediate representation called a context grid, which encodes how semantically similar the object at each location is to the target object. Because this representation encodes the target object and the surrounding objects together, it lets the agent navigate in a human-inspired way: while the target is out of sight, the agent moves toward likely places by observing surrounding context objects, and once the target comes into view, it reaches the target quickly. Since the context grid does not directly contain visual or semantic feature values that change when new objects are introduced, such as new instances of the same object with a different appearance or objects from a slightly different class, our navigation model generalizes well to unseen scenes and objects. Experimental results show that our approach outperforms previous approaches at navigating unseen scenes, especially broad ones. We also evaluated human performance on the target-driven navigation task and compared it with learning-based navigation approaches, including ours.
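To make the representation concrete, below is a minimal Python sketch of how such a context grid could be computed. This is an illustration, not the authors' released code: the toy EMBEDDINGS table, the grid size, and the names cosine and context_grid are assumptions made here; the paper's pipeline would use real word vectors and per-cell class labels produced by a detector or semantic segmenter.

    import numpy as np

    # Toy word embeddings standing in for real ones (e.g., GloVe);
    # "background" gets the zero vector so empty cells score 0.
    EMBEDDINGS = {
        "background": np.array([0.0, 0.0, 0.0]),
        "sofa":       np.array([0.9, 0.1, 0.2]),
        "television": np.array([0.8, 0.3, 0.1]),
        "sink":       np.array([0.1, 0.9, 0.4]),
        "mug":        np.array([0.2, 0.8, 0.5]),
    }

    def cosine(u, v):
        # Cosine similarity; zero-norm inputs (background) return 0.
        nu, nv = np.linalg.norm(u), np.linalg.norm(v)
        return float(u @ v / (nu * nv)) if nu > 0 and nv > 0 else 0.0

    def context_grid(class_labels, target):
        # Map an H x W grid of per-cell class names to a grid of
        # semantic similarities against the target class.
        t = EMBEDDINGS[target]
        return np.array([[cosine(EMBEDDINGS[c], t) for c in row]
                         for row in class_labels])

    # A 2 x 3 grid of detections while searching for a "mug":
    # cells around the "sink" score high, the "sofa" area scores low,
    # so the agent is drawn toward the semantically related region.
    labels = [["background", "sofa", "sink"],
              ["television", "sink", "background"]]
    print(context_grid(labels, target="mug").round(2))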
Pages: 1279-1286
Number of pages: 8
Related Papers
50 records in total
  • [11] Top-down strategy affects learning of visual context in visual search
    Endo, N.
    PERCEPTION, 2008, 37: 8-8
  • [12] Visual search is relational without prior context learning
    Becker, Stefanie I.
    Hamblin-Frohman, Zachary
    Amarasekera, Koralalage Don Raveen
    COGNITION, 2025, 260
  • [13] Learning Spatial Fusion and Matching for Visual Object Tracking
    Xiao, Wei
    Zhang, Zili
    PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III, 2022, 13631: 352-367
  • [14] Spatial context and top-down strategies in visual search
    Lleras, A
    Von Mühlenen, A
    SPATIAL VISION, 2004, 17(4-5): 465-482
  • [15] Visual Attentional Network and Learning Method for Object Search and Recognition
    Lü J.
    Luo F.
    Yuan Z.
    Jixie Gongcheng Xuebao/Journal of Mechanical Engineering, 2019, 55(11): 123-130
  • [16] Learning by selection: Visual search and object perception in young infants
    Amso, Dima
    Johnson, Scott P.
    DEVELOPMENTAL PSYCHOLOGY, 2006, 42(06): 1236-1245
  • [17] A model of spatial and object-based attention for active visual search
    Lanyon, L
    Denham, S
    MODELING LANGUAGE, COGNITION AND ACTION, 2005, 16: 239-248
  • [18] Object and Spatial Context Representations in Visual Short-Term Memory
    Li, Aedan Y.
    ENEURO, 2021, 8(02)
  • [19] Oculomotor correlates of context-guided learning in visual search
    Tseng, YC
    Li, CSR
    PERCEPTION & PSYCHOPHYSICS, 2004, 66 (08): : 1363 - 1378