Multi-Object Navigation Using Potential Target Position Policy Function

Cited by: 4
Authors
Zeng, Haitao [1]
Song, Xinhang [1]
Jiang, Shuqiang [1]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Navigation; Task analysis; Semantics; Visualization; Reinforcement learning; Trajectory; Three-dimensional displays; Multi-object navigation; object navigation; embodied AI;
DOI
10.1109/TIP.2023.3263110
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual object navigation is an essential task in embodied AI: the agent must navigate to a goal object specified by the user. Previous methods mostly focus on single-object navigation. In real life, however, human demands are generally continuous and multiple, requiring the agent to carry out multiple tasks in sequence. These demands can be addressed by repeatedly applying existing single-task methods. However, when multiple tasks are split into independent tasks without global optimization across them, the agent's trajectories may overlap, reducing navigation efficiency. In this paper, we propose an efficient reinforcement learning framework with a hybrid policy for multi-object navigation, aiming to eliminate as many ineffective actions as possible. First, the visual observations are embedded to detect semantic entities (such as objects). The detected objects are memorized and projected into semantic maps, which serve as a long-term memory of the observed environment. Then a hybrid policy consisting of exploration and long-term planning strategies is proposed to predict the potential target position. In particular, when the target is directly oriented, the policy function performs long-term planning toward the target based on the semantic map, implemented as a sequence of motion actions. Otherwise, when the target is not oriented, the policy function estimates the target's potential position by exploring the objects (positions) most likely to be closely related to the target. The relations between objects are obtained from prior knowledge and combined with the memorized semantic map to predict the potential target position. A path to the potential target is then planned by the policy function.
We evaluate the proposed method on two large-scale realistic 3D environment datasets, Gibson and Matterport3D, and the experimental results demonstrate its effectiveness and generalization ability.
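
The two branches of the hybrid policy described in the abstract can be illustrated with a small rule-based sketch. This is not the learned policy from the paper: the function name `select_goal`, the dictionary-based semantic map, the numeric relation scores, and the mode labels are all hypothetical stand-ins chosen for illustration, and the fallback when no related object has been seen is an added assumption.

```python
from typing import Dict, Optional, Tuple

Position = Tuple[int, int]  # assumed: a cell index on a 2D semantic map


def select_goal(
    target: str,
    semantic_map: Dict[str, Position],
    relation_prior: Dict[str, Dict[str, float]],
) -> Tuple[str, Optional[Position]]:
    """Pick a navigation goal for `target` (illustrative rule, not the paper's policy).

    semantic_map:   objects memorized so far, mapped to their map positions.
    relation_prior: assumed prior knowledge, relation_prior[a][b] = how strongly
                    object b is related to object a (higher is stronger).
    """
    # Branch 1: the target is already in the memorized map, so plan straight to it.
    if target in semantic_map:
        return "plan", semantic_map[target]

    # Branch 2: the target is not in the map. Use the prior to find the
    # memorized object most related to it and treat that object's position
    # as the potential target position.
    related = relation_prior.get(target, {})
    best_pos: Optional[Position] = None
    best_score = 0.0
    for name, pos in semantic_map.items():
        score = related.get(name, 0.0)
        if score > best_score:
            best_pos, best_score = pos, score

    if best_pos is not None:
        return "explore_related", best_pos

    # Assumed fallback: nothing related has been observed yet, so there is
    # no potential position to propose and generic exploration takes over.
    return "explore_unknown", None


if __name__ == "__main__":
    memory = {"sofa": (2, 3), "sink": (8, 1)}
    prior = {"tv": {"sofa": 0.8, "sink": 0.1}}
    print(select_goal("sofa", memory, prior))
    print(select_goal("tv", memory, prior))
    print(select_goal("bed", memory, prior))
```

In a multi-object episode, such a selector would be called once per target against the same accumulated map, which is where reusing memory across tasks could avoid retracing earlier trajectories.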
Pages: 2608-2619
Number of pages: 12