Double Graph Attention Networks for Visual Semantic Navigation

Cited by: 2
Authors
Lyu, Yunlian [1, 2]
Talebi, Mohammad Sadegh [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Xiyuan Ave, Chengdu 611731, Sichuan, Peoples R China
[2] Univ Copenhagen, Dept Comp Sci, Univ Pk 1, DK-2100 Copenhagen, Denmark
Keywords
Deep reinforcement learning; Visual navigation; Knowledge graph; Graph convolutional networks; Spatial attention; REINFORCEMENT; LANGUAGE; ROBOT
DOI
10.1007/s11063-023-11190-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Artificial Intelligence (AI) based on knowledge graphs has been invested in realizing human intelligence like thinking, learning, and logical reasoning. It is a great promise to make AI-based systems not only intelligent but also knowledgeable. In this paper, we investigate knowledge graph based visual semantic navigation using deep reinforcement learning, where an agent reasons actions against targets specified by text words in indoor scenes. The agent perceives its surroundings through egocentric RGB views and learns via trial-and-error. The fundamental problem of visual navigation is efficient learning across different targets and scenes. To obtain an empirical model, we propose a spatial attention model with knowledge graphs, DGVN, which combines both semantic information about observed objects and spatial information about their locations. Our spatial attention model is constructed based on interactions between a 3D global graph and local graphs. The two graphs we adopted encode the spatial relationships between objects and are expected to guide policy search effectively. With the knowledge graph and its robust feature representation using graph convolutional networks, we demonstrate that our agent is able to infer a more plausible attention mechanism for decision-making. Under several experimental metrics, our attention model is shown to achieve superior navigation performance in the AI2-THOR environment.
Pages: 9019-9040
Page count: 22
Related papers
50 in total
  • [1] Double Graph Attention Networks for Visual Semantic Navigation
    Yunlian Lyu
    Mohammad Sadegh Talebi
    Neural Processing Letters, 2023, 55 : 9019 - 9040
  • [2] Visual-Semantic Graph Attention Networks for Human-Object Interaction Detection
    Liang, Zhijun
    Liu, Junfa
    Guan, Yisheng
    Rojas, Juan
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (IEEE-ROBIO 2021), 2021, : 1441 - 1447
  • [3] MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation
    Seymour, Zachary
    Thopalli, Kowshik
    Mithun, Niluthpol
    Chiu, Han-Pang
    Samarasekera, Supun
    Kumar, Rakesh
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 13223 - 13230
  • [4] Goal-Oriented Visual Semantic Navigation Using Semantic Knowledge Graph and Transformer
    Wang, Zhongli
    Tian, Guohui
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2025, 22 : 1647 - 1657
  • [6] Image Captioning With Visual-Semantic Double Attention
    He, Chen
    Hu, Haifeng
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2019, 15 (01)
  • [7] A Behavioral Approach to Visual Navigation with Graph Localization Networks
    Chen, Kevin
    Pablo de Vicente, Juan
    Sepulveda, Gabriel
    Xia, Fei
    Soto, Alvaro
    Vazquez, Marynel
    Savarese, Silvio
    ROBOTICS: SCIENCE AND SYSTEMS XV, 2019,
  • [8] Improving semantic search via integrated personalized faceted and visual graph navigation
    Tvarozek, Michal
    Barla, Michal
    Frivolt, Gyoergy
    Tomsa, Marek
    Bielikova, Maria
    SOFSEM 2008: THEORY AND PRACTICE OF COMPUTER SCIENCE, 2008, 4910 : 778 - 789
  • [9] Siamese Graph Attention Networks for robust visual object tracking
    Lu, Junjie
    Li, Shengyang
    Guo, Weilong
    Zhao, Manqi
    Yang, Jian
    Liu, Yunfei
    Zhou, Zhuang
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 229
  • [10] Multi-Semantic Decoding of Visual Perception with Graph Neural Networks
    Li, Rong
    Li, Jiyi
    Wang, Chong
    Liu, Haoxiang
    Liu, Tao
    Wang, Xuyang
    Zou, Ting
    Huang, Wei
    Yan, Hongmei
    Chen, Huafu
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 2024, 34 (04)