Double Graph Attention Networks for Visual Semantic Navigation

Cited by: 2
Authors
Lyu, Yunlian [1 ,2 ]
Talebi, Mohammad Sadegh [2 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Xiyuan Ave, Chengdu 611731, Sichuan, Peoples R China
[2] Univ Copenhagen, Dept Comp Sci, Univ Pk 1, DK-2100 Copenhagen, Denmark
Keywords
Deep reinforcement learning; Visual navigation; Knowledge graph; Graph convolutional networks; Spatial attention; Reinforcement; Language; Robot
DOI
10.1007/s11063-023-11190-8
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Artificial Intelligence (AI) built on knowledge graphs aims to realize human-like capabilities such as thinking, learning, and logical reasoning, and holds great promise for making AI systems not only intelligent but also knowledgeable. In this paper, we investigate knowledge-graph-based visual semantic navigation using deep reinforcement learning, in which an agent in indoor scenes reasons about actions toward targets specified as text words. The agent perceives its surroundings through egocentric RGB views and learns by trial and error. The fundamental challenge of visual navigation is learning efficiently across different targets and scenes. To this end, we propose DGVN, a spatial attention model with knowledge graphs that combines semantic information about observed objects with spatial information about their locations. The spatial attention model is built on interactions between a 3D global graph and local graphs; the two graphs encode the spatial relationships between objects and are designed to guide the policy search effectively. Using the knowledge graph and its robust feature representation learned with graph convolutional networks, we demonstrate that the agent infers a more plausible attention mechanism for decision-making. Under several experimental metrics, our attention model achieves superior navigation performance in the AI2-THOR environment.
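To make the architecture described in the abstract concrete, the following is a minimal sketch, assuming PyTorch, of how GCN encodings of a global graph and a local graph could be fused by a target-conditioned attention into a joint feature for a navigation policy. All class names, dimensions, and the fusion scheme here are illustrative assumptions based only on the abstract, not the paper's actual implementation.

```python
# Illustrative sketch only: a minimal GCN encoder plus an attention fusion of a
# "global" scene graph and a "local" observation graph, loosely following the
# DGVN description in the abstract. Names, dimensions, and the fusion scheme
# are assumptions for illustration, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GCNLayer(nn.Module):
    """One graph-convolution step: H' = ReLU(A_hat @ H @ W)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # adj is assumed to be a row-normalized adjacency with self-loops.
        return F.relu(self.linear(adj @ h))


class DoubleGraphAttention(nn.Module):
    """Encode global and local graphs with GCNs, then attend from the
    target-word embedding over local node features (hypothetical fusion)."""

    def __init__(self, feat_dim: int = 64, hid_dim: int = 64):
        super().__init__()
        self.global_gcn = GCNLayer(feat_dim, hid_dim)
        self.local_gcn = GCNLayer(feat_dim, hid_dim)
        self.query = nn.Linear(feat_dim, hid_dim)  # target embedding -> query

    def forward(self, g_feats, g_adj, l_feats, l_adj, target_emb):
        hg = self.global_gcn(g_feats, g_adj)        # (Ng, hid) global nodes
        hl = self.local_gcn(l_feats, l_adj)         # (Nl, hid) local nodes
        q = self.query(target_emb)                  # (hid,) target query
        att = F.softmax(hl @ q, dim=0)              # (Nl,) attention weights
        local_ctx = att @ hl                        # (hid,) attended local context
        global_ctx = hg.mean(dim=0)                 # (hid,) pooled global context
        return torch.cat([local_ctx, global_ctx])   # joint feature for the policy


# Tiny smoke test with random graphs (shapes only, no real scene data).
if __name__ == "__main__":
    Ng, Nl, D = 10, 5, 64

    def norm_adj(n):
        a = torch.eye(n) + (torch.rand(n, n) > 0.7).float()
        return a / a.sum(dim=1, keepdim=True)

    model = DoubleGraphAttention(feat_dim=D)
    out = model(torch.randn(Ng, D), norm_adj(Ng),
                torch.randn(Nl, D), norm_adj(Nl), torch.randn(D))
    print(out.shape)  # torch.Size([128])
```

In the paper's setting, node features would presumably come from detected-object embeddings and their 3D locations rather than random tensors; the sketch only shows the data flow from the two graph encodings to a single policy input.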
Pages: 9019-9040
Page count: 22
Related Papers (items [21]-[30] of 50)
  • [21] An aspect sentiment classification model for graph attention networks incorporating syntactic, semantic, and knowledge. Zhang, Siyu; Gong, Hongfang; She, Lina. Knowledge-Based Systems, 2023, 275.
  • [22] Visual Graph Memory with Unsupervised Representation for Visual Navigation. Kwon, Obin; Kim, Nuri; Choi, Yunho; Yoo, Hwiyeon; Park, Jeongho; Oh, Songhwai. 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2021: 15870-15879.
  • [23] Object semantic-guided graph attention feature fusion network for Siamese visual tracking. Zhang, Jianwei; Miao, Mengen; Zhang, Huanlong; Wang, Jingchao; Zhao, Yanchun; Chen, Zhiwu; Qiao, Jianwei. Journal of Visual Communication and Image Representation, 2023, 90.
  • [24] Visual-Semantic Graph Matching for Visual Grounding. Jing, Chenchen; Wu, Yuwei; Pei, Mingtao; Hu, Yao; Jia, Yunde; Wu, Qi. MM '20: Proceedings of the 28th ACM International Conference on Multimedia, 2020: 4041-4050.
  • [25] Toward Semantic Visual Attention Models. Kucerova, J.; Haladova, Z. Perception, 2013, 42: 219.
  • [26] Semantic Contextual Cuing and Visual Attention. Goujon, Annabelle; Didierjean, Andre; Marmeche, Evelyne. Journal of Experimental Psychology: Human Perception and Performance, 2009, 35(01): 50-71.
  • [27] Graph Ordering Attention Networks. Chatzianastasis, Michail; Lutzeyer, Johannes; Dasoulas, George; Vazirgiannis, Michalis. Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023, 37(6): 7006-7014.
  • [28] A Regularized Attention Mechanism for Graph Attention Networks. Shanthamallu, Uday Shankar; Thiagarajan, Jayaraman J.; Spanias, Andreas. 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2020: 3372-3376.
  • [29] Sparse Graph Attention Networks. Ye, Yang; Ji, Shihao. IEEE Transactions on Knowledge and Data Engineering, 2023, 35(01): 905-916.
  • [30] Graph Oriented Attention Networks. Ouardi, Amine; Mestari, Mohammed. IEEE Access, 2024, 12: 47057-47067.