Syntax Tree Constrained Graph Network for Visual Question Answering

Cited by: 0
Authors
Su, Xiangrui [1]
Zhang, Qi [2,3]
Shi, Chongyang [1]
Liu, Jiachang [1]
Hu, Liang [2,3]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
[2] Tongji Univ, Shanghai, Peoples R China
[3] DeepBlue Acad Sci, Shanghai, Peoples R China
Keywords
Visual question answering; Syntax tree; Message passing; Tree convolution; Graph neural network
DOI
10.1007/978-981-99-8073-4_10
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual Question Answering (VQA) aims to automatically answer natural language questions about the content of a given image. Existing VQA methods integrate vision modeling and language understanding to explore the deep semantics of the question. However, these methods ignore the question's syntax information, which plays a vital role in understanding the essential semantics of the question and in guiding visual feature refinement. To fill this gap, we propose a novel Syntax Tree Constrained Graph Network (STCGN) for VQA based on entity message passing and syntax trees. The model extracts a syntax tree from each question to obtain more precise syntax information. Specifically, we parse each question into a syntax tree using the Stanford syntax parsing tool. A hierarchical tree convolutional network then extracts syntactic phrase features and question features at the word level and the phrase level. We then design a message-passing mechanism for phrase-aware visual entities and capture entity features according to a given visual context. Extensive experiments on the VQA2.0 dataset demonstrate the superiority of our proposed model.
Pages: 122-136
Page count: 15
Related papers
50 in total
  • [21] An Answer FeedBack Network for Visual Question Answering
    Tian, Weidong
    Tian, Ruihua
    Zhao, Zhongqiu
    Ren, Quan
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [22] Progressive Graph Attention Network for Video Question Answering
    Peng, Liang
    Yang, Shuangji
    Bin, Yi
    Wang, Guoqing
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2871 - 2879
  • [23] VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering
    Bolanos, Marc
    Peris, Alvaro
    Casacuberta, Francisco
    Radeva, Petia
    PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2017), 2017, 10255 : 372 - 380
  • [24] Learning Conditioned Graph Structures for Interpretable Visual Question Answering
    Norcliffe-Brown, Will
    Vafeias, Efstathios
    Parisot, Sarah
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [25] Graph neural networks for visual question answering: a systematic review
    Abdulganiyu Abdu Yusuf
    Chong Feng
    Xianling Mao
    Ramadhani Ally Duma
    Mohammed Salah Abood
    Abdulrahman Hamman Adama Chukkol
    Multimedia Tools and Applications, 2024, 83 : 55471 - 55508
  • [26] Multimodal Graph Networks for Compositional Generalization in Visual Question Answering
    Saqur, Raeid
    Narasimhan, Karthik
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [27] Graph neural networks for visual question answering: a systematic review
    Yusuf, Abdulganiyu Abdu
    Feng, Chong
    Mao, Xianling
    Ally Duma, Ramadhani
    Abood, Mohammed Salah
    Chukkol, Abdulrahman Hamman Adama
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (18) : 55471 - 55508
  • [28] Fusing Multi-graph Structures for Visual Question Answering
    Hu, Yuncong
    Zhang, Ru
    Liu, Jianyi
    Yan, Dong
    ASIA-PACIFIC JOURNAL OF CLINICAL ONCOLOGY, 2023, 19 : 13 - 13
  • [29] Reinforcement Learning Inference Techniques for Knowledge Graph Constrained Question Answering
    Bi X.
    Nie H.-J.
    Zhao X.-G.
    Yuan Y.
    Wang G.-R.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (10):
  • [30] Co-Attention Network With Question Type for Visual Question Answering
    Yang, Chao
    Jiang, Mengqi
    Jiang, Bin
    Zhou, Weixin
    Li, Keqin
    IEEE ACCESS, 2019, 7 : 40771 - 40781