PRIOR VISUAL RELATIONSHIP REASONING FOR VISUAL QUESTION ANSWERING

Cited by: 0
Authors
Yang, Zhuoqian [1, 2]
Qin, Zengchang [2, 3]
Yu, Jing [4]
Wan, Tao [5]
Affiliations
[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA
[2] Beihang Univ, Sch ASEE, Intelligent Comp & Machine Learning Lab, Beijing, Peoples R China
[3] Codemao, AI Res, Shenzhen, Peoples R China
[4] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[5] Beihang Univ, Sch Biol Sci & Med Engn, Beijing, Peoples R China
Keywords
VQA; GCN; Attention Mechanism
DOI
Not available
Chinese Library Classification (CLC) number
TB8 [Photographic technology]
Subject classification code
0804
Abstract
Visual Question Answering (VQA) is a representative cross-modal reasoning task: given an image and a free-form natural-language question, the correct answer must be determined using both visual and textual information. A key issue in VQA is reasoning over semantic clues in the visual content under the guidance of the question. In this paper, we propose the Scene Graph Convolutional Network (SceneGCN), which reasons jointly about object properties and their semantic relations to find the correct answer. Each visual relationship is projected into a deep learned semantic space constrained by visual context and language priors. Comprehensive experiments on two challenging datasets, GQA and VQA 2.0, demonstrate the effectiveness and interpretability of the new model.
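As an illustration only, and not the authors' implementation, the idea of reasoning jointly over object features and relation embeddings under the guidance of a question can be sketched as a single message-passing step on a scene graph. The function name `relation_graph_conv`, the additive message (neighbor feature plus relation embedding), the dot-product attention against the question vector, and the absence of learned weights are all assumptions made for this sketch; the abstract does not specify SceneGCN's actual formulation.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relation_graph_conv(nodes, edges, question):
    """One question-guided message-passing step (hypothetical sketch).

    nodes:    list of object feature vectors
    edges:    dict mapping (target, source) -> relation embedding
    question: question embedding with the same dimension as the nodes
    """
    updated = []
    for i, v_i in enumerate(nodes):
        # Message from each neighbor j: its features plus the relation embedding.
        messages = [
            [v + r for v, r in zip(nodes[j], rel)]
            for (tgt, j), rel in edges.items()
            if tgt == i
        ]
        if not messages:
            # Isolated node: keep its features unchanged.
            updated.append(list(v_i))
            continue
        # Attention: neighbors whose messages align with the question weigh more.
        weights = softmax([dot(question, m) for m in messages])
        agg = [
            sum(w * m[d] for w, m in zip(weights, messages))
            for d in range(len(v_i))
        ]
        # Residual update followed by ReLU.
        updated.append([max(0.0, v + a) for v, a in zip(v_i, agg)])
    return updated


# Two objects; object 0 receives a message from object 1 via a zero relation.
nodes = [[1.0, 0.0], [0.0, 1.0]]
edges = {(0, 1): [0.0, 0.0]}
question = [1.0, 1.0]
print(relation_graph_conv(nodes, edges, question))  # [[1.0, 1.0], [0.0, 1.0]]
```

In a trained model, the messages and attention scores would typically pass through learned projections; they are omitted here to keep the sketch self-contained.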
Pages: 1411-1415
Page count: 5
Related papers
50 in total
  • [21] LoRA: A Logical Reasoning Augmented Dataset for Visual Question Answering
    Gao, Jingying
    Wu, Qi
    Blair, Alan
    Pagnucco, Maurice
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [22] Explicit Knowledge-based Reasoning for Visual Question Answering
    Wang, Peng
    Wu, Qi
    Shen, Chunhua
    Dick, Anthony
    van den Hengel, Anton
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017: 1290-1296
  • [23] Towards Reasoning Ability in Scene Text Visual Question Answering
    Wang, Qingqing
    Xiao, Liqiang
    Lu, Yue
    Jin, Yaohui
    He, Hao
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021: 2281-2289
  • [24] An effective spatial relational reasoning networks for visual question answering
    Shen, Xiang
    Han, Dezhi
    Chen, Chongqing
    Luo, Gaofeng
    Wu, Zhongdai
    PLOS ONE, 2022, 17 (11)
  • [25] A Symbolic-Neural Reasoning Model for Visual Question Answering
    Gao, Jingying
    Blair, Alan
    Pagnucco, Maurice
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [26] Comprehensive-perception dynamic reasoning for visual question answering
    Shuang, Kai
    Guo, Jinyu
    Wang, Zihan
    PATTERN RECOGNITION, 2022, 131
  • [27] Semantic Relation Graph Reasoning Network for Visual Question Answering
    Lan, Hong
    Zhang, Pufen
    TWELFTH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING SYSTEMS, 2021, 11719
  • [28] Weakly Supervised Relative Spatial Reasoning for Visual Question Answering
    Banerjee, Pratyay
    Gokhale, Tejas
    Yang, Yezhou
    Baral, Chitta
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1888-1898
  • [29] Research on Visual Question Answering Based on GAT Relational Reasoning
    Miao, Yalin
    Cheng, Wenfang
    He, Shuyun
    Jiang, Hui
    NEURAL PROCESSING LETTERS, 2022, 54 (02): 1435-1448