Dynamic Memory Networks for Visual and Textual Question Answering

Cited by: 0
Authors
Xiong, Caiming [1]
Merity, Stephen [1]
Socher, Richard [1]
Affiliation
[1] Salesforce Inc, San Francisco, CA 94105 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Neural network architectures with memory and attention mechanisms exhibit certain reasoning capabilities required for question answering. One such architecture, the dynamic memory network (DMN), obtained high accuracy on a variety of language tasks. However, it was not shown whether the architecture achieves strong results for question answering when supporting facts are not marked during training or whether it could be applied to other modalities such as images. Based on an analysis of the DMN, we propose several improvements to its memory and input modules. Together with these changes we introduce a novel input module for images in order to be able to answer visual questions. Our new DMN+ model improves the state of the art on both the Visual Question Answering dataset and the bAbI-10k text question-answering dataset without supporting fact supervision.
Pages: 10