Dynamic Memory Networks for Visual and Textual Question Answering

Cited by: 0
Authors
Xiong, Caiming [1]
Merity, Stephen [1]
Socher, Richard [1]
Affiliations
[1] Salesforce Inc, San Francisco, CA 94105 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Neural network architectures with memory and attention mechanisms exhibit certain reasoning capabilities required for question answering. One such architecture, the dynamic memory network (DMN), obtained high accuracy on a variety of language tasks. However, it was not shown whether the architecture achieves strong results for question answering when supporting facts are not marked during training or whether it could be applied to other modalities such as images. Based on an analysis of the DMN, we propose several improvements to its memory and input modules. Together with these changes we introduce a novel input module for images in order to be able to answer visual questions. Our new DMN+ model improves the state of the art on both the Visual Question Answering dataset and the bAbI-10k text question-answering dataset without supporting fact supervision.
Pages: 10
Related papers
50 in total
  • [1] Enhanced question understanding with dynamic memory networks for textual question answering
    Yue, Chunyi
    Cao, Hanqiang
    Xiong, Kun
    Cui, Anqi
    Qin, Haocheng
    Li, Ming
    EXPERT SYSTEMS WITH APPLICATIONS, 2017, 80 : 39 - 45
  • [2] Visual Question Answering using Hierarchical Dynamic Memory Networks
    Shang, Jiayu
    Li, Shiren
    Duan, Zhikui
    Huang, Junwei
    NINTH INTERNATIONAL CONFERENCE ON GRAPHIC AND IMAGE PROCESSING (ICGIP 2017), 2018, 10615
  • [3] Learning Visual Knowledge Memory Networks for Visual Question Answering
    Su, Zhou
    Zhu, Chen
    Dong, Yinpeng
    Cai, Dongqi
    Chen, Yurong
    Li, Jianguo
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 7736 - 7745
  • [4] Visual Question Answering with Memory-Augmented Networks
    Ma, Chao
    Shen, Chunhua
    Dick, Anthony
    Wu, Qi
    Wang, Peng
    van den Hengel, Anton
    Reid, Ian
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6975 - 6984
  • [5] Visual Question Answering with Textual Representations for Images
    Hirota, Yusuke
    Garcia, Noa
    Otani, Mayu
    Chu, Chenhui
    Nakashima, Yuta
    Taniguchi, Ittetsu
    Onoye, Takao
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 3147 - 3150
  • [6] Visual-Textual Semantic Alignment Network for Visual Question Answering
    Tian, Weidong
    Zhang, Yuzheng
    He, Bin
    Zhu, Junjun
    Zhao, Zhongqiu
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2021, PT V, 2021, 12895 : 259 - 270
  • [7] Multi visual and textual embedding on visual question answering for blind people
    Tung Le
    Huy Tien Nguyen
    Minh Le Nguyen
    NEUROCOMPUTING, 2021, 465 : 451 - 464
  • [8] Differential Networks for Visual Question Answering
    Wu, Chenfei
    Liu, Jinlai
    Wang, Xiaojie
    Li, Ruifan
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 8997 - 9004
  • [9] Ask me: A Question Answering System via Dynamic Memory Networks
    Yigit, Gulsum
    Amasyali, Mehmet Fatih
    2019 INNOVATIONS IN INTELLIGENT SYSTEMS AND APPLICATIONS CONFERENCE (ASYU), 2019, : 128 - 132
  • [10] Movie Question Answering via Textual Memory and Plot Graph
    Han, Yahong
    Wang, Bo
    Hong, Richang
    Wu, Fei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (03) : 875 - 887