Reasoning Visual Dialogs with Structural and Partial Observations

Cited by: 74
Authors
Zheng, Zilong [1]
Wang, Wenguan [1, 2]
Qi, Siyuan [1, 3]
Zhu, Song-Chun [1, 3]
Affiliations
[1] Univ Calif Los Angeles, Los Angeles, CA 90095 USA
[2] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
[3] Int Ctr AI & Robot Auton CARA, Los Angeles, CA USA
DOI
10.1109/CVPR.2019.00683
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a novel model to address the task of Visual Dialog, which exhibits complex dialog structures. To obtain a reasonable answer based on the current question and the dialog history, the underlying semantic dependencies between dialog entities are essential. In this paper, we explicitly formalize this task as inference in a graphical model with partially observed nodes and unknown graph structures (relations in the dialog). The given dialog entities are viewed as the observed nodes. The answer to a given question is represented by a node with a missing value. We first introduce an Expectation Maximization algorithm to infer both the underlying dialog structures and the missing node values (the desired answers). Building on this, we propose a differentiable graph neural network (GNN) solution that approximates this process. Experimental results on the VisDial and VisDial-Q datasets show that our model outperforms comparative methods. We also observe that our method can infer the underlying dialog structure for better dialog reasoning.
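The alternating inference described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm or their GNN: it is a minimal illustration that assumes each node is a plain feature vector, that the edge weights are a softmax over dot products, and that the missing node is updated as a weighted average of the observed nodes. The names `softmax`, `dot`, and `infer_missing_node` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def infer_missing_node(observed, n_iters=10):
    """Toy alternating inference (an assumption, not the paper's method).

    observed: list of feature vectors for the observed nodes.
    Returns (weights, missing): soft edge weights from the missing node
    to each observed node, and the estimated value of the missing node.
    """
    dim = len(observed[0])
    missing = [0.0] * dim  # uninformative starting value
    weights = [1.0 / len(observed)] * len(observed)
    for _ in range(n_iters):
        # Structure step: soft edges from the missing node to observed nodes.
        weights = softmax([dot(missing, node) for node in observed])
        # Value step: re-estimate the missing node from its weighted neighbors.
        missing = [
            sum(w * node[k] for w, node in zip(weights, observed))
            for k in range(dim)
        ]
    return weights, missing


if __name__ == "__main__":
    history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    edge_weights, answer = infer_missing_node(history)
    print(edge_weights, answer)
```

In this sketch, the soft edge weights play the role of the unknown dialog structure and the averaged vector plays the role of the missing answer node. A learned model would replace both steps with trainable functions.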
Pages: 3662 / 6671
Page count: 3010
Related papers
50 in total
  • [21] Intelligent Visual Reasoning Tutor: an Intelligent Tutoring System for Visual Reasoning in Engineering & Architecture
    Kim, Yong Se
    Wang, Eric
    INTERNATIONAL JOURNAL OF ENGINEERING EDUCATION, 2009, 25 (04) : 701 - 711
  • [22] Observations, Simulations, and Reasoning in Astrophysics
    Jacquart, Melissa
    PHILOSOPHY OF SCIENCE, 2020, 87 (05) : 1209 - 1220
  • [23] Transformation Driven Visual Reasoning
    Hong, Xin
    Lan, Yanyan
    Pang, Liang
    Guo, Jiafeng
    Cheng, Xueqi
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 6899 - 6908
  • [24] THE ROLE OF VISUAL IMAGERY IN REASONING
    Bowers, Henry
    BRITISH JOURNAL OF PSYCHOLOGY-GENERAL SECTION, 1935, 25 : 436 - 446
  • [25] Visual Explanations of Probabilistic Reasoning
    Erwig, Martin
    Walkingshaw, Eric
    2009 IEEE SYMPOSIUM ON VISUAL LANGUAGES AND HUMAN-CENTRIC COMPUTING, PROCEEDINGS, 2009, : 23 - 27
  • [26] Visual Reasoning in Science and Mathematics
    Bueno, Otavio
    MODEL-BASED REASONING IN SCIENCE AND TECHNOLOGY: LOGICAL, EPISTEMOLOGICAL, AND COGNITIVE ISSUES, 2016, 27 : 3 - 19
  • [27] A Benchmark for Compositional Visual Reasoning
    Zerroug, Aimen
    Vaishnav, Mohit
    Colin, Julien
    Musslick, Sebastian
    Serre, Thomas
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [28] Interpretable visual reasoning: A survey
    He, Feijuan
    Wang, Yaxian
    Miao, Xianglin
    Sun, Xia
    IMAGE AND VISION COMPUTING, 2021, 112
  • [30] Intelligent visual reasoning tutor
    Wang, E
    Kim, YS
    5th IEEE International Conference on Advanced Learning Technologies, Proceedings, 2005, : 511 - 515