From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering

被引:17
|
作者
Li, Jiangtong [1 ]
Niu, Li [1 ]
Zhang, Liqing [1 ]
机构
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, MoE Key Lab Artificial Intelligence, Shanghai, Peoples R China
基金
国家重点研发计划; 美国国家科学基金会;
关键词
D O I
10.1109/CVPR52688.2022.02059
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Video understanding has achieved great success in representation learning, such as video caption, video object grounding, and video descriptive question-answer. However, current methods still struggle on video reasoning, including evidence reasoning and commonsense reasoning. To facilitate deeper video understanding towards video reasoning, we present the task of Causal-VidQA, which includes four types of questions ranging from scene description (description) to evidence reasoning (explanation) and commonsense reasoning (prediction and counterfactual). For commonsense reasoning, we set up a two-step solution by answering the question and providing a proper reason. Through extensive experiments on existing VideoQA methods, we find that the state-of-the-art methods are strong in descriptions but weak in reasoning. We hope that Causal-VidQA can guide the research of video understanding from representation learning to deeper reasoning. The dataset and related resources are available at https://github.com/bcmi/Causal-VidQA.git.
引用
收藏
页码:21241 / 21250
页数:10
相关论文
共 50 条
  • [1] PathReasoner: Explainable reasoning paths for commonsense question answering
    Zhan, Xunlin
    Huang, Yinya
    Dong, Xiao
    Cao, Qingxing
    Liang, Xiaodan
    KNOWLEDGE-BASED SYSTEMS, 2022, 235
  • [2] PathReasoner: Explainable reasoning paths for commonsense question answering
    Zhan, Xunlin
    Huang, Yinya
    Dong, Xiao
    Cao, Qingxing
    Liang, Xiaodan
    Knowledge-Based Systems, 2022, 235
  • [3] Conversational AI : Open Domain Question Answering and Commonsense Reasoning
    Basu, Kinjal
    ELECTRONIC PROCEEDINGS IN THEORETICAL COMPUTER SCIENCE, 2019, (306): : 396 - 402
  • [4] Choice-Driven Contextual Reasoning for Commonsense Question Answering
    Deng, Wenqing
    Wang, Zhe
    Wang, Kewen
    Zhang, Xiaowang
    Feng, Zhiyong
    PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT II, 2022, 13630 : 335 - 346
  • [5] Video Question Answering With Semantic Disentanglement and Reasoning
    Liu, Jin
    Wang, Guoxiang
    Xie, Jialong
    Zhou, Fengyu
    Xu, Huijuan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3663 - 3673
  • [6] Heterogeneous-Graph Reasoning With Context Paraphrase for Commonsense Question Answering
    Wang, Yujie
    Zhang, Hu
    Liang, Jiye
    Li, Ru
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 3759 - 3770
  • [7] Meta-path reasoning of knowledge graph for commonsense question answering
    Zhang, Miao
    He, Tingting
    Dong, Ming
    FRONTIERS OF COMPUTER SCIENCE, 2024, 18 (01)
  • [8] Retrieval-Augmented Knowledge Graph Reasoning for Commonsense Question Answering
    Sha, Yuchen
    Feng, Yujian
    He, Miao
    Liu, Shangdong
    Ji, Yimu
    MATHEMATICS, 2023, 11 (15)
  • [9] Dynamic Heterogeneous-Graph Reasoning with Language Models and Knowledge Representation Learning for Commonsense Question Answering
    Wang, Yujie
    Zhang, Hu
    Liang, Jiye
    Li, Ru
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 14048 - 14063
  • [10] Knowledge Graph Question-Answering Based on Link Reasoning for Electrical Equipment
    Xin, Rui
    Zhang, Pengfei
    Chen, Xi
    Peng, Jiao
    Liu, Haifeng
    PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON POWER ELECTRONICS AND ARTIFICIAL INTELLIGENCE, PEAI 2024, 2024, : 594 - 600