Multimodal Logical Inference System for Visual-Textual Entailment

Authors
Suzuki, Riko [1]
Yanaka, Hitomi [1,2]
Yoshikawa, Masashi [3]
Mineshima, Koji [1]
Bekki, Daisuke [1]
Affiliations
[1] Ochanomizu Univ, Tokyo, Japan
[2] RIKEN Ctr Adv Intelligence Project, Tokyo, Japan
[3] Nara Inst Sci & Technol, Nara, Japan
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A large body of recent research on multimodal inference across text and vision has aimed to obtain visually grounded word and sentence representations. In this paper, we use logic-based representations as unified meaning representations for texts and images and present an unsupervised multimodal logical inference system that can effectively prove entailment relations between them. We show that by combining semantic parsing and theorem proving, the system can handle semantically complex sentences for visual-textual inference.
Pages: 386-392
Page count: 7