Multimodal Logical Inference System for Visual-Textual Entailment

Cited by: 0

Authors:

Suzuki, Riko [1]

Yanaka, Hitomi [1,2]

Yoshikawa, Masashi [3]

Mineshima, Koji [1]

Bekki, Daisuke [1]

Affiliations:

[1] Ochanomizu Univ, Tokyo, Japan

[2] RIKEN Ctr Adv Intelligence Project, Tokyo, Japan

[3] Nara Inst Sci & Technol, Nara, Japan

Source:

57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP | 2019

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A large amount of research about multimodal inference across text and vision has been recently developed to obtain visually grounded word and sentence representations. In this paper, we use logic-based representations as unified meaning representations for texts and images and present an unsupervised multimodal logical inference system that can effectively prove entailment relations between them. We show that by combining semantic parsing and theorem proving, the system can handle semantically complex sentences for visual-textual inference.


Pages: 386-392

Page count: 7
