Exploring Models and Data for Image Question Answering

Cited by: 0

Authors:

Ren, Mengye [1]

Kiros, Ryan [1]

Zemel, Richard S. [1,2]

Affiliations:

[1] Univ Toronto, Toronto, ON, Canada

[2] Canadian Inst Adv Res, Quebec City, PQ, Canada

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015) | 2015 / Vol. 28

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This work aims to address the problem of image-based question-answering (QA) with new models and datasets. In our work, we propose to use neural networks and visual semantic embeddings, without intermediate stages such as object detection and image segmentation, to predict answers to simple questions about images. Our model performs 1.8 times better than the only published results on an existing image QA dataset. We also present a question generation algorithm that converts image descriptions, which are widely available, into QA form. We used this algorithm to produce an order-of-magnitude larger dataset, with more evenly distributed answers. A suite of baseline results on this new dataset are also presented.


Pages: 9
