Exploring Models and Data for Image Question Answering

Authors
Ren, Mengye [1]
Kiros, Ryan [1]
Zemel, Richard S. [1, 2]
Affiliations
[1] Univ Toronto, Toronto, ON, Canada
[2] Canadian Inst Adv Res, Quebec City, PQ, Canada
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work addresses the problem of image-based question answering (QA) with new models and datasets. We propose to use neural networks and visual semantic embeddings, without intermediate stages such as object detection and image segmentation, to predict answers to simple questions about images. Our model performs 1.8 times better than the only published results on an existing image QA dataset. We also present a question generation algorithm that converts widely available image descriptions into QA form. We used this algorithm to produce an order-of-magnitude larger dataset with more evenly distributed answers. A suite of baseline results on this new dataset is also presented.
Pages: 9