Exploring Models and Data for Image Question Answering

Authors
Ren, Mengye [1]
Kiros, Ryan [1]
Zemel, Richard S. [1, 2]
Affiliations
[1] Univ Toronto, Toronto, ON, Canada
[2] Canadian Inst Adv Res, Quebec City, PQ, Canada
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work aims to address the problem of image-based question-answering (QA) with new models and datasets. In our work, we propose to use neural networks and visual semantic embeddings, without intermediate stages such as object detection and image segmentation, to predict answers to simple questions about images. Our model performs 1.8 times better than the only published results on an existing image QA dataset. We also present a question generation algorithm that converts image descriptions, which are widely available, into QA form. We used this algorithm to produce an order-of-magnitude larger dataset, with more evenly distributed answers. A suite of baseline results on this new dataset is also presented.
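The abstract does not spell out how descriptions are converted into QA form. As a rough illustration only, the sketch below turns "colour + noun" phrases in a caption into colour questions. The function name `color_questions`, the `COLORS` vocabulary, and the question template are all assumptions made for this example and are not taken from the record; a real system would likely rely on a syntactic parser rather than adjacent-word matching.

```python
import re

# Hypothetical colour vocabulary; an assumption for illustration only.
COLORS = {"red", "green", "blue", "yellow", "white", "black", "brown", "orange"}


def color_questions(description):
    """Turn '<colour> <word>' phrases in a caption into (question, answer) pairs."""
    # Lowercase the caption and keep alphabetic tokens only.
    words = re.findall(r"[a-z]+", description.lower())
    pairs = []
    # Look at each adjacent word pair: a colour followed by a non-colour word.
    for adjective, noun in zip(words, words[1:]):
        if adjective in COLORS and noun not in COLORS:
            pairs.append((f"what is the color of the {noun}?", adjective))
    return pairs


if __name__ == "__main__":
    for question, answer in color_questions(
        "A red bus is parked next to a white building."
    ):
        print(question, "->", answer)
```

Because the sketch only inspects adjacent words, it would mislabel phrases where the colour is followed by another modifier; it is meant to show the description-to-QA idea, not to reproduce the paper's algorithm.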
Pages: 9