Quality Estimation for Image Captions Based on Large-scale Human Evaluations

Cited by: 0
Authors
Levinboim, Tomer [1 ]
Thapliyal, Ashish V. [1 ]
Sharma, Piyush [1 ]
Soricut, Radu [1 ]
Affiliations
[1] Google Res, Venice, CA 90291 USA
Keywords
LANGUAGE;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automatic image captioning has improved significantly over the last few years, but the problem is far from solved: state-of-the-art models still often produce low-quality captions when used in the wild. In this paper, we focus on the task of Quality Estimation (QE) for image captions, which attempts to model caption quality from a human perspective and without access to ground-truth references, so that it can be applied at prediction time to detect low-quality captions produced on previously unseen images. For this task, we develop a human evaluation process that collects coarse-grained caption annotations from crowdsourced users, and use it to collect a large-scale dataset spanning more than 600k caption quality ratings. We then carefully validate the quality of the collected ratings and establish baseline models for this new QE task. Finally, we collect fine-grained caption quality annotations from trained raters and use them to demonstrate that QE models trained on the coarse ratings can effectively detect and filter out low-quality image captions, thereby improving the user experience of captioning systems.
Pages: 3157 - 3166
Number of pages: 10
Related papers
50 in total
  • [41] Large-scale infographic image downsizing
    Ma, RH
    Singh, G
    ICIP: 2004 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-5, 2004: 1661 - 1664
  • [42] Large-Scale Evolution of Image Classifiers
    Real, Esteban
    Moore, Sherry
    Selle, Andrew
    Saxena, Saurabh
    Suematsu, Yutaka Leon
    Tan, Jie
    Le, Quoc, V
    Kurakin, Alex
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [43] STATE ESTIMATION OF LARGE-SCALE SYSTEMS
    CHEN, BS
    LU, HC
    INTERNATIONAL JOURNAL OF CONTROL, 1988, 47 (06) : 1613 - 1632
  • [44] Problems in Large-Scale Image Classification
    Guo, Yuchen
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017: 5038 - 5039
  • [45] Similarity Estimation for Large-Scale Human Action Video Data on Spark
    Xu, Weihua
    Uddin, Md Azher
    Dolgorsuren, Batjargal
    Akhond, Mostafijur Rahman
    Khan, Kifayat Ullah
    Hossain, Md Ibrahim
    Lee, Young-Koo
    APPLIED SCIENCES-BASEL, 2018, 8 (05):
  • [46] Practical large-scale latency estimation
    Szymaniak, Michal
    Presotto, David
    Pierre, Guillaume
    van Steen, Maarten
    COMPUTER NETWORKS, 2008, 52 (07) : 1343 - 1364
  • [47] Estimation of large-scale dimension densities
    Raab, C
    Kurths, J
    PHYSICAL REVIEW E, 2001, 64 (01): : 5
  • [48] Efficient estimation for large-scale linkage disequilibrium patterns of the human genome
    Huang, Xin
    Zhu, Tian-Neng
    Liu, Ying-Chao
    Qi, Guo-An
    Zhang, Jian-Nan
    Chen, Guo-Bo
    ELIFE, 2023, 12
  • [49] Considerations on large-scale evaluations in Brazil and the role of international organizations: efficiency and productivity x quality
    da Silva Oliveira, Quelli Cristina
    Coelho, Denila
    Castanha, Andre Paulo
    REVISTA ON LINE DE POLITICA E GESTAO EDUCACIONAL, 2015, (19): 238 - 255
  • [50] Large-Scale Street Space Quality Evaluation Based on Deep Learning Over Street View Image
    Liu, Mei
    Han, Longmei
    Xiong, Shanshan
    Qing, Linbo
    Ji, Haohao
    Peng, Yonghong
    IMAGE AND GRAPHICS, ICIG 2019, PT II, 2019, 11902 : 690 - 701