IMAGE CAPTIONING WITH ATTRIBUTE REFINEMENT

Times cited: 0
Authors
Huang, Yiqing [1]
Li, Cong [1]
Li, Tianpeng [1]
Wan, Weitao [1]
Chen, Jiansheng [1]
Affiliations
[1] Tsinghua Univ, Beijing 100084, Peoples R China
Source
2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019
Funding
National Natural Science Foundation of China
Keywords
Image captioning; Attribute recognition; Semantic attention; Deep neural network; Conditional random field
DOI
10.1109/icip.2019.8803108
Chinese Library Classification number
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
Semantic attention has long been adopted in image captioning models to improve captioning performance. Models pre-trained for attribute recognition are used to generate the image attributes for captioning. Generally, these models are not trained jointly with the image captioning models. In this paper, we propose an attribute refinement network, which incorporates attribute recognition into image captioning to boost performance on both tasks. We model the correlation between attributes using the semantic information from image captioning to improve recognition accuracy. In turn, better attribute recognition results effectively enhance image captioning performance. Our model achieves CIDEr-D and SPICE scores of 115.1 and 20.9, respectively, on the MS COCO test set, improving on all compared methods.
Pages: 1820-1824
Number of pages: 5