IMAGE CAPTIONING WITH ATTRIBUTE REFINEMENT

Authors
Huang, Yiqing [1]
Li, Cong [1]
Li, Tianpeng [1]
Wan, Weitao [1]
Chen, Jiansheng [1]
Affiliations
[1] Tsinghua University, Beijing 100084, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Image captioning; Attribute recognition; Semantic attention; Deep neural network; Conditional random field
DOI
10.1109/icip.2019.8803108
Chinese Library Classification
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
Semantic attention has long been used in image captioning models to improve captioning performance. Models pre-trained for attribute recognition are used to generate the image attributes that the captioning model consumes. Generally, these models are not trained jointly with the captioning model. In this paper, we propose an attribute refinement network, which combines attribute recognition with image captioning to boost performance on both tasks. We model the correlation between attributes using semantic information from image captioning to improve recognition accuracy. In turn, better attribute recognition results effectively enhance image captioning performance. Our model achieves CIDEr-D and SPICE scores of 115.1 and 20.9, respectively, on the MS COCO test set, improving over all compared methods.
Pages: 1820-1824
Number of pages: 5