IMAGE CAPTIONING WITH ATTRIBUTE REFINEMENT

被引:0
|
作者
Huang, Yiqing [1 ]
Li, Cong [1 ]
Li, Tianpeng [1 ]
Wan, Weitao [1 ]
Chen, Jiansheng [1 ]
机构
[1] Tsinghua Univ, Beijing 100084, Peoples R China
来源
2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2019年
基金
中国国家自然科学基金;
关键词
Image captioning; attribute recognition; Semantic attention; Deep Neural Network; Conditional Random Field;
D O I
10.1109/icip.2019.8803108
中图分类号
TB8 [摄影技术];
学科分类号
0804 ;
摘要
Semantic attention has long been adopted to image captioning models to enhance the image captioning performances. The models pre-trained for attribute recognition are utilized to generate image attributes in image captioning. Generally, these models are not jointly trained with image captioning models. In this paper, we propose attribute refinement network, which incorporates attribute recognition with image captioning to boost the performance on both tasks. We model the correlation between attributes with the semantic information from image captioning to improve the recognition accuracy. In turn, better attribute recognition results effectively enhance image captioning performance. Our model achieves CIDEr-D/SPICE scores of 115.1 and 20.9 respectively on the MS COCO test set, comprehensively yields improvement over all compared methods.
引用
收藏
页码:1820 / 1824
页数:5
相关论文
共 50 条
  • [31] Distance Transformer for Image Captioning
    Wang, Jiarong
    Lu, Tongwei
    Liu, Xuanxuan
    Yang, Qi
    2021 4TH INTERNATIONAL CONFERENCE ON ROBOTICS, CONTROL AND AUTOMATION ENGINEERING (RCAE 2021), 2021, : 73 - 76
  • [32] Boosting Image Captioning with Attributes
    Yao, Ting
    Pan, Yingwei
    Li, Yehao
    Qiu, Zhaofan
    Mei, Tao
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 4904 - 4912
  • [33] Image Captioning with Memorized Knowledge
    Chen, Hui
    Ding, Guiguang
    Lin, Zijia
    Guo, Yuchen
    Shan, Caifeng
    Han, Jungong
    COGNITIVE COMPUTATION, 2021, 13 (04) : 807 - 820
  • [34] Rotary Transformer for Image Captioning
    Qiu, Yile
    Zhu, Li
    SECOND INTERNATIONAL CONFERENCE ON OPTICS AND IMAGE PROCESSING (ICOIP 2022), 2022, 12328
  • [35] Object Hallucination in Image Captioning
    Rohrbach, Anna
    Hendricks, Lisa Anne
    Burns, Kaylee
    Darrell, Trevor
    Saenko, Kate
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 4035 - 4045
  • [36] Entangled Transformer for Image Captioning
    Li, Guang
    Zhu, Linchao
    Liu, Ping
    Yang, Yi
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8927 - 8936
  • [37] Explainability for Medical Image Captioning
    Beddiar, Djamila
    Oussalah, Mourad
    Tapio, Seppanen
    2022 ELEVENTH INTERNATIONAL CONFERENCE ON IMAGE PROCESSING THEORY, TOOLS AND APPLICATIONS (IPTA), 2022,
  • [38] Contrastive Learning for Image Captioning
    Dai, Bo
    Lin, Dahua
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [39] Image Captioning with Relational Knowledge
    Yang, Huan
    Song, Dandan
    Liao, Lejian
    PRICAI 2018: TRENDS IN ARTIFICIAL INTELLIGENCE, PT II, 2018, 11013 : 378 - 386
  • [40] Image Captioning for Thai Cultures
    Watcharabutsarakham, Sarin
    Marukatat, Sanparith
    Kiratiratanapruk, Kantip
    Temniranrat, Pitchayagan
    2022 17TH INTERNATIONAL JOINT SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING (ISAI-NLP 2022) / 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INTERNET OF THINGS (AIOT 2022), 2022,