IMAGE CAPTIONING WITH ATTRIBUTE REFINEMENT

Cited by: 0

Authors:

Huang, Yiqing [1]

Li, Cong [1]

Li, Tianpeng [1]

Wan, Weitao [1]

Chen, Jiansheng [1]

Affiliations:

[1] Tsinghua Univ, Beijing 100084, Peoples R China

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2019

Funding:

National Natural Science Foundation of China;

Keywords:

Image captioning; attribute recognition; Semantic attention; Deep Neural Network; Conditional Random Field;

DOI:

10.1109/icip.2019.8803108

Chinese Library Classification (CLC) number:

TB8 [Photographic technology];

Discipline classification code:

0804;

Abstract:

Semantic attention has long been adopted in image captioning models to enhance captioning performance. Models pre-trained for attribute recognition are used to generate image attributes for captioning. Generally, these models are not jointly trained with the image captioning models. In this paper, we propose an attribute refinement network, which incorporates attribute recognition with image captioning to boost performance on both tasks. We model the correlation between attributes with the semantic information from image captioning to improve recognition accuracy. In turn, better attribute recognition results effectively enhance image captioning performance. Our model achieves CIDEr-D and SPICE scores of 115.1 and 20.9, respectively, on the MS COCO test set, improving over all compared methods.


Pages: 1820-1824

Page count: 5

50 items in total

[1] ATTRIBUTE CONDITIONED FASHION IMAGE CAPTIONING
Cai, Chen
Yap, Kim-Hui
Wang, Suchen
2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1921 - 1925
[2] Distinctive-Attribute Extraction for Image Captioning
Kim, Boeun
Lee, Young Han
Jung, Hyedong
Cho, Choongsang
COMPUTER VISION - ECCV 2018 WORKSHOPS, PT IV, 2019, 11132 : 133 - 144
[3] Toward Attribute-Controlled Fashion Image Captioning
Cai, Chen
Yap, Kim-Hui
Wang, Suchen
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (09)
[4] Transformer with token attention and attribute prediction for image captioning
Song, Lifei
Wang, Ying
Shi, Linsu
Yu, Jiazhong
Li, Fei
Xiang, Shiming
PATTERN RECOGNITION LETTERS, 2025, 188 : 74 - 80
[5] Controllable Image Captioning with Feature Refinement and Multilayer Fusion
Du, Sen
Zhu, Hong
Zhang, Yujia
Wang, Dong
Shi, Jing
Xing, Nan
Lin, Guangfeng
Zhou, Huiyu
APPLIED SCIENCES-BASEL, 2023, 13 (08)
[6] Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition
Wang, Yufei
Lin, Zhe
Shen, Xiaohui
Cohen, Scott
Cottrell, Garrison W.
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 7378 - 7387
[7] A Text-Guided Generation and Refinement Model for Image Captioning
Wang, Depeng
Hu, Zhenzhen
Zhou, Yuanen
Hong, Richang
Wang, Meng
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 2966 - 2977
[8] Feature refinement and rethinking attention for remote sensing image captioning
Li, Yunpeng
Tao, Chengjin
Liu, Meng
Zhang, Xiangrong
Wang, Guanchun
Zhang, Tianyang
Zhao, Dong
Wang, Dabao
SCIENTIFIC REPORTS, 2025, 15 (01)
[9] Attribute assisted teacher-critical training strategies for image captioning
Huang, Yiqing
Chen, Jiansheng
Ma, Huimin
Ma, Hongbing
Ouyang, Wanli
Yu, Cheng
NEUROCOMPUTING, 2022, 506 : 265 - 276
[10] Image Captioning With End-to-End Attribute Detection and Subsequent Attributes Prediction
Huang, Yiqing
Chen, Jiansheng
Ouyang, Wanli
Wan, Weitao
Xue, Youze
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 4013 - 4026
