IMAGE CAPTIONING WITH ATTRIBUTE REFINEMENT

Cited by: 0

Authors:

Huang, Yiqing [1]

Li, Cong [1]

Li, Tianpeng [1]

Wan, Weitao [1]

Chen, Jiansheng [1]

Affiliation:

[1] Tsinghua Univ, Beijing 100084, Peoples R China

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2019

Funding:

National Natural Science Foundation of China;

Keywords:

Image captioning; attribute recognition; Semantic attention; Deep Neural Network; Conditional Random Field;

DOI:

10.1109/icip.2019.8803108

Chinese Library Classification (CLC) number:

TB8 [Photographic technology];

Subject classification code:

0804 ;

Abstract:

Semantic attention has long been used in image captioning models to improve captioning performance. Models pre-trained for attribute recognition are used to generate image attributes for captioning. Generally, these models are not trained jointly with the image captioning models. In this paper, we propose an attribute refinement network, which combines attribute recognition with image captioning to boost performance on both tasks. We model the correlation between attributes using semantic information from image captioning to improve recognition accuracy. In turn, better attribute recognition results effectively enhance image captioning performance. Our model achieves CIDEr-D and SPICE scores of 115.1 and 20.9, respectively, on the MS COCO test set, improving over all compared methods.

Citation

Pages: 1820-1824

Number of pages: 5
