ICEAP: An advanced fine-grained image captioning network with enhanced attribute predictor

Cited by: 3
Authors
Hossen, Md. Bipul [1]
Ye, Zhongfu [1]
Abdussalam, Amr [1]
Hossain, Mohammad Alamgir [1]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230027, Anhui, Peoples R China
Keywords
Fine-grained image caption; Attention mechanism; Encoder-decoder; Independent attribute predictor; Enhanced attribute predictor
DOI
10.1016/j.displa.2024.102798
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Fine-grained image captioning is a focal point in the vision-to-language task and has attracted considerable attention for generating accurate and contextually relevant image captions. Effective attribute prediction and their utilization play a crucial role in enhancing image captioning performance. Despite progress in prior attribute-related methods, they either focus on predicting attributes related to the input image or concentrate on predicting linguistic context-related attributes at each time step in the language model. However, these approaches often overlook the importance of balancing visual and linguistic contexts, leading to ineffective exploitation of semantic information and a subsequent decline in performance. To address these issues, an Independent Attribute Predictor (IAP) is introduced to precisely predict attributes related to the input image by leveraging relationships between visual objects and attribute embeddings. Following this, an Enhanced Attribute Predictor (EAP) is proposed, initially predicting linguistic context-related attributes and then using prior probabilities from the IAP module to rebalance image and linguistic context-related attributes, thereby generating more robust and enhanced attribute probabilities. These refined attributes are then integrated into the language LSTM layer to ensure accurate word prediction at each time step. The integration of the IAP and EAP modules in our proposed image captioning with the enhanced attribute predictor (ICEAP) model effectively incorporates high-level semantic details, enhancing overall model performance. The ICEAP outperforms contemporary models, yielding significant average improvements of 10.62% in CIDEr-D scores for MS-COCO, 9.63% for Flickr30K and 7.74% for Flickr8K datasets using cross-entropy optimization, with qualitative analysis confirming its ability to generate fine-grained captions.
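The rebalancing step described in the abstract can be illustrated with a small sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the function name `rebalance`, the mixing weight `alpha`, the use of a sigmoid over per-step logits, and the convex combination itself are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def rebalance(prior_probs, context_logits, alpha=0.5):
    """Hypothetical rebalancing of attribute probabilities.

    prior_probs:    image-level attribute probabilities, standing in for
                    the output of an IAP-like module (assumed given).
    context_logits: per-time-step attribute logits, standing in for the
                    linguistic-context prediction (assumed given).
    alpha:          assumed mixing weight; alpha=1 trusts only the image
                    prior, alpha=0 trusts only the linguistic context.
    """
    if len(prior_probs) != len(context_logits):
        raise ValueError("prior_probs and context_logits must align")
    context_probs = [sigmoid(z) for z in context_logits]
    # Convex combination: an illustrative choice, not the paper's formula.
    return [alpha * p + (1.0 - alpha) * c
            for p, c in zip(prior_probs, context_probs)]


# Three toy attributes at one decoding step.
prior = [0.9, 0.1, 0.5]
logits = [0.0, 0.0, 2.0]
enhanced = rebalance(prior, logits, alpha=0.5)
print(enhanced)
```

With a convex combination, each output stays between the image prior and the context probability for that attribute, which is one simple way to keep either source from dominating. The published model may combine them quite differently.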
Pages: 18
Related papers
50 in total
  • [41] Personalized Image Aesthetics Assessment with Attribute-guided Fine-grained Feature Representation
    Zhu, Hancheng
    Shao, Zhiwen
    Zhou, Yong
    Wang, Guangcheng
    Chen, Pengfei
    Li, Leida
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 6794 - 6802
  • [42] Learning enhanced features and inferring twice for fine-grained image classification
    Nie, Xuan
    Chai, Bosong
    Wang, Luyao
    Liao, Qiyu
    Xu, Min
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (10) : 14799 - 14813
  • [43] Fine-Grained Secure Attribute-Based Encryption
    Wang, Yuyu
    Pan, Jiaxin
    Chen, Yu
    JOURNAL OF CRYPTOLOGY, 2023, 36 (04)
  • [44] Lane Attribute Classification Based on Fine-Grained Description
    He, Zhonghe
    Gong, Pengfei
    Ye, Hongcheng
    Gan, Zizheng
    SENSORS, 2024, 24 (15)
  • [45] Fine-Grained Recommendation Systems for Service Attribute Exchange
    Staite, Christopher
    Bahsoon, Rami
    Wolak, Stephen
    SERVICE-ORIENTED COMPUTING - ICSOC 2009, PROCEEDINGS, 2009, 5900 : 352 - +
  • [46] Learning enhanced features and inferring twice for fine-grained image classification
    Xuan Nie
    Bosong Chai
    Luyao Wang
    Qiyu Liao
    Min Xu
    Multimedia Tools and Applications, 2023, 82 : 14799 - 14813
  • [47] A Method of Pedestrian Fine-grained Attribute Detection and Recognition
    Ma, Xianqin
    Yu, Chongchong
    Yang, Xin
    Chen, Xiuxin
    Chen, Jianzhang
    Zhou, Lan
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [48] Fine-Grained Secure Attribute-Based Encryption
    Yuyu Wang
    Jiaxin Pan
    Yu Chen
    Journal of Cryptology, 2023, 36
  • [49] Fine-grained Action Recognition using Attribute Vectors
    Yenduri, Sravani
    Perveen, Nazil
    Chalavadi, Vishnu
    Mohan, C. Krishna
    PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2022, : 134 - 143
  • [50] Fine-Grained Secure Attribute-Based Encryption
    Wang, Yuyu
    Pan, Jiaxin
    Chen, Yu
    ADVANCES IN CRYPTOLOGY - CRYPTO 2021, PT IV, 2021, 12828 : 179 - 207