Image captioning with data augmentation using cropping and mask based on attention image

Cited by: 0
Authors
Iwamura K.
Louhi Kasahara J.Y.
Moro A.
Yamashita A.
Asama H.
Keywords
Attention; Cropping; Data augmentation; Deep learning; Mask
DOI
10.2493/jjspe.86.904
Abstract
Automatic image captioning has various important applications, such as describing image contents for the visually impaired. Most approaches use deep learning and have achieved remarkable results. However, some issues remain unresolved. One of them is the overfitting of the trained model to specific images, usually caused by limited training dataset sizes. To augment the training dataset in such scenarios, previous studies have proposed data augmentation using random cropping or masking. However, those methods do not specifically target overfitted regions in images and may therefore remove areas that are needed to generate captions, lowering performance. In this study, we propose a novel data augmentation method that uses attention to specifically target the image regions subject to overfitting. Experimental results show that the proposed method allows generation of better image captions. © 2020 Japan Society for Precision Engineering. All rights reserved.
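The abstract does not spell out how the attention image drives cropping and masking, so the following is only a minimal sketch of one possible reading, not the authors' procedure. It assumes the attention map has already been resized to the image resolution, treats pixels above a fraction of the peak attention as the targeted region, and produces one masked and one cropped variant. The function name `attention_mask_and_crop` and the parameters `threshold` and `fill_value` are hypothetical.

```python
import numpy as np


def attention_mask_and_crop(image, attention, threshold=0.5, fill_value=0):
    """Return (masked, cropped) variants of `image` driven by `attention`.

    image:     H x W x C array.
    attention: H x W array of non-negative attention weights, assumed to be
               already resized to the image resolution.
    threshold: fraction of the peak attention above which a pixel counts as
               highly attended (hypothetical parameter).
    """
    image = np.asarray(image)
    attention = np.asarray(attention, dtype=float)
    if attention.shape != image.shape[:2]:
        raise ValueError("attention must match the image height and width")

    peak = attention.max()
    if peak <= 0:
        # No attention signal: return unmodified copies.
        return image.copy(), image.copy()

    # Boolean H x W map of highly attended pixels.
    hot = attention >= threshold * peak

    # Mask variant: overwrite the highly attended pixels in every channel.
    masked = image.copy()
    masked[hot] = fill_value

    # Crop variant: bounding box around the highly attended pixels.
    rows = np.flatnonzero(hot.any(axis=1))
    cols = np.flatnonzero(hot.any(axis=0))
    top, bottom = rows[0], rows[-1] + 1
    left, right = cols[0], cols[-1] + 1
    cropped = image[top:bottom, left:right].copy()

    return masked, cropped
```

Whether the highly attended region should be hidden, kept, or both is a design choice the abstract leaves open; the sketch returns both variants so that either could be added to a training set alongside the original caption.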
Pages: 904-910
Number of pages: 6