c-RNN: A Fine-Grained Language Model for Image Captioning

Cited by: 9
Authors
Huang, Gengshi [1]
Hu, Haifeng [1]
Affiliation
[1] Sun Yat Sen Univ, Sch Elect & Informat Engn, Guangzhou 510006, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image captioning; Character-level; Convolutional Neural Network; Recurrent Neural Network; Sequence learning;
DOI
10.1007/s11063-018-9836-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Previous captioning methods based on the conventional deep Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) architecture follow the translation-system approach and use word-level modelling. Word-level modelling, however, depends on a good word segmentation algorithm to split sentences into words, which is a very difficult task. In this paper, we build a character-level RNN (c-RNN) that models captions directly as character sequences, so that a descriptive sentence is composed as a flow of characters. The c-RNN performs the language task at a finer level and naturally avoids the word segmentation issue. It enables the language model to reason dynamically about word spelling as well as grammatical rules, which results in expressive and elaborate sentences. We optimize the parameters of the neural networks by maximizing the probability of the correctly generated character-level sentences. Quantitative and qualitative experiments on the popular MSCOCO and Flickr30k datasets show that our c-RNN can describe images at considerably faster speed and with satisfactory quality.
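As a rough illustration of the character-level idea in the abstract, the sketch below encodes a caption as a sequence of character indices and forms next-character training pairs, with no word segmentation step. It is not the authors' implementation: the boundary symbols `<s>` and `</s>`, the helper names, and the toy captions are all assumptions made for this example.

```python
# Hypothetical sketch of character-level caption encoding.
# The boundary symbols and function names are assumptions, not from the paper.
START, END = "<s>", "</s>"


def build_char_vocab(captions):
    """Map every character seen in the captions (plus boundaries) to an index."""
    chars = sorted({ch for cap in captions for ch in cap})
    symbols = [START, END] + chars
    return {sym: idx for idx, sym in enumerate(symbols)}


def encode_caption(caption, vocab):
    """Turn a caption into character indices; spaces are ordinary characters."""
    return [vocab[START]] + [vocab[ch] for ch in caption] + [vocab[END]]


def training_pairs(ids):
    """Pair each input character with the next character as its target."""
    return list(zip(ids[:-1], ids[1:]))


captions = ["a dog runs", "a cat sits"]  # toy data for illustration only
vocab = build_char_vocab(captions)
ids = encode_caption(captions[0], vocab)
pairs = training_pairs(ids)
```

A language model trained on such pairs would maximize the sum of log-probabilities of each target character given the preceding characters and the image features, which is one way to read the likelihood objective mentioned in the abstract.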
Pages: 683-691
Page count: 9