Automatic image captioning combining natural language processing and deep neural networks

Cited by: 10
Authors
Rinaldi, Antonio M. [1]
Russo, Cristiano [1]
Tommasino, Cristian [1]
Affiliations
[1] Univ Naples Federico II, Dept Elect Engn & Informat Technol, IKNOS LAB Intelligent & Knowledge Syst LUPT, Via Claudio 21, I-80125 Naples, Italy
Keywords
Object detection; Image captioning; Deep neural networks; Semantic-instance segmentation
DOI
10.1016/j.rineng.2023.101107
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
An image contains a large amount of information that humans can detect in a very short time. Image captioning aims to capture this information by describing the image content through image and text processing techniques. A distinctive feature of the proposed approach is the combination of multiple networks to capture as many semantically distinct features as possible. In this work, our goal is to show that a strategy combining existing methods can efficiently improve performance in object detection tasks compared with the performance achieved by each method tested individually. The approach uses different deep neural networks that perform two levels of hierarchical object detection in an image. The results are combined and used by a captioning module that generates image captions through natural language processing techniques. Several experimental results are reported and discussed to show the effectiveness of our framework. The combination strategy also improves results, showing a gain in precision over single models.
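The pipeline outlined in the abstract (several detectors, combined results, a captioning step) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `combine_detections` and `make_caption`, the `min_score` threshold, the max-score merging rule, and the template caption are all assumptions made for illustration, and the two-level hierarchical detection is not modeled.

```python
from collections import defaultdict


def combine_detections(model_outputs, min_score=0.5):
    """Merge (label, score) detections from several detectors.

    Assumed rule (hypothetical, not from the paper): take the union of
    labels and keep, for each label, the highest score any detector gave it.
    Detections below min_score are ignored.
    """
    best = defaultdict(float)
    for detections in model_outputs:
        for label, score in detections:
            if score >= min_score:
                best[label] = max(best[label], score)
    # Sort by descending score, then by label, so the output is deterministic.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


def make_caption(combined):
    """Build a simple template caption from the merged labels (illustrative only)."""
    labels = [label for label, _ in combined]
    if not labels:
        return "An image with no recognized objects."
    if len(labels) == 1:
        return f"An image showing a {labels[0]}."
    return "An image showing a " + ", a ".join(labels[:-1]) + f" and a {labels[-1]}."


# Hypothetical outputs of two detectors for the same image.
model_a = [("dog", 0.9), ("ball", 0.4)]
model_b = [("dog", 0.7), ("ball", 0.8), ("tree", 0.6)]

combined = combine_detections([model_a, model_b])
caption = make_caption(combined)
print(caption)
```

In this toy example the second detector recovers the ball and the tree that the first one misses or scores too low, which is the kind of complementary behavior a combination strategy relies on.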
Pages: 14