Image and audio caps: automated captioning of background sounds and images using deep learning

Cited by: 32
Authors
Poongodi, M. [1]
Hamdi, Mounir [1]
Wang, Huihui [2]
Affiliations
[1] Hamad Bin Khalifa Univ, Coll Sci & Engn, Dept Informat & Comp Technol, Doha, Qatar
[2] St Bonaventure Univ, CyberSecur Program, St Bonaventure, NY 14778 USA
Keywords
Computer vision; Image to caption; Scene recognition; Image analysis; Social networks; IEEE-802.11; WLANS; ALGORITHM; CONTENTION; NETWORKING; SYSTEM;
DOI
10.1007/s00530-022-00902-0
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Computer-based image recognition is something people have been working on for many years. It is one of the most difficult tasks in computer science, and improvements are being made as we speak. In this paper, we propose a methodology that automatically generates an appropriate caption for an image and adds a fitting sound to it. Two models have been extensively trained and combined to achieve this effect. Sounds are recommended based on the image scene, and the captions are generated using a combination of natural language processing and state-of-the-art computer vision models. A Top-5 accuracy of 67% and a Top-1 accuracy of 53% have been achieved. It is also worth noting that this is the first model of its kind to make this kind of prediction.
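The Top-1 and Top-5 figures quoted in the abstract are top-k accuracies: a prediction counts as correct when the true class is among the k highest-scoring classes. The sketch below illustrates that metric only; it is not the authors' evaluation code, and the function name `top_k_accuracy` and the toy scores and labels are hypothetical.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Hypothetical toy data (not from the paper): 3 samples, 4 scene classes.
toy_scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
]
toy_labels = [1, 1, 0]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

Top-k accuracy can never decrease as k grows, which is consistent with the reported Top-5 value being higher than the Top-1 value.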
Pages: 2951-2959
Page count: 9
Related papers
50 in total
  • [1] Image and audio caps: automated captioning of background sounds and images using deep learning
    M. Poongodi
    Mounir Hamdi
    Huihui Wang
    Multimedia Systems, 2023, 29 : 2951 - 2959
  • [2] Image Captioning using Deep Learning
    Jain, Yukti Sanjay
    Dhopeshwar, Tanisha
    Chadha, Supreet Kaur
    Pagire, Vrushali
    2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL PERFORMANCE EVALUATION (COMPE-2021), 2021,
  • [3] Image Captioning Using Deep Learning
    Adithya, Paluvayi Veera
    Kalidindi, Mourya Viswanadh
    Swaroop, Nallani Jyothi
    Vishwas, H. N.
    ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2023, PT III, 2024, 2092 : 42 - 58
  • [4] Automated Image Captioning Using Sparrow Search Algorithm With Improved Deep Learning Model
    Arasi, Munya A.
    Alshahrani, Haya Mesfer
    Alruwais, Nuha
    Motwakel, Abdelwahed
    Ahmed, Noura Abdelaziz
    Mohamed, Abdullah
    IEEE ACCESS, 2023, 11 : 104633 - 104642
  • [5] Arabic Captioning for Images of Clothing Using Deep Learning
    Al-Malki, Rasha Saleh
    Al-Aama, Arwa Yousuf
    SENSORS, 2023, 23 (08)
  • [6] Vision to Language: Captioning Images using Deep Learning
    Charu, Shreyasi
    Mishra, S. P.
    Gandhi, Tapan
    2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND SIGNAL PROCESSING (AISP), 2020,
  • [7] VIDEO CAPTIONING BASED ON JOINT IMAGE-AUDIO DEEP LEARNING TECHNIQUES
    Wang, Chien-Yao
    Liaw, Pei-Sin
    Liang, Kai-Wen
    Wang, Jai-Ching
    Chang, Pao-Chi
    2019 IEEE 9TH INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE-BERLIN), 2019, : 127 - 131
  • [8] Metaheuristics Optimization with Deep Learning Enabled Automated Image Captioning System
    Al Duhayyim, Mesfer
    Alazwari, Sana
    Mengash, Hanan Abdullah
    Marzouk, Radwa
    Alzahrani, Jaber S.
    Mahgoub, Hany
    Althukair, Fahd
    Salama, Ahmed S.
    APPLIED SCIENCES-BASEL, 2022, 12 (15):
  • [9] Modeling of Hyperparameter Tuned Deep Learning Model for Automated Image Captioning
    Omri, Mohamed
    Abdel-Khalek, Sayed
    Khalil, Eied M.
    Bouslimi, Jamel
    Joshi, Gyanendra Prasad
    MATHEMATICS, 2022, 10 (03)
  • [10] Image and Video Captioning for Apparels Using Deep Learning
    Agarwal, Govind
    Jindal, Kritika
    Chowdhury, Abishi
    Singh, Vishal K.
    Pal, Amrit
    IEEE ACCESS, 2024, 12 : 113138 - 113150