Image and audio caps: automated captioning of background sounds and images using deep learning

Cited by: 32

Authors:

Poongodi, M. [1]

Hamdi, Mounir [1]

Wang, Huihui [2]

Affiliations:

[1] Hamad Bin Khalifa Univ, Coll Sci & Engn, Dept Informat & Comp Technol, Doha, Qatar

[2] St Bonaventure Univ, CyberSecur Program, St Bonaventure, NY 14778 USA

Source:

MULTIMEDIA SYSTEMS | 2023 / Vol. 29 / Issue 05

Keywords:

Computer vision; Image to caption; Scene recognition; Image analysis; Social networks; IEEE-802.11; WLANS; ALGORITHM; CONTENTION; NETWORKING; SYSTEM;

DOI:

10.1007/s00530-022-00902-0

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Computer-based image recognition has been studied for many years. It remains one of the most difficult tasks in computer science, and improvements are still being made. In this paper, we propose a methodology that automatically suggests an appropriate title for an image and adds a specific sound to it. Two models have been extensively trained and combined to achieve this effect. Sounds are recommended based on the image scene, and the titles are generated using a combination of natural language processing and state-of-the-art computer vision models. A Top-5 accuracy of 67% and a Top-1 accuracy of 53% have been achieved. It is also the first model of its kind to make this prediction.
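The abstract reports Top-1 and Top-5 accuracy but does not say how they were computed. The sketch below uses the standard definition, in which a prediction counts as correct when the true class is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy scores and labels are hypothetical and are not taken from the paper.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-class score lists, one list per sample (hypothetical data).
    labels: list of true class indices, one per sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest score to lowest.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with 3 classes and 4 samples (illustrative values only).
scores = [
    [0.1, 0.6, 0.3],  # true class 1: ranked first
    [0.5, 0.2, 0.3],  # true class 2: ranked second
    [0.7, 0.2, 0.1],  # true class 1: ranked second
    [0.2, 0.3, 0.5],  # true class 0: ranked last
]
labels = [1, 2, 1, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"Top-1: {top1:.2f}, Top-2: {top2:.2f}")
```

Under this definition Top-k accuracy can only stay the same or rise as k grows, which is consistent with the reported Top-5 figure (67%) being higher than the Top-1 figure (53%).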


Pages: 2951-2959

Page count: 9

50 records in total

[1] Image and audio caps: automated captioning of background sounds and images using deep learning
M. Poongodi
Mounir Hamdi
Huihui Wang
Multimedia Systems, 2023, 29 : 2951 - 2959
[2] Image Captioning using Deep Learning
Jain, Yukti Sanjay
Dhopeshwar, Tanisha
Chadha, Supreet Kaur
Pagire, Vrushali
2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL PERFORMANCE EVALUATION (COMPE-2021), 2021
[3] Image Captioning Using Deep Learning
Adithya, Paluvayi Veera
Kalidindi, Mourya Viswanadh
Swaroop, Nallani Jyothi
Vishwas, H. N.
ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2023, PT III, 2024, 2092 : 42 - 58
[4] Automated Image Captioning Using Sparrow Search Algorithm With Improved Deep Learning Model
Arasi, Munya A.
Alshahrani, Haya Mesfer
Alruwais, Nuha
Motwakel, Abdelwahed
Ahmed, Noura Abdelaziz
Mohamed, Abdullah
IEEE ACCESS, 2023, 11 : 104633 - 104642
[5] Arabic Captioning for Images of Clothing Using Deep Learning
Al-Malki, Rasha Saleh
Al-Aama, Arwa Yousuf
SENSORS, 2023, 23 (08)
[6] Vision to Language: Captioning Images using Deep Learning
Charu, Shreyasi
Mishra, S. P.
Gandhi, Tapan
2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND SIGNAL PROCESSING (AISP), 2020
[7] VIDEO CAPTIONING BASED ON JOINT IMAGE-AUDIO DEEP LEARNING TECHNIQUES
Wang, Chien-Yao
Liaw, Pei-Sin
Liang, Kai-Wen
Wang, Jai-Ching
Chang, Pao-Chi
2019 IEEE 9TH INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE-BERLIN), 2019, : 127 - 131
[8] Metaheuristics Optimization with Deep Learning Enabled Automated Image Captioning System
Al Duhayyim, Mesfer
Alazwari, Sana
Mengash, Hanan Abdullah
Marzouk, Radwa
Alzahrani, Jaber S.
Mahgoub, Hany
Althukair, Fahd
Salama, Ahmed S.
APPLIED SCIENCES-BASEL, 2022, 12 (15)
[9] Modeling of Hyperparameter Tuned Deep Learning Model for Automated Image Captioning
Omri, Mohamed
Abdel-Khalek, Sayed
Khalil, Eied M.
Bouslimi, Jamel
Joshi, Gyanendra Prasad
MATHEMATICS, 2022, 10 (03)
[10] Image and Video Captioning for Apparels Using Deep Learning
Agarwal, Govind
Jindal, Kritika
Chowdhury, Abishi
Singh, Vishal K.
Pal, Amrit
IEEE ACCESS, 2024, 12 : 113138 - 113150
