Domain-specific image captioning: a comprehensive review

Cited by: 1

Authors:

Sharma, Himanshu [1]

Padha, Devanand [1]

Affiliations:

[1] Cent Univ Jammu, Dept Comp Sci & Informat Technol, Jammu 181124, Jammu & Kashmir, India

Source:

INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL | 2024, Vol. 13, No. 02

Keywords:

Computer vision; Deep learning; Medical image captioning; Natural image captioning; Remote sensing image captioning; AUTOMATIC IMAGE; GENERATION; MODELS; RETRIEVAL; SPEECH;

DOI:

10.1007/s13735-024-00328-6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

An image caption is a sentence summarizing the semantic details of an image, and generating one combines computer vision with natural language processing. Earlier research addressed this task with machine learning approaches, building image captioning frameworks on hand-engineered feature extraction techniques. With the resurgence of deep learning, the development of improved and more efficient image captioning frameworks is on the rise. Image captioning is growing rapidly in domains such as medicine, remote sensing, security, visual assistance, and multimodal search engines. In this survey, we comprehensively study image captioning frameworks according to our proposed domain-specific taxonomy. We explore the benchmark datasets and metrics used for training and evaluating image captioning models in various application domains, and we perform a comparative analysis of the reviewed models. Natural image captioning, medical image captioning, and remote sensing image captioning are currently among the most prominent application domains. Achieving effective real-time image captioning remains a challenging obstacle that limits deployment in sensitive areas such as visual aid, remote security, and healthcare. Further challenges include the scarcity of rich domain-specific datasets, training complexity, evaluation difficulty, and a lack of cross-domain knowledge transfer techniques. Despite the significant contributions made so far, additional effort is needed to develop robust and effective image captioning models.

Pages: 27

