Domain-specific image captioning: a comprehensive review

Cited by: 1

Authors:

Sharma, Himanshu [1]

Padha, Devanand [1]

Affiliation:

[1] Cent Univ Jammu, Dept Comp Sci & Informat Technol, Jammu 181124, Jammu & Kashmir, India

Source:

INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL | 2024, Vol. 13, Issue 02

Keywords:

Computer vision; Deep learning; Medical image captioning; Natural image captioning; Remote sensing image captioning; AUTOMATIC IMAGE; GENERATION; MODELS; RETRIEVAL; SPEECH;

DOI:

10.1007/s13735-024-00328-6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

An image caption is a sentence summarizing the semantic details of an image. Image captioning is a combined application of computer vision and natural language processing. Earlier research addressed this task with machine learning approaches, building image captioning frameworks on hand-engineered feature extraction techniques. With the resurgence of deep learning, improved and more efficient image captioning frameworks are on the rise. Image captioning is growing rapidly in domains such as medicine, remote sensing, security, visual assistance, and multimodal search engines. In this survey, we comprehensively study image captioning frameworks using our proposed domain-specific taxonomy. We explore the benchmark datasets and metrics used to train and evaluate image captioning models in various application domains. We also perform a comparative analysis of the reviewed models. Natural image captioning, medical image captioning, and remote sensing image captioning are currently among the most prominent application domains of image captioning. The efficacy of real-time image captioning remains a challenging obstacle that limits its deployment in sensitive areas such as visual aid, remote security, and healthcare. Further challenges include the scarcity of rich domain-specific datasets, training complexity, evaluation difficulty, and a lack of cross-domain knowledge transfer techniques. Despite the significant contributions made so far, additional effort is needed to develop robust and influential image captioning models.

Pages: 27
