Domain-specific image captioning: a comprehensive review

Cited by: 1

Authors:

Sharma, Himanshu [1]

Padha, Devanand [1]

Affiliations:

[1] Cent Univ Jammu, Dept Comp Sci & Informat Technol, Jammu 181124, Jammu & Kashmir, India

Source:

INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL | 2024 / Vol. 13 / Issue 02

Keywords:

Computer vision; Deep learning; Medical image captioning; Natural image captioning; Remote sensing image captioning; AUTOMATIC IMAGE; GENERATION; MODELS; RETRIEVAL; SPEECH;

DOI:

10.1007/s13735-024-00328-6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

An image caption is a sentence summarizing the semantic details of an image. Image captioning is a blended application of computer vision and natural language processing. Earlier research addressed this domain with machine learning approaches, modeling image captioning frameworks using hand-engineered feature extraction techniques. With the resurgence of deep-learning approaches, the development of improved and efficient image captioning frameworks is on the rise. Image captioning is witnessing tremendous growth in various domains such as medicine, remote sensing, security, visual assistance, and multimodal search engines. In this survey, we comprehensively study image captioning frameworks based on our proposed domain-specific taxonomy. We explore the benchmark datasets and metrics leveraged for training and evaluating image captioning models in various application domains. In addition, we perform a comparative analysis of the reviewed models. Natural image captioning, medical image captioning, and remote sensing image captioning are currently among the most prominent application domains of image captioning. The efficacy of real-time image captioning is a challenging obstacle limiting its implementation in sensitive areas such as visual aid, remote security, and healthcare. Further challenges include the scarcity of rich domain-specific datasets, training complexity, evaluation difficulty, and a deficiency of cross-domain knowledge transfer techniques. Despite the significant contributions made, additional efforts are needed to develop steadfast and influential image captioning models.

Pages: 27
