Domain-specific image captioning: a comprehensive review

Cited by: 1
Authors
Sharma, Himanshu [1]
Padha, Devanand [1]
Affiliations
[1] Cent Univ Jammu, Dept Comp Sci & Informat Technol, Jammu 181124, Jammu & Kashmir, India
Keywords
Computer vision; Deep learning; Medical image captioning; Natural image captioning; Remote sensing image captioning; AUTOMATIC IMAGE; GENERATION; MODELS; RETRIEVAL; SPEECH;
DOI
10.1007/s13735-024-00328-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An image caption is a sentence summarizing the semantic details of an image. Image captioning is a blended application of computer vision and natural language processing. Earlier research addressed this task with machine learning approaches, building image captioning frameworks on hand-engineered feature extraction techniques. With the resurgence of deep learning, the development of improved and more efficient image captioning frameworks is on the rise. Image captioning is witnessing tremendous growth in domains such as medicine, remote sensing, security, visual assistance, and multimodal search engines. In this survey, we comprehensively study image captioning frameworks based on our proposed domain-specific taxonomy. We explore the benchmark datasets and metrics used for training and evaluating image captioning models in various application domains. In addition, we perform a comparative analysis of the reviewed models. Natural image captioning, medical image captioning, and remote sensing image captioning are currently among the most prominent application domains. Achieving effective real-time image captioning remains a challenge that limits its deployment in sensitive areas such as visual aid, remote security, and healthcare. Further challenges include the scarcity of rich domain-specific datasets, training complexity, evaluation difficulty, and a lack of cross-domain knowledge transfer techniques. Despite the significant contributions made so far, additional efforts are needed to develop robust and effective image captioning models.
Pages: 27