Domain-specific image captioning: a comprehensive review

Cited by: 1
Authors
Sharma, Himanshu [1]
Padha, Devanand [1]
Affiliations
[1] Cent Univ Jammu, Dept Comp Sci & Informat Technol, Jammu 181124, Jammu & Kashmir, India
Keywords
Computer vision; Deep learning; Medical image captioning; Natural image captioning; Remote sensing image captioning; AUTOMATIC IMAGE; GENERATION; MODELS; RETRIEVAL; SPEECH;
DOI
10.1007/s13735-024-00328-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An image caption is a sentence summarizing the semantic details of an image. Image captioning is a blended application of computer vision and natural language processing. Earlier research addressed this domain with machine learning approaches, modeling image captioning frameworks on hand-engineered feature extraction techniques. With the resurgence of deep-learning approaches, the development of improved and efficient image captioning frameworks is on the rise. Image captioning is witnessing tremendous growth in various domains such as medicine, remote sensing, security, visual assistance, and multimodal search engines. In this survey, we comprehensively study image captioning frameworks based on our proposed domain-specific taxonomy. We explore the benchmark datasets and metrics leveraged for training and evaluating image captioning models in various application domains. In addition, we perform a comparative analysis of the reviewed models. Natural image captioning, medical image captioning, and remote sensing image captioning are currently among the most prominent application domains of image captioning. The efficacy of real-time image captioning remains a challenging obstacle that limits its implementation in sensitive areas such as visual aid, remote security, and healthcare. Further challenges include the scarcity of rich domain-specific datasets, training complexity, evaluation difficulty, and a deficiency of cross-domain knowledge transfer techniques. Despite the significant contributions made, additional efforts are needed to develop robust and effective image captioning models.
Pages: 27