Arabic Captioning for Images of Clothing Using Deep Learning

Cited by: 2
Authors
Al-Malki, Rasha Saleh [1]
Al-Aama, Arwa Yousuf [1]
Affiliations
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Comp Sci Dept, Jeddah 21589, Saudi Arabia
Keywords
deep learning; image captioning; transfer learning; image attributes
DOI
10.3390/s23083783
CLC Number
O65 [Analytical Chemistry]
Subject Classification Codes
070302; 081704
Abstract
Fashion is one of many application fields for image captioning. For e-commerce websites that host tens of thousands of clothing images, automatically generated item descriptions are highly desirable. This paper addresses captioning images of clothing in the Arabic language using deep learning. Image captioning systems combine Computer Vision and Natural Language Processing techniques, since they require both visual and textual understanding. Many approaches have been proposed to build such systems; the most widely used are deep learning methods, in which an image model analyzes the visual content of the image and a language model generates the caption. Caption generation in English has received great attention from many researchers, but a gap remains for Arabic, largely because public datasets are rarely available in that language. In this work, we created an Arabic dataset for captioning images of clothing, which we named "ArabicFashionData"; to our knowledge, ours is the first model for captioning images of clothing in Arabic. Moreover, we classified the attributes of the clothing images and used them as inputs to the decoder of our image captioning model to enhance Arabic caption quality. In addition, we used an attention mechanism. Our approach achieved a BLEU-1 score of 88.52. The experimental findings are encouraging and suggest that, with a larger dataset, the attribute-based image captioning model can achieve excellent results for Arabic image captioning.
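The paper's code is not reproduced here, but the abstract names two concrete ingredients: an attention mechanism over image features, and attribute vectors fed into the decoder. The following is a minimal pure-Python sketch of one decoder step under those assumptions (additive, Bahdanau-style attention; attributes concatenated onto the attended context). All dimensions, weights, and names are illustrative, not the authors' actual architecture.

```python
import math
import random

random.seed(0)

def softmax(xs):
    """Normalize scores into a probability distribution."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_step(features, hidden, w_f, w_h, v):
    """Additive attention: score each image region against the decoder state,
    then return the attention-weighted context vector and the weights."""
    scores = []
    for f in features:
        combined = [math.tanh(dot(f, col_f) + dot(hidden, col_h))
                    for col_f, col_h in zip(w_f, w_h)]
        scores.append(dot(combined, v))
    alpha = softmax(scores)
    # Context = weighted sum of region features
    context = [sum(a * f[i] for a, f in zip(alpha, features))
               for i in range(len(features[0]))]
    return context, alpha

# Hypothetical tiny dimensions: 4 image regions, 3-d features,
# 2-d decoder state, 2-d attribute vector
R, F, H, A = 4, 3, 2, 2
rand = lambda n: [random.uniform(-1, 1) for _ in range(n)]
features = [rand(F) for _ in range(R)]   # encoder (CNN) region features
hidden = rand(H)                         # current decoder hidden state
attrs = rand(A)                          # predicted clothing attributes

w_f = [rand(F) for _ in range(H)]        # projection of region features
w_h = [rand(H) for _ in range(H)]        # projection of decoder state
v = rand(H)                              # scoring vector

context, alpha = attention_step(features, hidden, w_f, w_h, v)
# Per the abstract, attributes are additional decoder inputs; a simple
# realization is concatenating them with the attended context each step.
decoder_input = context + attrs
print(round(sum(alpha), 6))  # → 1.0 (attention weights form a distribution)
print(len(decoder_input))    # → 5 (feat_dim + attr_dim)
```

In a real model, `features` would come from a pretrained CNN, `hidden` from an RNN decoder, and the weights would be learned; the sketch only shows how attention weights and attribute conditioning fit together.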
Pages: 17
Related Papers (50 total)
  • [41] Arabic Handwritten Recognition Using Deep Learning: A Survey
    Naseem Alrobah
    Saleh Albahli
    Arabian Journal for Science and Engineering, 2022, 47 : 9943 - 9963
  • [42] Arabic Text Classification Using Deep Learning Technics
    Boukil, Samir
    Biniz, Mohamed
    El Adnani, Fatiha
    Cherrat, Loubna
    El Moutaouakkil, Abdelmajid
    INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2018, 11 (09): : 103 - 114
  • [43] Scene captioning with deep fusion of images and point clouds
    Yu, Qiang
    Zhang, Chunxia
    Weng, Lubin
    Xiang, Shiming
    Pan, Chunhong
    PATTERN RECOGNITION LETTERS, 2022, 158 : 9 - 15
  • [44] Captioning Images Using Different Styles
    Mathews, Alexander
    MM'15: PROCEEDINGS OF THE 2015 ACM MULTIMEDIA CONFERENCE, 2015, : 665 - 668
  • [45] Extracting Worker Unsafe Behaviors from Construction Images Using Image Captioning with Deep Learning-Based Attention Mechanism
    Zhai, Peichen
    Wang, Junjie
    Zhang, Lite
    JOURNAL OF CONSTRUCTION ENGINEERING AND MANAGEMENT, 2023, 149 (02)
  • [46] Deep Learning Approaches on Image Captioning: A Review
    Ghandi, Taraneh
    Pourreza, Hamidreza
    Mahyar, Hamidreza
    ACM COMPUTING SURVEYS, 2024, 56 (03)
  • [47] A Comprehensive Survey of Deep Learning for Image Captioning
    Hossain, Md Zakir
    Sohel, Ferdous
    Shiratuddin, Mohd Fairuz
    Laga, Hamid
    ACM COMPUTING SURVEYS, 2019, 51 (06)
  • [48] Facilitated Deep Learning Models for Image Captioning
    Azhar, Imtinan
    Afyouni, Imad
    Elnagar, Ashraf
    2021 55TH ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2021
  • [49] Learning deep spatiotemporal features for video captioning
    Daskalakis, Eleftherios
    Tzelepi, Maria
    Tefas, Anastasios
    PATTERN RECOGNITION LETTERS, 2018, 116 : 143 - 149
  • [50] A Robust Model for Translating Arabic Sign Language into Spoken Arabic Using Deep Learning
    Nahar, Khalid M. O.
    Almomani, Ammar
    Shatnawi, Nahlah
    Alauthman, Mohammad
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 37 (02): : 2037 - 2057