Image Recall on Image-Text Intertwined Lifelogs

Cited by: 6
Authors
Chu, Tzu-Hsuan [1]
Huang, Hen-Hsen [2]
Chen, Hsin-Hsi [2]
Affiliations
[1] Natl Taiwan Univ, Taipei, Taiwan
[2] National Chengchi Univ, MOST Joint Res Ctr Technol & All Vista Healthcare, Taipei, Taiwan
Keywords
lifelogging; image retrieval; multimodal representation
DOI
10.1145/3350546.3352555
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
People engage in lifelogging by taking photos with cameras and cellphones anytime, anywhere, and sharing the photos, intertwined with captions or descriptions, on social media platforms. This image-text intertwined data provides richer information for image recall. When images alone cannot preserve the complete information, the text complements them by describing the life experiences behind the photos. This work proposes a multimodal retrieval model for image recall in image-text intertwined lifelogs. Our Attentive Image-Story model combines an Image model, which maps visual and textual information into a single representation space, and a Story model, which captures text-based contextual information, with an attention mechanism to reduce the semantic gap between visual and textual information. Experimental results show that our model outperforms a state-of-the-art image-based retrieval model and an image/text hybrid system.
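The abstract does not specify how the attention mechanism combines the Image model and the Story model, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that a text query, each image, and each story (caption context) have already been embedded as vectors in a shared space, and it uses a softmax gate over the two cosine similarities as a stand-in for attention. The names `fused_score`, `rank_entries`, `image_vec`, `story_vec`, and `temperature` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fused_score(query, image_vec, story_vec, temperature=1.0):
    """Blend image-side and story-side similarity with a softmax gate.

    Assumption: the gate is computed from the two similarities themselves;
    the paper's actual attention mechanism may differ.
    """
    s_img = cosine(query, image_vec)
    s_story = cosine(query, story_vec)
    w_img, w_story = softmax([s_img / temperature, s_story / temperature])
    return w_img * s_img + w_story * s_story


def rank_entries(query, entries, temperature=1.0):
    """Return lifelog entry ids ordered from best to worst fused score."""
    scored = [
        (fused_score(query, e["image_vec"], e["story_vec"], temperature), e["id"])
        for e in entries
    ]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [entry_id for _, entry_id in scored]
```

In this sketch, an entry whose image embedding misses the query can still rank well when its story embedding matches, which mirrors the abstract's point that text complements images that do not carry the complete information.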
Pages: 398-402
Page count: 5