Image Recall on Image-Text Intertwined Lifelogs

Cited by: 6
Authors
Chu, Tzu-Hsuan [1]
Huang, Hen-Hsen [2]
Chen, Hsin-Hsi [2]
Affiliations
[1] Natl Taiwan Univ, Taipei, Taiwan
[2] National Chengchi Univ, MOST Joint Res Ctr Technol & All Vista Healthcare, Taipei, Taiwan
Keywords
lifelogging; image retrieval; multimodal representation
DOI
10.1145/3350546.3352555
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
People engage in lifelogging by taking photos with cameras and cellphones anytime, anywhere, and sharing them, intertwined with captions or descriptions, on social media platforms. Such image-text intertwined data provides richer information for image recall: when images alone cannot preserve the complete information, the text complements them by describing the life experiences behind the photos. This work proposes a multimodal retrieval model for image recall in image-text intertwined lifelogs. Our Attentive Image-Story model combines an Image model, which maps visual and textual information into a single representation space, and a Story model, which captures text-based contextual information, with an attention mechanism to reduce the semantic gap between visual and textual information. Experimental results show that our model outperforms a state-of-the-art image-based retrieval model and an image/text hybrid system.
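The abstract does not give the model's equations, so the following is only a rough sketch of the general idea of attention-weighted fusion for retrieval, not the authors' Attentive Image-Story model. It assumes each lifelog entry already has an image embedding and a story (text context) embedding in one shared space; the function names, the toy vectors, and the choice of dot-product attention with cosine scoring are all illustrative assumptions.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attentive_fuse(query, image_vec, story_vec):
    """Weight the two modalities by their (assumed) dot-product relevance
    to the query, then return the weighted sum and the weights."""
    weights = softmax([dot(query, image_vec), dot(query, story_vec)])
    fused = [weights[0] * i + weights[1] * s
             for i, s in zip(image_vec, story_vec)]
    return fused, weights

def rank(query, entries):
    """Rank entries (name, image_vec, story_vec) by cosine similarity
    between the query and each entry's fused representation."""
    scored = []
    for name, image_vec, story_vec in entries:
        fused, _ = attentive_fuse(query, image_vec, story_vec)
        scored.append((cosine(query, fused), name))
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical toy embeddings in a 3-dimensional shared space.
entries = [
    ("beach_photo", [0.9, 0.1, 0.0], [0.8, 0.2, 0.1]),
    ("office_photo", [0.0, 0.2, 0.9], [0.1, 0.1, 0.9]),
]
query = [1.0, 0.0, 0.0]
ranking = rank(query, entries)
```

In this toy setup the query points along the first axis, so the entry whose image and story vectors also lean that way is ranked first. A trained model would learn both the embeddings and the attention parameters rather than use fixed vectors.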
Pages: 398-402
Number of pages: 5