Image Recall on Image-Text Intertwined Lifelogs

Citations: 6
Authors
Chu, Tzu-Hsuan [1]
Huang, Hen-Hsen [2]
Chen, Hsin-Hsi [2]
Affiliations
[1] Natl Taiwan Univ, Taipei, Taiwan
[2] National Chengchi Univ, MOST Joint Res Ctr Technol & All Vista Healthcare, Taipei, Taiwan
Keywords
lifelogging; image retrieval; multimodal representation
DOI
10.1145/3350546.3352555
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
People engage in lifelogging by taking photos with cameras and cellphones anytime and anywhere, and by sharing those photos, intertwined with captions or descriptions, on social media platforms. Such image-text intertwined data provides richer information for image recall: when images alone cannot preserve the complete information, the accompanying text complements them by describing the life experiences behind the photos. This work proposes a multimodal retrieval model for image recall in image-text intertwined lifelogs. Our Attentive Image-Story model combines an Image model, which maps visual and textual information into a single representation space, and a Story model, which captures text-based contextual information, using an attention mechanism to reduce the semantic gap between visual and textual information. Experimental results show that our model outperforms a state-of-the-art image-based retrieval model and an image/text hybrid system.
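The abstract names the components (an Image model, a Story model, and an attention mechanism) but this record does not give their internals. As a rough illustration only, the sketch below shows one common way such a fusion could work: two embeddings of the same lifelog entry are combined with query-conditioned attention weights, and entries are ranked by cosine similarity to the query. Every name here (`attentive_fusion`, `rank_entries`, the toy vectors) is hypothetical and is not taken from the paper.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_fusion(query, image_vec, story_vec):
    """Blend the two embeddings of one entry using query-conditioned weights.

    Assumption: attention scores are plain dot products with the query.
    The paper may use a different (learned) scoring function.
    """
    weights = softmax([dot(query, image_vec), dot(query, story_vec)])
    fused = [weights[0] * i + weights[1] * s
             for i, s in zip(image_vec, story_vec)]
    return fused, weights


def rank_entries(query, entries):
    """Return entry ids ordered from most to least similar to the query."""
    scored = []
    for entry_id, image_vec, story_vec in entries:
        fused, _ = attentive_fusion(query, image_vec, story_vec)
        scored.append((cosine(query, fused), entry_id))
    return [entry_id for _, entry_id in sorted(scored, reverse=True)]


# Toy data: hand-written 3-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
entries = [
    ("beach_photo", [0.9, 0.1, 0.0], [0.8, 0.0, 0.2]),
    ("office_photo", [0.0, 1.0, 0.0], [0.1, 0.9, 0.0]),
]
ranking = rank_entries(query, entries)
_, example_weights = attentive_fusion(query, entries[0][1], entries[0][2])
```

In a real system the vectors would come from trained image and text encoders that share a representation space, and the attention scoring would typically be learned rather than a raw dot product.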
Pages: 398-402
Page count: 5