Cross-modal recipe retrieval via parallel- and cross-attention networks learning

Times Cited: 10
Authors
Cao, Da [1 ]
Chu, Jingjing [1 ]
Zhu, Ningbo [1 ]
Nie, Liqiang [2 ]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Hunan, Peoples R China
[2] Shandong Univ, Sch Comp Sci & Technol, Qingdao 266000, Shandong, Peoples R China
Funding
US National Science Foundation; National Natural Science Foundation of China;
Keywords
Recipe retrieval; Parallel-attention network; Cross-attention network; Cross-modal retrieval;
DOI
10.1016/j.knosys.2019.105428
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Cross-modal recipe retrieval refers to the problem of retrieving a food image from a list of image candidates given a textual recipe as the query, or vice versa. However, existing cross-modal recipe retrieval approaches mostly learn the representations of images and recipes independently and stitch them together by projecting them into a common space. Such methods overlook the interplay between images and recipes, resulting in suboptimal retrieval performance. Toward this end, we study the problem of cross-modal recipe retrieval from the viewpoint of parallel- and cross-attention network learning. Specifically, we first exploit a parallel-attention network to independently learn the attention weights of components in images and recipes. Thereafter, a cross-attention network is proposed to explicitly learn the interplay between images and recipes, which simultaneously considers word-guided image attention and image-guided word attention. Lastly, the learned representations of images and recipes stemming from the parallel- and cross-attention networks are elaborately connected and optimized using a pairwise ranking loss. By experimenting on two datasets, we demonstrate the effectiveness and rationality of our proposed solution in terms of both overall performance comparison and micro-level analyses. (c) 2019 Published by Elsevier B.V.
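
The cross-attention idea summarized in the abstract (word-guided image attention combined with image-guided word attention, trained with a pairwise ranking loss) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature dimensionality, mean-pooling of attended features, in-batch negative sampling, and margin value are all assumptions made for illustration.

```python
# Illustrative sketch (not the paper's code): cross-attention between recipe
# words and image regions, followed by a pairwise (triplet) ranking loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossAttention(nn.Module):
    """Word-guided image attention and image-guided word attention."""

    def __init__(self, dim: int = 512):
        super().__init__()
        self.scale = dim ** -0.5

    def forward(self, regions: torch.Tensor, words: torch.Tensor):
        # regions: (B, R, D) image-region features; words: (B, W, D) word features
        sim = torch.bmm(words, regions.transpose(1, 2)) * self.scale  # (B, W, R)

        # Word-guided image attention: each word attends over image regions.
        attn_img = F.softmax(sim, dim=2)                  # (B, W, R)
        img_ctx = torch.bmm(attn_img, regions)            # (B, W, D)

        # Image-guided word attention: each region attends over recipe words.
        attn_txt = F.softmax(sim.transpose(1, 2), dim=2)  # (B, R, W)
        txt_ctx = torch.bmm(attn_txt, words)              # (B, R, D)

        # Pool attended features into one vector per modality (an assumption;
        # the paper connects parallel- and cross-attention outputs elaborately).
        img_vec = F.normalize(img_ctx.mean(dim=1), dim=-1)  # (B, D)
        txt_vec = F.normalize(txt_ctx.mean(dim=1), dim=-1)  # (B, D)
        return img_vec, txt_vec


def pairwise_ranking_loss(img: torch.Tensor, txt: torch.Tensor, margin: float = 0.2):
    """Hinge-based pairwise ranking loss over in-batch negatives (a common choice)."""
    scores = img @ txt.t()                                # (B, B) similarities
    pos = scores.diag().unsqueeze(1)                      # matched image-recipe pairs
    cost_txt = (margin + scores - pos).clamp(min=0)       # image vs. wrong recipes
    cost_img = (margin + scores - pos.t()).clamp(min=0)   # recipe vs. wrong images
    mask = torch.eye(scores.size(0), dtype=torch.bool, device=scores.device)
    return cost_txt.masked_fill(mask, 0).mean() + cost_img.masked_fill(mask, 0).mean()


if __name__ == "__main__":
    regions = torch.randn(4, 36, 512)   # e.g. 36 region features per image
    words = torch.randn(4, 20, 512)     # e.g. 20 word features per recipe
    img_vec, txt_vec = CrossAttention(512)(regions, words)
    print(pairwise_ranking_loss(img_vec, txt_vec).item())
```

A margin-based loss of this form pulls matched image-recipe pairs above mismatched ones by at least the margin, which matches the pairwise ranking objective the abstract mentions; the specific negative-sampling scheme here is only one plausible instantiation.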
Pages: 12
Related Papers
50 records
  • [41] Iterative graph attention memory network for cross-modal retrieval
    Dong, Xinfeng
    Zhang, Huaxiang
    Dong, Xiao
    Lu, Xu
    KNOWLEDGE-BASED SYSTEMS, 2021, 226
  • [42] Heterogeneous Attention Network for Effective and Efficient Cross-modal Retrieval
    Yu, Tan
    Yang, Yi
    Li, Yi
    Liu, Lin
    Fei, Hongliang
    Li, Ping
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 1146 - 1156
  • [43] Adaptive Graph Attention Hashing for Unsupervised Cross-Modal Retrieval via Multimodal Transformers
    Li, Yewen
    Ge, Mingyuan
    Ji, Yucheng
    Li, Mingyong
    WEB AND BIG DATA, PT III, APWEB-WAIM 2023, 2024, 14333 : 1 - 15
  • [44] Boosting cross-modal retrieval in remote sensing via a novel unified attention network
    Choudhury, Shabnam
    Saini, Devansh
    Banerjee, Biplab
    NEURAL NETWORKS, 2024, 180
  • [45] Fine-Grained Correlation Learning with Stacked Co-attention Networks for Cross-Modal Information Retrieval
    Lu, Yuhang
    Yu, Jing
    Liu, Yanbing
    Tan, Jianlong
    Guo, Li
    Zhang, Weifeng
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT (KSEM 2018), PT I, 2018, 11061 : 213 - 225
  • [46] Category Alignment Adversarial Learning for Cross-Modal Retrieval
    He, Shiyuan
    Wang, Weiyang
    Wang, Zheng
    Xu, Xing
    Yang, Yang
    Wang, Xiaoming
    Shen, Heng Tao
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (05) : 4527 - 4538
  • [47] Adversarial cross-modal retrieval based on dictionary learning
    Shang, Fei
    Zhang, Huaxiang
    Zhu, Lei
    Sun, Jiande
    NEUROCOMPUTING, 2019, 355 : 93 - 104
  • [48] Heterogeneous Metric Learning for Cross-Modal Multimedia Retrieval
    Deng, Jun
    Du, Liang
    Shen, Yi-Dong
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2013, PT I, 2013, 8180 : 43 - 56
  • [49] Deep Multimodal Transfer Learning for Cross-Modal Retrieval
    Zhen, Liangli
    Hu, Peng
    Peng, Xi
    Goh, Rick Siow Mong
    Zhou, Joey Tianyi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (02) : 798 - 810
  • [50] Learning Relation Alignment for Calibrated Cross-modal Retrieval
    Ren, Shuhuai
    Lin, Junyang
    Zhao, Guangxiang
    Men, Rui
    Yang, An
    Zhou, Jingren
    Sun, Xu
    Yang, Hongxia
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 514 - 524