Cross-modal recipe retrieval via parallel- and cross-attention networks learning

Times Cited: 10
Authors
Cao, Da [1 ]
Chu, Jingjing [1 ]
Zhu, Ningbo [1 ]
Nie, Liqiang [2 ]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Hunan, Peoples R China
[2] Shandong Univ, Sch Comp Sci & Technol, Qingdao 266000, Shandong, Peoples R China
Funding
US National Science Foundation; National Natural Science Foundation of China;
Keywords
Recipe retrieval; Parallel-attention network; Cross-attention network; Cross-modal retrieval;
DOI
10.1016/j.knosys.2019.105428
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Cross-modal recipe retrieval refers to the problem of retrieving a food image from a list of image candidates given a textual recipe as the query, or vice versa. However, existing cross-modal recipe retrieval approaches mostly learn the representations of images and recipes independently and stitch them together by projecting them into a common space. Such methods overlook the interplay between images and recipes, resulting in suboptimal retrieval performance. Toward this end, we study the problem of cross-modal recipe retrieval from the viewpoint of parallel- and cross-attention network learning. Specifically, we first exploit a parallel-attention network to independently learn the attention weights of components in images and recipes. Thereafter, a cross-attention network is proposed to explicitly learn the interplay between images and recipes, which simultaneously considers word-guided image attention and image-guided word attention. Lastly, the learned representations of images and recipes stemming from the parallel- and cross-attention networks are elaborately connected and optimized using a pairwise ranking loss. By experimenting on two datasets, we demonstrate the effectiveness and rationality of our proposed solution in terms of both overall performance comparison and micro-level analyses. (c) 2019 Published by Elsevier B.V.
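
The cross-attention idea summarized in the abstract (word-guided image attention combined with image-guided word attention, trained with a pairwise ranking loss) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature dimensionality, mean-pooling of attended features, in-batch negative sampling, and margin value are all assumptions made for illustration.

```python
# Illustrative sketch (not the paper's code): cross-attention between recipe
# words and image regions, followed by a pairwise (triplet) ranking loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossAttention(nn.Module):
    """Word-guided image attention and image-guided word attention."""

    def __init__(self, dim: int = 512):
        super().__init__()
        self.scale = dim ** -0.5

    def forward(self, regions: torch.Tensor, words: torch.Tensor):
        # regions: (B, R, D) image-region features; words: (B, W, D) word features
        sim = torch.bmm(words, regions.transpose(1, 2)) * self.scale  # (B, W, R)

        # Word-guided image attention: each word attends over image regions.
        attn_img = F.softmax(sim, dim=2)                  # (B, W, R)
        img_ctx = torch.bmm(attn_img, regions)            # (B, W, D)

        # Image-guided word attention: each region attends over recipe words.
        attn_txt = F.softmax(sim.transpose(1, 2), dim=2)  # (B, R, W)
        txt_ctx = torch.bmm(attn_txt, words)              # (B, R, D)

        # Pool attended features into one vector per modality (an assumption;
        # the paper connects parallel- and cross-attention outputs elaborately).
        img_vec = F.normalize(img_ctx.mean(dim=1), dim=-1)  # (B, D)
        txt_vec = F.normalize(txt_ctx.mean(dim=1), dim=-1)  # (B, D)
        return img_vec, txt_vec


def pairwise_ranking_loss(img: torch.Tensor, txt: torch.Tensor, margin: float = 0.2):
    """Hinge-based pairwise ranking loss over in-batch negatives (a common choice)."""
    scores = img @ txt.t()                                # (B, B) similarities
    pos = scores.diag().unsqueeze(1)                      # matched image-recipe pairs
    cost_txt = (margin + scores - pos).clamp(min=0)       # image vs. wrong recipes
    cost_img = (margin + scores - pos.t()).clamp(min=0)   # recipe vs. wrong images
    mask = torch.eye(scores.size(0), dtype=torch.bool, device=scores.device)
    return cost_txt.masked_fill(mask, 0).mean() + cost_img.masked_fill(mask, 0).mean()


if __name__ == "__main__":
    regions = torch.randn(4, 36, 512)   # e.g. 36 region features per image
    words = torch.randn(4, 20, 512)     # e.g. 20 word features per recipe
    img_vec, txt_vec = CrossAttention(512)(regions, words)
    print(pairwise_ranking_loss(img_vec, txt_vec).item())
```

A margin-based loss of this form pulls matched image-recipe pairs above mismatched ones by at least the margin, which matches the pairwise ranking objective the abstract mentions; the specific negative-sampling scheme here is only one plausible instantiation.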
Pages: 12
Related Papers
50 records
  • [41] Iterative graph attention memory network for cross-modal retrieval
    Dong, Xinfeng
    Zhang, Huaxiang
    Dong, Xiao
    Lu, Xu
    KNOWLEDGE-BASED SYSTEMS, 2021, 226
  • [42] Heterogeneous Attention Network for Effective and Efficient Cross-modal Retrieval
    Yu, Tan
    Yang, Yi
    Li, Yi
    Liu, Lin
    Fei, Hongliang
    Li, Ping
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 1146 - 1156
  • [43] Adaptive Graph Attention Hashing for Unsupervised Cross-Modal Retrieval via Multimodal Transformers
    Li, Yewen
    Ge, Mingyuan
    Ji, Yucheng
    Li, Mingyong
    WEB AND BIG DATA, PT III, APWEB-WAIM 2023, 2024, 14333 : 1 - 15
  • [44] Boosting cross-modal retrieval in remote sensing via a novel unified attention network
    Choudhury, Shabnam
    Saini, Devansh
    Banerjee, Biplab
    NEURAL NETWORKS, 2024, 180
  • [45] Fine-Grained Correlation Learning with Stacked Co-attention Networks for Cross-Modal Information Retrieval
    Lu, Yuhang
    Yu, Jing
    Liu, Yanbing
    Tan, Jianlong
    Guo, Li
    Zhang, Weifeng
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT (KSEM 2018), PT I, 2018, 11061 : 213 - 225
  • [46] Category Alignment Adversarial Learning for Cross-Modal Retrieval
    He, Shiyuan
    Wang, Weiyang
    Wang, Zheng
    Xu, Xing
    Yang, Yang
    Wang, Xiaoming
    Shen, Heng Tao
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (05) : 4527 - 4538
  • [47] Adversarial cross-modal retrieval based on dictionary learning
    Shang, Fei
    Zhang, Huaxiang
    Zhu, Lei
    Sun, Jiande
    NEUROCOMPUTING, 2019, 355 : 93 - 104
  • [48] Heterogeneous Metric Learning for Cross-Modal Multimedia Retrieval
    Deng, Jun
    Du, Liang
    Shen, Yi-Dong
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2013, PT I, 2013, 8180 : 43 - 56
  • [49] Deep Multimodal Transfer Learning for Cross-Modal Retrieval
    Zhen, Liangli
    Hu, Peng
    Peng, Xi
    Goh, Rick Siow Mong
    Zhou, Joey Tianyi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (02) : 798 - 810
  • [50] Learning Relation Alignment for Calibrated Cross-modal Retrieval
    Ren, Shuhuai
    Lin, Junyang
    Zhao, Guangxiang
    Men, Rui
    Yang, An
    Zhou, Jingren
    Sun, Xu
    Yang, Hongxia
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 514 - 524