MAKE: Vision-Language Pre-training based Product Retrieval in Taobao Search

Cited by: 2

Authors:

Zheng, Xiaoyang [1]

Wang, Zilong [1]

Li, Sen [1]

Xu, Ke [2]

Zhuang, Tao [1]

Liu, Qingwen [1]

Zeng, Xiaoyi [1]

Affiliations:

[1] Alibaba Grp, Hangzhou, Peoples R China

[2] City Univ Hong Kong, Hong Kong, Peoples R China

Source:

COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023 | 2023

Keywords:

Multimodal Pre-training; Semantic Retrieval; Representation Learning;

DOI:

10.1145/3543873.3584627

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Taobao Search consists of two phases: the retrieval phase and the ranking phase. Given a user query, the retrieval phase returns a subset of candidate products for the following ranking phase. Recently, the paradigm of pre-training and fine-tuning has shown its potential in incorporating visual clues into retrieval tasks. In this paper, we focus on solving the problem of text-to-multimodal retrieval in Taobao Search. We consider that users' attention to titles or images varies across products. Hence, we propose a novel Modal Adaptation module for cross-modal fusion, which helps assign appropriate weights to texts and images across products. Furthermore, in e-commerce search, user queries tend to be brief and thus lead to significant semantic imbalance between user queries and product titles. Therefore, we design a separate text encoder and a Keyword Enhancement mechanism to enrich the query representations and improve text-to-multimodal matching. To this end, we present a novel vision-language (V+L) pre-training method to exploit the multimodal information of (user query, product title, product image). Extensive experiments demonstrate that our retrieval-specific pre-training model (referred to as MAKE) outperforms existing V+L pre-training methods on the text-to-multimodal retrieval task. MAKE has been deployed online and brings major improvements to the retrieval system of Taobao Search.
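
The abstract does not give the formulation of the Modal Adaptation module, so the following is only an illustrative sketch of the general idea it describes: a per-product weighting of title and image embeddings, followed by query-to-product matching in the retrieval phase. The softmax gate, the gate vectors, the cosine-similarity scoring, and all function names (`fuse`, `retrieve`) are assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse(title_emb, image_emb, gate_title, gate_image):
    """Fuse title and image embeddings with per-product modality weights.

    gate_title and gate_image are hypothetical gate vectors; in a trained
    model they would be learned parameters (assumption, not from the paper).
    Returns the unit-length fused embedding and the (title, image) weights.
    """
    w_title, w_image = softmax(
        [dot(gate_title, title_emb), dot(gate_image, image_emb)]
    )
    fused = [w_title * t + w_image * i for t, i in zip(title_emb, image_emb)]
    return normalize(fused), (w_title, w_image)


def retrieve(query_emb, products, k):
    """Return the ids of the k products most similar to the query.

    products maps a product id to its unit-length fused embedding;
    similarity is cosine similarity (an assumed scoring function).
    """
    q = normalize(query_emb)
    scored = [(dot(q, emb), pid) for pid, emb in products.items()]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:k]]
```

In this sketch, a product whose title embedding scores higher under its gate receives a larger title weight, which mirrors the abstract's point that the relative importance of text and image differs from product to product. The candidates returned by `retrieve` would then be passed on to a separate ranking phase.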


Pages: 356 - 360

Page count: 5

50 records in total

[1] Delving into E-Commerce Product Retrieval with Vision-Language Pre-training
Zheng, Xiaoyang
Lv, Fuyu
Wang, Zilong
Liu, Qingwen
Zeng, Xiaoyi
PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023 : 3385 - 3389
[2] Survey on Vision-language Pre-training
Yin J.
Zhang Z.-D.
Gao Y.-H.
Yang Z.-W.
Li L.
Xiao M.
Sun Y.-Q.
Yan C.-G.
Ruan Jian Xue Bao/Journal of Software, 2023, 34 (05) : 2000 - 2023
[3] VLP: A Survey on Vision-language Pre-training
Chen, Fei-Long
Zhang, Du-Zhen
Han, Ming-Lun
Chen, Xiu-Yi
Shi, Jing
Xu, Shuang
Xu, Bo
MACHINE INTELLIGENCE RESEARCH, 2023, 20 (01) : 38 - 56
[4] VLP: A Survey on Vision-language Pre-training
Fei-Long Chen
Du-Zhen Zhang
Ming-Lun Han
Xiu-Yi Chen
Jing Shi
Shuang Xu
Bo Xu
Machine Intelligence Research, 2023, 20 (01) : 38 - 56
[5] VLP: A Survey on Vision-language Pre-training
Fei-Long Chen
Du-Zhen Zhang
Ming-Lun Han
Xiu-Yi Chen
Jing Shi
Shuang Xu
Bo Xu
Machine Intelligence Research, 2023, 20 : 38 - 56
[6] GilBERT: Generative Vision-Language Pre-Training for Image-Text Retrieval
Hong, Weixiang
Ji, Kaixiang
Liu, Jiajia
Wang, Jian
Chen, Jingdong
Chu, Wei
SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021 : 1379 - 1388
[7] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
Jian, Yiren
Gao, Chongyang
Vosoughi, Soroush
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
[8] Pre-training A Prompt Pool for Vision-Language Model
Liu, Jun
Gu, Yang
Yang, Zhaohua
Guo, Shuai
Liu, Huaqiu
Chen, Yiqiang
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
[9] Vision-language pre-training via modal interaction
Cheng, Hang
Ye, Hehui
Zhou, Xiaofei
Liu, Ximeng
Chen, Fei
Wang, Meiqing
PATTERN RECOGNITION, 2024, 156
[10] Contrastive Vision-Language Pre-training with Limited Resources
Cui, Quan
Zhou, Boyan
Guo, Yu
Yin, Weidong
Wu, Hao
Yoshie, Osamu
Chen, Yubo
COMPUTER VISION, ECCV 2022, PT XXXVI, 2022, 13696 : 236 - 253
