Web-Scale Semantic Product Search with Large Language Models

Cited by: 3
Authors
Muhamed, Aashiq [1]
Srinivasan, Sriram [1]
Teo, Choon-Hui [1]
Cui, Qingjun [1]
Zeng, Belinda [2]
Chilimbi, Trishul [2]
Vishwanathan, S. V. N. [1]
Affiliations
[1] Amazon, Palo Alto, CA 94303 USA
[2] Amazon, Seattle, WA USA
Keywords
Matching; Retrieval; Search; Pretrained Language Models
DOI
10.1007/978-3-031-33380-4_6
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Dense embedding-based semantic matching is widely used in e-commerce product search to address the shortcomings of lexical matching, such as sensitivity to spelling variants. Recent advances in BERT-like language model encoders have, however, not found their way to real-time search because of the strict inference latency requirements imposed on e-commerce websites. While bi-encoder BERT architectures enable fast approximate nearest neighbor search, training them effectively on query-product data remains a challenge due to training instabilities and the persistent generalization gap with cross-encoders. In this work, we propose a four-stage training procedure to leverage large BERT-like models for product search while preserving low inference latency. We introduce query-product interaction pre-finetuning to effectively pretrain BERT bi-encoders for matching and improve generalization. Through offline experiments on an e-commerce product dataset, we show that a distilled small BERT-based model (75M params) trained using our approach improves the search relevance metric by up to 23% over a baseline DSSM-based model with similar inference latency. The small model suffers only a 3% drop in the relevance metric compared to the 20x larger teacher. We also show, using online A/B tests at scale, that our approach improves over the production model in exact and substitute products retrieved.
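The bi-encoder retrieval pattern mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the `embed` function below is a toy stand-in (hashed character trigrams) for a learned BERT-like encoder, the search is brute-force rather than approximate nearest neighbor, and the names `DIM`, `embed`, `build_index`, and `search` are hypothetical. It only shows the structural idea that queries and products are encoded independently, so product vectors can be precomputed and a query is scored with cheap dot products.

```python
import math
import zlib
from typing import List, Tuple

DIM = 256  # hypothetical embedding size, chosen only for this toy example


def embed(text: str) -> List[float]:
    """Toy stand-in for a learned encoder: hashed character trigrams, L2-normalized."""
    vec = [0.0] * DIM
    padded = " " + text.lower() + " "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        # Map each trigram to a bucket with a deterministic hash.
        vec[zlib.crc32(trigram.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(products: List[str]) -> List[Tuple[str, List[float]]]:
    """Encode every product once, offline; queries never touch this step."""
    return [(title, embed(title)) for title in products]


def search(query: str, index: List[Tuple[str, List[float]]], k: int = 2) -> List[Tuple[float, str]]:
    """Encode the query alone, then rank products by dot product (cosine, since vectors are unit length)."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, vec)), title) for title, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    catalog = ["wireless headphones", "usb charging cable", "running shoes"]
    index = build_index(catalog)
    # A misspelled query still lands near the intended product in embedding space.
    for score, title in search("wireless headphons", index, k=2):
        print(f"{score:.3f}  {title}")
```

In a production setting the exhaustive loop in `search` would typically be replaced by an approximate nearest neighbor index, which is what makes the bi-encoder design compatible with strict latency budgets.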
Pages: 73-85
Page count: 13
Related papers
50 items in total
  • [21] Faceted product search powered by the Semantic Web
    Vandic, Damir
    van Dam, Jan-Willem
    Frasincar, Flavius
    DECISION SUPPORT SYSTEMS, 2012, 53 (03) : 425 - 437
  • [22] Modeling Search Assistance Mechanisms within Web-Scale Discovery Systems
    Mischo, William H.
    Schlembach, Mary C.
    Norman, Michael A.
    JCDL'13: PROCEEDINGS OF THE 13TH ACM/IEEE-CS JOINT CONFERENCE ON DIGITAL LIBRARIES, 2013, : 407 - 408
  • [23] Web-scale distributed AI search across disconnected and heterogeneous infrastructures
    Kelsey, Tom
    McCaffery, Martin
    Kotthoff, Lars
    2014 IEEE 10TH INTERNATIONAL CONFERENCE ON E-SCIENCE (E-SCIENCE), VOL 1, 2014, : 39 - 46
  • [24] Web-Scale N-gram Models for Lexical Disambiguation
    Bergsma, Shane
    Lin, Dekang
    Goebel, Randy
    21ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-09), PROCEEDINGS, 2009, : 1507 - 1512
  • [25] EFFICACY OF A CONSTANTLY ADAPTIVE LANGUAGE MODELING TECHNIQUE FOR WEB-SCALE APPLICATIONS
    Wang, Kuansan
    Li, Xiaolong
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4733 - 4736
  • [26] Search Query Quality and Web-Scale Discovery: A Qualitative and Quantitative Analysis
    Meadow, Kelly
    Meadow, James
    COLLEGE & UNDERGRADUATE LIBRARIES, 2012, 19 (2-4) : 163 - 175
  • [27] Drinking From a Firehose: Continual Learning With Web-Scale Natural Language
    Hu, Hexiang
    Sener, Ozan
    Sha, Fei
    Koltun, Vladlen
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (05) : 5684 - 5696
  • [28] Web-Scale Image Annotation
    Liu, Jiakai
    Hu, Rong
    Wang, Meihong
    Wang, Yi
    Chang, Edward Y.
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2008, 9TH PACIFIC RIM CONFERENCE ON MULTIMEDIA, 2008, 5353 : 663 - 674
  • [29] Exploring the use of large language models to build product Kansei semantic spaces
    Alcaide-Marzal, Jorge
    Diego-Mas, Jose Antonio
    INTERNATIONAL JOURNAL OF INDUSTRIAL ERGONOMICS, 2025, 107
  • [30] Web-scale Knowledge Collection
    Lockard, Colin
    Shiralkar, Prashant
    Dong, Xin Luna
    Hajishirzi, Hannaneh
    PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM '20), 2020, : 888 - 889