Web-Scale Semantic Product Search with Large Language Models

Cited by: 3
Authors
Muhamed, Aashiq [1]
Srinivasan, Sriram [1]
Teo, Choon-Hui [1]
Cui, Qingjun [1]
Zeng, Belinda [2]
Chilimbi, Trishul [2]
Vishwanathan, S. V. N. [1]
Affiliations
[1] Amazon, Palo Alto, CA 94303 USA
[2] Amazon, Seattle, WA USA
Keywords
Matching; Retrieval; Search; Pretrained Language Models
DOI
10.1007/978-3-031-33380-4_6
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Dense embedding-based semantic matching is widely used in e-commerce product search to address the shortcomings of lexical matching, such as sensitivity to spelling variants. Recent advances in BERT-like language model encoders have, however, not found their way into real-time search because of the strict inference latency requirements imposed on e-commerce websites. While bi-encoder BERT architectures enable fast approximate nearest neighbor search, training them effectively on query-product data remains a challenge due to training instabilities and the persistent generalization gap with cross-encoders. In this work, we propose a four-stage training procedure to leverage large BERT-like models for product search while preserving low inference latency. We introduce query-product interaction pre-finetuning to effectively pretrain BERT bi-encoders for matching and improve generalization. Through offline experiments on an e-commerce product dataset, we show that a distilled small BERT-based model (75M parameters) trained using our approach improves the search relevance metric by up to 23% over a baseline DSSM-based model with similar inference latency. The small model suffers only a 3% drop in the relevance metric compared to the 20x larger teacher. We also show, using online A/B tests at scale, that our approach improves over the production model in exact and substitute products retrieved.
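The bi-encoder retrieval pattern mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' method: `toy_encode` is a hypothetical hashed character-trigram encoder standing in for a BERT tower, `DIM` is an arbitrary embedding size, and exact brute-force scoring stands in for approximate nearest neighbor search. The point it shows is that queries and products are encoded independently, so product embeddings can be computed offline and only the query is encoded at request time.

```python
import math
import zlib
from typing import List, Tuple

DIM = 64  # toy embedding size (assumption, not taken from the paper)


def toy_encode(text: str, dim: int = DIM) -> List[float]:
    """Hypothetical stand-in for a BERT tower: hashed character trigrams, L2-normalized."""
    vec = [0.0] * dim
    padded = f"#{text.lower()}#"
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(trigram.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(products: List[str]) -> List[Tuple[str, List[float]]]:
    """Encode every product once, offline; the query is never seen here."""
    return [(title, toy_encode(title)) for title in products]


def search(query: str, index: List[Tuple[str, List[float]]], k: int = 2) -> List[Tuple[float, str]]:
    """Encode the query alone, then rank products by dot product (cosine similarity here).

    Exact scoring over all products is used for clarity; a production system
    would replace this loop with an approximate nearest neighbor index.
    """
    q = toy_encode(query)
    scored = [(sum(a * b for a, b in zip(q, emb)), title) for title, emb in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    index = build_index(["wireless mouse", "running shoes", "coffee maker"])
    for score, title in search("runing shoes", index):
        print(f"{score:.3f}  {title}")
```

Because a misspelled query still shares most of its trigrams with the intended product title, an embedding-based match of this kind tends to be less sensitive to spelling variants than exact lexical matching, which is the motivation the abstract gives for dense retrieval.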
Pages: 73-85
Number of pages: 13