Learning Query and Document Relevance from a Web-scale Click Graph

Cited by: 32
Authors
Jiang, Shan [1]
Hu, Yuening [2]
Kang, Changsung [2]
Daly, Tim, Jr. [2]
Yin, Dawei [2]
Chang, Yi [2]
Zhai, Chengxiang [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[2] Yahoo Res, Sunnyvale, CA USA
Keywords
Click-through bipartite graph; vector propagation; vector generation; Web search; query-document relevance
DOI
10.1145/2911451.2911531
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Click-through logs over query-document pairs provide rich and valuable information for multiple tasks in information retrieval. This paper proposes a vector propagation algorithm on the click graph to learn vector representations for both queries and documents in the same semantic space. The proposed approach incorporates both click and content information, and the produced vector representations can directly improve ranking performance for queries and documents that have been observed in the click log. For new queries and documents that are not in the click log, we propose a two-step framework to generate the vector representation, which significantly improves the coverage of our vectors while maintaining their high quality. Experiments on Web-scale search logs from a major commercial search engine demonstrate the effectiveness and scalability of the proposed method. Evaluation results show that NDCG scores are significantly improved against multiple baselines by using the proposed method both as a ranking model and as a feature in a learning-to-rank framework.
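The abstract does not spell out the propagation steps, so the following is only a minimal sketch of the general idea of vector propagation on a click-through bipartite graph, not the authors' algorithm. It assumes that query vectors start as bag-of-words vectors over the query text, that edges are weighted by raw click counts, that vectors are L2-normalized after each step, and that a fixed number of iterations is run. All function names, parameters, and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Return a copy of a sparse vector (term -> weight) scaled to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    if norm == 0.0:
        return dict(vec)
    return {term: w / norm for term, w in vec.items()}


def propagate(source_vecs, edges):
    """Build each target vector as the click-weighted sum of its neighbors' vectors.

    edges maps target -> list of (source, clicks). Weighting by raw click
    counts is an assumption made for this sketch.
    """
    out = {}
    for target, neighbors in edges.items():
        acc = defaultdict(float)
        for source, clicks in neighbors:
            for term, w in source_vecs.get(source, {}).items():
                acc[term] += clicks * w
        out[target] = l2_normalize(acc)
    return out


def learn_vectors(clicks, query_text, iterations=3):
    """Alternate query -> document and document -> query propagation."""
    q_to_d = defaultdict(list)  # query -> [(doc, clicks)]
    d_to_q = defaultdict(list)  # doc -> [(query, clicks)]
    for (q, d), count in clicks.items():
        q_to_d[q].append((d, count))
        d_to_q[d].append((q, count))

    # Content information enters through the initial query vectors (assumed).
    q_vecs = {}
    for q, text in query_text.items():
        counts = defaultdict(float)
        for term in text.split():
            counts[term] += 1.0
        q_vecs[q] = l2_normalize(counts)

    d_vecs = {}
    for _ in range(iterations):
        d_vecs = propagate(q_vecs, d_to_q)  # documents inherit query terms
        q_vecs = propagate(d_vecs, q_to_d)  # queries are refined from documents
    return q_vecs, d_vecs


def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


# Toy click log: (query id, document id) -> click count.
clicks = {("q1", "d1"): 5, ("q2", "d1"): 3, ("q2", "d2"): 1, ("q3", "d3"): 4}
query_text = {"q1": "cheap flights", "q2": "flight deals", "q3": "pasta recipe"}

q_vecs, d_vecs = learn_vectors(clicks, query_text)
related = cosine(q_vecs["q1"], d_vecs["d1"])
unrelated = cosine(q_vecs["q1"], d_vecs["d3"])
```

In this toy graph, `q1` and `d1` share a connected component and end up with overlapping terms, while `d3` is reachable only from `q3`, so its vector shares no terms with `q1`. Because queries and documents live in the same term space, a cosine score of this kind could serve either as a ranking score or as one feature among others, which is the usage the abstract describes.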
Pages: 185-194
Page count: 10
Related papers
50 in total
  • [31] MARES: multitask learning algorithm for Web-scale real-time event summarization
    Yang, Min
    Tu, Wenting
    Qu, Qiang
    Lei, Kai
    Chen, Xiaojun
    Zhu, Jia
    Shen, Ying
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2019, 22 (02): 499-515
  • [32] M2GRL: A Multi-task Multi-view Graph Representation Learning Framework for Web-scale Recommender Systems
    Wang, Menghan
    Lin, Yujie
    Lin, Guli
    Yang, Keping
    Wu, Xiao-ming
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020: 2349-2358
  • [33] CWRCzech: 100M Query-Document Czech Click Dataset and Its Application to Web Relevance Ranking
    Vonasek, Josef
    Straka, Milan
    Krc, Rostislav
    Lasonova, Lenka
    Egorova, Ekaterina
    Strakova, Jana
    Naplava, Jakub
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024: 1221-1231
  • [34] Building a Web-Scale Dependency-Parsed Corpus from Common Crawl
    Panchenko, Alexander
    Ruppert, Eugen
    Faralli, Stefano
    Ponzetto, Simone P.
    Biemann, Chris
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018: 1816-1823
  • [35] Efficient Learning to Learn a Robust CTR Model for Web-scale Online Sponsored Search Advertising
    Wang, Xin
    Yang, Peng
    Chen, Shaopeng
    Liu, Lin
    Zhao, Lian
    Guo, Jiacheng
    Sun, Mingming
    Li, Ping
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021: 4203-4213
  • [36] Evento 360: Social Event Discovery from Web-scale Multimedia Collection
    Choi, Jaeyoung
    Kim, Eungchan
    Larson, Martha
    Friedland, Gerald
    Hanjalic, Alan
    MM'15: PROCEEDINGS OF THE 2015 ACM MULTIMEDIA CONFERENCE, 2015: 193-196
  • [37] Learning a unified embedding space of web search from large-scale query log
    Bing, Lidong
    Niu, Zheng-Yu
    Li, Piji
    Lam, Wai
    Wang, Haifeng
    KNOWLEDGE-BASED SYSTEMS, 2018, 150: 38-48
  • [38] MPGraf: a Modular and Pre-trained Graphformer for Learning to Rank at Web-Scale (Extended Abstract)
    Li, Yuchen
    Xiong, Haoyi
    Kong, Linghe
    Sun, Zeyi
    Chen, Hongyang
    Wang, Shuaiqiang
    Yin, Dawei
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024: 8439-8443
  • [39] Improving Image Recognition by Retrieving from Web-Scale Image-Text Data
    Iscen, Ahmet
    Fathi, Alireza
    Schmid, Cordelia
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 19295-19304
  • [40] Beyond Bag-of-Words: Machine Learning for Query-Document Matching in Web Search
    Li, Hang
    Xu, Jun
    SIGIR 2012: PROCEEDINGS OF THE 35TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2012: 1177