Learning Query and Document Relevance from a Web-scale Click Graph

Cited by: 32
Authors
Jiang, Shan [1]
Hu, Yuening [2]
Kang, Changsung [2]
Daly, Tim, Jr. [2]
Yin, Dawei [2]
Chang, Yi [2]
Zhai, Chengxiang [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[2] Yahoo Res, Sunnyvale, CA USA
Keywords
Click-through bipartite graph; vector propagation; vector generation; Web search; query-document relevance
DOI
10.1145/2911451.2911531
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Click-through logs over query-document pairs provide rich and valuable information for multiple tasks in information retrieval. This paper proposes a vector propagation algorithm on the click graph to learn vector representations for both queries and documents in the same semantic space. The proposed approach incorporates both click and content information, and the produced vector representations can directly improve ranking performance for queries and documents that have been observed in the click log. For new queries and documents that are not in the click log, we propose a two-step framework to generate the vector representation, which significantly improves the coverage of our vectors while maintaining the high quality. Experiments on Web-scale search logs from a major commercial search engine demonstrate the effectiveness and scalability of the proposed method. Evaluation results show that NDCG scores are significantly improved against multiple baselines by using the proposed method both as a ranking model and as a feature in a learning-to-rank framework.
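The propagation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes query vectors are initialized as bag-of-words vectors over query terms, that propagation is a click-weighted sum followed by L2 normalization, and that relevance is scored by cosine similarity. The function names (`propagate`, `learn_vectors`) and the toy click log are hypothetical.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a sparse {term: weight} vector to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else {}


def propagate(edges, source_vecs):
    """Push vectors across click edges: each target node receives the
    click-weighted sum of its neighbors' vectors, then is normalized."""
    sums = defaultdict(lambda: defaultdict(float))
    for (src, tgt), clicks in edges.items():
        for term, weight in source_vecs.get(src, {}).items():
            sums[tgt][term] += clicks * weight
    return {tgt: l2_normalize(vec) for tgt, vec in sums.items()}


def cosine(u, v):
    """Dot product of two sparse vectors (cosine if both are unit length)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def learn_vectors(click_counts, rounds=1):
    """click_counts maps (query, doc) -> number of clicks (assumed format)."""
    queries = {q for q, _ in click_counts}
    # Assumption: start from the words of each query as content information.
    query_vecs = {q: l2_normalize({t: 1.0 for t in q.split()}) for q in queries}
    reverse = {(d, q): c for (q, d), c in click_counts.items()}
    doc_vecs = {}
    for _ in range(rounds):
        doc_vecs = propagate(click_counts, query_vecs)  # queries -> documents
        query_vecs = propagate(reverse, doc_vecs)       # documents -> queries
    return query_vecs, doc_vecs


# Hypothetical toy click log.
toy_clicks = {
    ("cheap flights", "d1"): 3,
    ("flight deals", "d1"): 1,
    ("python tutorial", "d2"): 2,
}
query_vecs, doc_vecs = learn_vectors(toy_clicks)
score_d1 = cosine(query_vecs["flight deals"], doc_vecs["d1"])
score_d2 = cosine(query_vecs["flight deals"], doc_vecs["d2"])
```

In this toy log, "flight deals" shares no words with "cheap flights", yet after one round it picks up weight on "cheap" through the co-clicked document d1. That is the sense in which queries and documents end up in the same term space.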
Pages: 185-194
Page count: 10
Related papers
50 in total
  • [21] Robust and Distributed Web-Scale Near-Dup Document Conflation in Microsoft Academic Service
    Wu, Chieh-Han
    Song, Yang
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2015, : 2606 - 2611
  • [22] Meta-Learning for Query Conceptualization at Web Scale
    Han, Fred X.
    Niu, Di
    Chen, Haolan
    Guo, Weidong
    Yan, Shengli
    Long, Bowei
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 3064 - 3073
  • [23] MPGraf: a Modular and Pre-trained Graphformer for Learning to Rank at Web-scale
    Li, Yuchen
    Xiong, Haoyi
    Kong, Linghe
    Sun, Zeyi
    Chen, Hongyang
    Wang, Shuaiqiang
    Yin, Dawei
    23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, ICDM 2023, 2023, : 339 - 348
  • [24] WEAKLY SUPERVISED MULTISCALE-INCEPTION LEARNING FOR WEB-SCALE FACE RECOGNITION
    Cheng, Cheng
    Xing, Junliang
    Feng, Youji
    Liu, Pengcheng
    Shao, Xiaohu
    Li, Kai
    Zhou, Xiang-Dong
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 815 - 819
  • [26] An evaluation of multi-probe locality sensitive hashing for computing similarities over web-scale query logs
    Cormode, Graham
    Dasgupta, Anirban
    Goyal, Amit
    Lee, Chi Hoon
    PLOS ONE, 2018, 13 (01):
  • [27] Web-Scale Extension of RDF Knowledge Bases from Templated Websites
    Buehmann, Lorenz
    Usbeck, Ricardo
    Ngomo, Axel-Cyrille Ngonga
    Saleem, Muhammad
    Both, Andreas
    Crescenzi, Valter
    Merialdo, Paolo
    Qiu, Disheng
    SEMANTIC WEB - ISWC 2014, PT I, 2014, 8796 : 66 - 81
  • [28] Generalized zero-shot learning for action recognition with web-scale video data
    Liu, Kun
    Liu, Wu
    Ma, Huadong
    Huang, Wenbing
    Dong, Xiongxiong
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2019, 22 (02): : 807 - 824
  • [29] MARES: multitask learning algorithm for Web-scale real-time event summarization
    Min Yang
    Wenting Tu
    Qiang Qu
    Kai Lei
    Xiaojun Chen
    Jia Zhu
    Ying Shen
    World Wide Web, 2019, 22 : 499 - 515