A Parallel and Efficient Algorithm for Learning to Match

Cited by: 3
Authors
Shang, Jingbo [1 ,4 ]
Chen, Tianqi [2 ]
Li, Hang [3 ]
Lu, Zhengdong [3 ]
Yu, Yong [4 ]
Affiliations
[1] Univ Illinois, Champaign, IL 61801 USA
[2] Univ Washington, Seattle, WA 98195 USA
[3] Huawei Noahs Ark Lab, Hong Kong, Hong Kong, Peoples R China
[4] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Source
2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM) | 2014
Keywords
MATRIX FACTORIZATION;
DOI
10.1109/ICDM.2014.71
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many tasks in data mining and related fields can be formalized as matching between objects in two heterogeneous domains, including collaborative filtering, link prediction, image tagging, and web search. Machine learning techniques, referred to as learning-to-match in this paper, have been successfully applied to these problems. Among them, a class of state-of-the-art methods, named feature-based matrix factorization, formalizes the task as an extension of matrix factorization by incorporating auxiliary features into the model. Unfortunately, making those algorithms scale to real-world problems is challenging, and simple parallelization strategies fail due to the complex cross-talk patterns between sub-tasks. In this paper, we tackle this challenge with a novel parallel and efficient algorithm. Our algorithm, based on coordinate descent, can easily handle hundreds of millions of instances and features on a single machine. The key recipe of this algorithm is an iterative relaxation of the objective to facilitate parallel updates of parameters, with guaranteed convergence on minimizing the original objective function. Experimental results demonstrate that the proposed method is effective on a wide range of matching problems, with efficiency significantly improved over the baselines while accuracy remains unchanged.
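The abstract names two ingredients, feature-based matrix factorization and coordinate descent. The sketch below illustrates only those two ideas in their simplest sequential form; it is not the paper's parallel algorithm and does not implement its iterative relaxation. The scoring form (an inner product of two feature-weighted sums of latent vectors), the squared loss, the L2 penalty `lam`, and all function names are assumptions made for illustration.

```python
def latent(features, factors):
    """Feature-weighted sum of latent vectors: sum_i features[i] * factors[i]."""
    k = len(factors[0])
    out = [0.0] * k
    for i, x in features.items():
        for d in range(k):
            out[d] += x * factors[i][d]
    return out

def predict(x, y, U, V):
    """Assumed matching score: inner product of the two latent representations."""
    a, b = latent(x, U), latent(y, V)
    return sum(ad * bd for ad, bd in zip(a, b))

def loss(data, U, V, lam):
    """Squared error plus L2 penalty (an assumed objective)."""
    sq = sum((t - predict(x, y, U, V)) ** 2 for x, y, t in data)
    reg = lam * sum(w * w for M in (U, V) for row in M for w in row)
    return sq + reg

def update_coordinate(data, U, V, i, d, lam):
    """Exactly minimize the objective over the single scalar U[i][d].

    The prediction is linear in U[i][d], so the optimal step has a closed form.
    Requires lam > 0 so the denominator is never zero.
    """
    num, den = -lam * U[i][d], lam
    for x, y, t in data:
        xi = x.get(i, 0.0)
        if xi == 0.0:
            continue  # this instance does not depend on U[i][d]
        g = xi * latent(y, V)[d]          # d(prediction) / d(U[i][d])
        r = t - predict(x, y, U, V)       # current residual
        num += g * r
        den += g * g
    U[i][d] += num / den

def sweep(data, U, V, lam):
    """One sequential pass over every coordinate of U, then of V."""
    flipped = [(y, x, t) for x, y, t in data]  # the score is symmetric in the two sides
    for i in range(len(U)):
        for d in range(len(U[0])):
            update_coordinate(data, U, V, i, d, lam)
    for j in range(len(V)):
        for d in range(len(V[0])):
            update_coordinate(flipped, V, U, j, d, lam)
```

Each instance is a triple `(x, y, t)` of two sparse feature dictionaries and a target. Because every update is an exact one-dimensional minimization, the objective cannot increase from one update to the next. The updates here are strictly sequential, since each one reads the latest residuals, which hints at why naive parallelization of such coordinate updates is not straightforward.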
Pages: 971 - 976
Page count: 6
Related papers
50 items in total
  • [21] Efficient parallel modular exponentiation algorithm
    Nedjah, N
    Mourelle, LD
    ADVANCES IN INFORMATION SYSTEMS, 2002, 2457 : 405 - 414
  • [22] An efficient fully parallel thinning algorithm
    Han, NH
    La, CW
    Rhee, PK
    PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, 1997, : 137 - 141
  • [23] Efficient Parallel Propagation Algorithm for FDE
    Li Z.
    Yu Z.-Z.
    Li Z.-S.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (09): : 4153 - 4166
  • [24] An efficient parallel termination detection algorithm
    Baker, A. H.
    Crivelli, S.
    Jessup, E. R.
    INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2006, 21 (04) : 293 - 301
  • [25] AN EFFICIENT PARALLEL CRITICAL PATH ALGORITHM
    LIU, LR
    DU, DHC
    CHEN, HC
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 1994, 13 (07) : 909 - 919
  • [26] Efficient MIMD parallel DFP algorithm
    Wang, Yanchun
    Zhu, Mingwu
    Nanjing Li Gong Daxue Xuebao/Journal of Nanjing University of Science and Technology, 1994, (06): : 1 - 5
  • [27] The pyramid match kernel: Efficient learning with sets of features
    Grauman, Kristen
    Darrell, Trevor
    JOURNAL OF MACHINE LEARNING RESEARCH, 2007, 8 : 725 - 760
  • [29] Cross-Parallel Attention and Efficient Match Transformer for Aerial Tracking
    Deng, Anping
    Han, Guangliang
    Zhang, Zhongbo
    Chen, Dianbing
    Ma, Tianjiao
    Liu, Zhichao
    REMOTE SENSING, 2024, 16 (06)
  • [30] An efficient parallel neural network-based multi-instance learning algorithm
    Li, Cheng Hua
    Gondra, Iker
    Liu, Lijun
    JOURNAL OF SUPERCOMPUTING, 2012, 62 (02): : 724 - 740