A Parallel and Efficient Algorithm for Learning to Match

Cited by: 3
Authors
Shang, Jingbo [1, 4]
Chen, Tianqi [2]
Li, Hang [3]
Lu, Zhengdong [3]
Yu, Yong [4]
Affiliations
[1] Univ Illinois, Champaign, IL 61801 USA
[2] Univ Washington, Seattle, WA 98195 USA
[3] Huawei Noah's Ark Lab, Hong Kong, Hong Kong, Peoples R China
[4] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Source
2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM) | 2014
Keywords
MATRIX FACTORIZATION;
DOI
10.1109/ICDM.2014.71
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Many tasks in data mining and related fields, including collaborative filtering, link prediction, image tagging, and web search, can be formalized as matching between objects in two heterogeneous domains. Machine learning techniques, referred to as learning-to-match in this paper, have been successfully applied to these problems. Among them, a class of state-of-the-art methods, named feature-based matrix factorization, formalizes the task as an extension of matrix factorization by incorporating auxiliary features into the model. Unfortunately, making those algorithms scale to real-world problems is challenging, and simple parallelization strategies fail due to the complex cross-talk patterns between sub-tasks. In this paper, we tackle this challenge with a novel parallel and efficient algorithm. Our algorithm, based on coordinate descent, can easily handle hundreds of millions of instances and features on a single machine. The key recipe of this algorithm is an iterative relaxation of the objective to facilitate parallel updates of parameters, with guaranteed convergence on minimizing the original objective function. Experimental results demonstrate that the proposed method is effective on a wide range of matching problems, with efficiency significantly improved over the baselines while accuracy remains unchanged.
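To make the phrase "feature-based matrix factorization" in the abstract concrete, the sketch below shows one common way such a model scores a pair of objects: each side's latent vector is a feature-weighted sum of per-feature factor vectors, and the score is a bias term plus the inner product of the two latent vectors. This is an illustrative formulation only, not the paper's implementation. The function name `predict`, the parameter names, and the toy numbers are all assumptions, and the parallel coordinate-descent training procedure described in the abstract is not shown.

```python
from typing import Dict, List


def predict(user_feats: Dict[int, float],
            item_feats: Dict[int, float],
            P: List[List[float]],
            Q: List[List[float]],
            user_bias: List[float],
            item_bias: List[float],
            global_bias: float = 0.0) -> float:
    """Score one (user, item) pair with a feature-based MF model (hypothetical sketch).

    user_feats / item_feats map a feature index to its value (sparse vectors).
    P[f] and Q[f] are the latent factor vectors attached to feature f.
    """
    k = len(P[0])
    # Latent vector of each side: feature-weighted sum of per-feature factors.
    u = [sum(w * P[f][d] for f, w in user_feats.items()) for d in range(k)]
    v = [sum(w * Q[f][d] for f, w in item_feats.items()) for d in range(k)]
    # Linear (bias) part contributed by the features of both sides.
    linear = (sum(w * user_bias[f] for f, w in user_feats.items())
              + sum(w * item_bias[f] for f, w in item_feats.items()))
    # Interaction part: inner product of the two aggregated latent vectors.
    interaction = sum(a * b for a, b in zip(u, v))
    return global_bias + linear + interaction


# Toy parameters (made up for illustration).
P = [[1.0, 0.0], [0.0, 1.0]]
Q = [[0.5, 0.5], [1.0, -1.0]]
score = predict({0: 1.0, 1: 2.0}, {0: 1.0}, P, Q,
                user_bias=[0.1, 0.2], item_bias=[0.3, 0.0], global_bias=1.0)
```

With one-hot features on each side (a single active user index and a single active item index, zero biases), this reduces to the inner product of plain matrix factorization, which is the sense in which the model is an extension of it.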
Pages: 971-976
Number of pages: 6