Learning Large Scale Ordinal Ranking Model via Divide-and-Conquer Technique

Cited: 0
Authors
Tang, Lu [1 ]
Chaudhuri, Sougata [2 ]
Bagherjeiran, Abraham [2 ]
Zhou, Ling [1 ]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] A9 Com Inc, Palo Alto, CA USA
Keywords
Binary Classification; Ordinal Ranking; Big Data;
DOI
10.1145/3184558.3191658
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Structured prediction, where outcomes have a precedence order, lies at the heart of machine learning for information retrieval, movie recommendation, product review prediction, and digital advertising. Ordinal ranking, in particular, assumes that the structured response has a linearly ranked order. Due to the broad applicability of these models, substantial research has been devoted to understanding them and to developing efficient training techniques. One popular and widely cited technique for training ordinal ranking models exploits the linear precedence order to systematically reduce the problem to binary classification. This enables the use of readily available, powerful binary classifiers, but requires expanding the original training data to K - 1 times its size, where K is the number of ordinal classes. Because problems with a large number of ordered classes are common, the reduction produces datasets too large to train on a single machine. While approximation methods such as stochastic gradient descent are typically applied here, we investigate exact optimization solutions that can scale. In this paper, we present a divide-and-conquer (DC) algorithm that distributes the large-scale binary classification data across a cluster of machines, trains logistic models in parallel, and combines them at the end of the training phase into a single binary classifier, which can then be used as an ordinal ranker. The algorithm requires no synchronization between the parallel learners during training, which makes training on large datasets feasible and efficient. We prove consistency and asymptotic normality of the models learned using our proposed algorithm. We provide empirical evidence, on various ordinal datasets, that the model learned with our algorithm achieves better estimation and prediction performance than several standard divide-and-conquer algorithms.
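The two components described in the abstract — the threshold-based reduction of an ordinal problem to a single binary problem, and divide-and-conquer training of logistic models on disjoint shards — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `ordinal_to_binary` and `dc_logistic` are invented here, and simple coefficient averaging is used as the combination rule, whereas the paper's exact combination step may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ordinal_to_binary(X, y, K):
    """Reduce an ordinal problem with K classes to one binary problem.

    Each example (x, y) is expanded into K - 1 examples: for each
    threshold k = 1..K-1, the features are augmented with a one-hot
    threshold indicator and the binary label is 1 if y > k, else 0.
    This is the K-1 fold data expansion mentioned in the abstract.
    """
    n, d = X.shape
    Xb = np.zeros(((K - 1) * n, d + K - 1))
    yb = np.zeros((K - 1) * n, dtype=int)
    for k in range(1, K):
        lo, hi = (k - 1) * n, k * n
        Xb[lo:hi, :d] = X
        Xb[lo:hi, d + k - 1] = 1.0          # threshold indicator for k
        yb[lo:hi] = (y > k).astype(int)
    return Xb, yb

def dc_logistic(Xb, yb, n_parts=4, seed=0):
    """Divide-and-conquer: fit logistic models on disjoint shards
    (no synchronization between them) and average the fitted
    coefficients into a single binary classifier."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(yb))
    coefs, intercepts = [], []
    for shard in np.array_split(idx, n_parts):  # one shard per "machine"
        clf = LogisticRegression(max_iter=1000).fit(Xb[shard], yb[shard])
        coefs.append(clf.coef_[0])
        intercepts.append(clf.intercept_[0])
    return np.mean(coefs, axis=0), np.mean(intercepts)
```

To rank a new example, one scores it against each of the K - 1 thresholds and predicts rank 1 plus the number of thresholds whose binary score is positive.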
Pages: 1901-1909
Number of pages: 9
Related Papers
50 records in total
  • [21] A Divide-and-Conquer Evolutionary Algorithm for Large-Scale Virtual Network Embedding
    Song, An
    Chen, Wei-Neng
    Gong, Yue-Jiao
    Luo, Xiaonan
    Zhang, Jun
    IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2020, 24 (03) : 566 - 580
  • [22] Gaussian Process Learning: A Divide-and-Conquer Approach
    Li, Wenye
    ADVANCES IN NEURAL NETWORKS - ISNN 2014, 2014, 8866 : 262 - 269
  • [23] CASCADING DIVIDE-AND-CONQUER - A TECHNIQUE FOR DESIGNING PARALLEL ALGORITHMS
    ATALLAH, MJ
    COLE, R
    GOODRICH, MT
    SIAM JOURNAL ON COMPUTING, 1989, 18 (03) : 499 - 532
  • [24] A Divide-and-Conquer Method for Scalable Robust Multitask Learning
    Pan, Yan
    Xia, Rongkai
    Yin, Jian
    Liu, Ning
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (12) : 3163 - 3175
  • [25] Divide-and-conquer approximation algorithms via spreading metrics
    Even, G
    Naor, J
    Rao, S
    Schieber, B
    JOURNAL OF THE ACM, 2000, 47 (04) : 585 - 616
  • [26] Large-scale image colorization based on divide-and-conquer support vector machines
    Chen, Bo-Wei
    He, Xinyu
    Ji, Wen
    Rho, Seungmin
    Kung, Sun-Yuan
    JOURNAL OF SUPERCOMPUTING, 2016, 72 (08): : 2942 - 2961
  • [27] Privacy-Preserving and Secure Divide-and-Conquer Learning
    Brown, Lewis C. L.
    Li, Qinghua
    2022 IEEE/ACM 7TH SYMPOSIUM ON EDGE COMPUTING (SEC 2022), 2022, : 510 - 515
  • [28] A divide-and-conquer approach to geometric sampling for active learning
    Cao, Xiaofeng
    EXPERT SYSTEMS WITH APPLICATIONS, 2020, 140 (140)
  • [29] Divide-and-Conquer Learning with Nystrom: Optimal Rate and Algorithm
    Yin, Rong
    Liu, Yong
    Lu, Lijing
    Wang, Weiping
    Meng, Dan
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 6696 - 6703