Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC

Cited by: 26
Authors
Ahn, Sungjin [1]
Korattikara, Anoop [2]
Liu, Nathan [3]
Rajan, Suju [3]
Welling, Max [4]
Affiliations
[1] Univ Calif Irvine, Irvine, CA 92697 USA
[2] Google, Mountain View, CA USA
[3] Yahoo Labs, Sunnyvale, CA USA
[4] Univ Amsterdam, Amsterdam, Netherlands
Keywords
Large-Scale; Distributed; Matrix Factorization; MCMC; Stochastic Gradient; Bayesian Inference
DOI
10.1145/2783258.2783373
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Despite attractive qualities such as high prediction accuracy, the ability to quantify uncertainty, and resistance to overfitting, Bayesian matrix factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian matrix factorization algorithm using stochastic gradient MCMC. Our algorithm, based on Distributed Stochastic Gradient Langevin Dynamics, not only matches the prediction accuracy of standard MCMC methods such as Gibbs sampling, but is also as fast and simple as stochastic gradient descent. In our experiments, we show that our algorithm achieves the same level of prediction accuracy as Gibbs sampling an order of magnitude faster. We also show that our method reduces the prediction error as fast as distributed stochastic gradient descent, achieving a 4.1% improvement in RMSE for the Netflix dataset and a 1.8% improvement for the Yahoo Music dataset.
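For readers unfamiliar with the technique named in the abstract, the following is a minimal single-machine sketch of a Stochastic Gradient Langevin Dynamics (SGLD) update applied to a Gaussian matrix factorization model. It is not the authors' distributed algorithm. The model (ratings with precision `tau`, zero-mean Gaussian priors with precision `lam`), the function names `sgld_step` and `rmse`, and all default values are assumptions made for illustration only.

```python
import math
import random


def sgld_step(U, V, ratings, batch_size, step, tau=1.0, lam=1.0, rng=random):
    """Apply one SGLD update to factor matrices U (users) and V (items), in place.

    Assumed model (illustrative, not taken from the paper):
      r_ij ~ Normal(U[i] . V[j], 1 / tau), entries of U and V ~ Normal(0, 1 / lam).
    `ratings` is a list of (user_index, item_index, rating) triples.
    """
    rank = len(U[0])
    batch = rng.sample(ratings, batch_size)
    # Rescale the minibatch likelihood gradient to estimate the full-data gradient.
    scale = len(ratings) / batch_size

    # Gradient of the log prior: -lam * theta for every parameter.
    grad_U = [[-lam * x for x in row] for row in U]
    grad_V = [[-lam * x for x in row] for row in V]

    # Add the rescaled gradient of the log likelihood over the minibatch.
    for i, j, r in batch:
        err = r - sum(U[i][k] * V[j][k] for k in range(rank))
        for k in range(rank):
            grad_U[i][k] += scale * tau * err * V[j][k]
            grad_V[j][k] += scale * tau * err * U[i][k]

    # Langevin update: half a gradient step plus Gaussian noise of variance `step`.
    noise_sd = math.sqrt(step)
    for params, grads in ((U, grad_U), (V, grad_V)):
        for row, grad_row in zip(params, grads):
            for k in range(rank):
                row[k] += 0.5 * step * grad_row[k] + rng.gauss(0.0, noise_sd)


def rmse(U, V, ratings):
    """Root-mean-square error of the predictions U[i] . V[j] on `ratings`."""
    sq_err = sum(
        (r - sum(a * b for a, b in zip(U[i], V[j]))) ** 2 for i, j, r in ratings
    )
    return math.sqrt(sq_err / len(ratings))


if __name__ == "__main__":
    rng = random.Random(0)
    n_users, n_items, rank = 20, 15, 2
    # Synthetic low-rank data; roughly half of the entries are observed.
    true_U = [[rng.gauss(0, 1) for _ in range(rank)] for _ in range(n_users)]
    true_V = [[rng.gauss(0, 1) for _ in range(rank)] for _ in range(n_items)]
    ratings = [
        (i, j, sum(a * b for a, b in zip(true_U[i], true_V[j])) + rng.gauss(0, 0.1))
        for i in range(n_users)
        for j in range(n_items)
        if rng.random() < 0.5
    ]
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    print("RMSE before sampling:", rmse(U, V, ratings))
    for _ in range(2000):
        sgld_step(U, V, ratings, batch_size=32, step=1e-4, rng=rng)
    print("RMSE of the last sample:", rmse(U, V, ratings))
```

In practice, predictions would be averaged over many samples collected after a burn-in period and not taken from the last sample alone. The sketch also updates every row at every step, which avoids the per-row bookkeeping a sparse or distributed implementation would need.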
Pages: 9-18
Number of pages: 10