Strategies and Principles of Distributed Machine Learning on Big Data

Cited by: 99
Authors
Xing, Eric P. [1 ]
Ho, Qirong [1 ]
Xie, Pengtao [1 ]
Wei, Dai [1 ]
Affiliations
[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Keywords
Machine learning; Artificial intelligence; Big data; Big model; Distributed systems; Principles; Theory; Data-parallelism; Model-parallelism; REGRESSION; MODEL; SELECTION
DOI
10.1016/J.ENG.2016.02.008
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
The rise of big data has led to new demands for machine learning (ML) systems to learn complex models, with millions to billions of parameters, that promise adequate capacity to digest massive datasets and offer powerful predictive analytics (such as high-dimensional latent features, intermediate representations, and decision functions) thereupon. Running ML algorithms at such scales, on a distributed cluster with tens to thousands of machines, often requires significant engineering effort, and one might fairly ask whether such engineering truly falls within the domain of ML research. Taking the view that "big" ML systems can benefit greatly from ML-rooted statistical and algorithmic insights, and that ML researchers should therefore not shy away from such systems design, we discuss a series of principles and strategies distilled from our recent efforts on industrial-scale ML solutions. These principles and strategies span a continuum from application, to engineering, to theoretical research and development of big ML systems and architectures, with the goal of understanding how to make them efficient, generally applicable, and supported with convergence and scaling guarantees. They concern four key questions that traditionally receive little attention in ML research: How can an ML program be distributed over a cluster? How can ML computation be bridged with inter-machine communication? How can such communication be performed? What should be communicated between machines? By exposing underlying statistical and algorithmic characteristics unique to ML programs but not typically seen in traditional computer programs, and by dissecting successful cases to reveal how we have harnessed these principles to design and develop both high-performance distributed ML software and general-purpose ML frameworks, we present opportunities for ML researchers and practitioners to further shape and enlarge the area that lies between ML and systems.
(C) 2016 THE AUTHORS. Published by Elsevier LTD on behalf of Chinese Academy of Engineering and Higher Education Press Limited Company. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 179-195
Page count: 17
Related papers
50 results in total
  • [11] Machine Learning in Big Data
    Wang, Lidong
    Alexander, Cheryl Ann
    INTERNATIONAL JOURNAL OF MATHEMATICAL ENGINEERING AND MANAGEMENT SCIENCES, 2016, 1 (02) : 52 - 61
  • [12] Machine Learning on Big Data
    Condie, Tyson
    Mineiro, Paul
    Polyzotis, Neoklis
    Weimer, Markus
    2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2013, : 1242 - 1244
  • [13] Dynamic Distributed and Parallel Machine Learning algorithms for big data mining processing
    Djafri, Laouni
    DATA TECHNOLOGIES AND APPLICATIONS, 2022, 56 (04) : 558 - 601
  • [14] Small Data, Big Challenges: Pitfalls and Strategies for Machine Learning in Fatigue Detection
    Jeworutzki, Andre
    Schwarzer, Jan
    von Luck, Kai
    Stelldinger, Peer
    Draheim, Susanne
    Wang, Qi
    PROCEEDINGS OF THE 16TH ACM INTERNATIONAL CONFERENCE ON PERVASIVE TECHNOLOGIES RELATED TO ASSISTIVE ENVIRONMENTS, PETRA 2023, 2023, : 364 - 373
  • [15] Quality assurance strategies for machine learning applications in big data analytics: an overview
    Ogrizovic, Mihajlo
    Draskovic, Drazen
    Bojic, Dragan
    JOURNAL OF BIG DATA, 2024, 11 (01)
  • [16] Machine learning for big data analytics
    Oja, E. (erkki.oja@aalto.fi), 1600, Springer Verlag (384):
  • [17] Big data and machine learning in health
    Carvalho, D.
    Cruz, R.
    EUROPEAN JOURNAL OF PUBLIC HEALTH, 2020, 30 : 10 - 11
  • [18] Machine learning and big scientific data
    Hey, Tony
    Butler, Keith
    Jackson, Sam
    Thiyagalingam, Jeyarajan
    PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 2020, 378 (2166):
  • [19] Machine Learning under Big Data
    Shi, Chunhe
    Wu, Chengdong
    Han, Xiaowei
    Xie, Yinghong
    Li, Zhen
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON ELECTRONIC, MECHANICAL, INFORMATION AND MANAGEMENT SOCIETY (EMIM), 2016, 40 : 301 - 305
  • [20] Machine learning, big data, and neuroscience
    Pillow, Jonathan
    Sahani, Maneesh
    CURRENT OPINION IN NEUROBIOLOGY, 2019, 55 : III - IV