Network Support for High-Performance Distributed Machine Learning

Cited by: 6
Authors
Malandrino, Francesco [1, 2]
Chiasserini, Carla Fabiana [1, 3]
Molner, Nuria [4, 5]
de la Oliva, Antonio [6]
Affiliations
[1] CNR, IEIIT, I-10129 Turin, Italy
[2] CNIT, I-43124 Parma, Italy
[3] Politecn Torino, Dept Elect & Telecommun, I-10129 Turin, Italy
[4] Univ Carlos III Madrid, IMDEA Networks Inst, Madrid 28903, Spain
[5] Univ Politecn Valencia iTEAM UPV, Inst Univ Telecomunicac & Aplicac Multimedia, Valencia 46022, Spain
[6] Univ Carlos III Madrid, Dept Telemat Engn, Madrid 28903, Spain
Keywords
Task analysis; Topology; Network topology; Data models; Costs; Machine learning; Training; Network orchestration; machine learning; edge computing; EDGE;
DOI
10.1109/TNET.2022.3189077
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
The traditional approach to distributed machine learning is to adapt learning algorithms to the network, e.g., reducing updates to curb overhead. Networks based on intelligent edge, instead, make it possible to follow the opposite approach, i.e., to define the logical network topology around the learning task to perform, so as to meet the desired learning performance. In this paper, we propose a system model that captures such aspects in the context of supervised machine learning, accounting for both learning nodes (that perform computations) and information nodes (that provide data). We then formulate the problem of selecting (i) which learning and information nodes should cooperate to complete the learning task, and (ii) the number of epochs to run, in order to minimize the learning cost while meeting the target prediction error and execution time. After proving important properties of the above problem, we devise an algorithm, named DoubleClimb, that can find a (1 + 1/|I|)-competitive solution (with I being the set of information nodes), with cubic worst-case complexity. Our performance evaluation, leveraging a real-world network topology and considering both classification and regression tasks, also shows that DoubleClimb closely matches the optimum, outperforming state-of-the-art alternatives.
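Read as a guarantee on a cost-minimization problem, the competitive ratio stated in the abstract can be written as below. The symbols C_DC and C* are notation assumed here for the learning cost of the DoubleClimb solution and of the optimal solution, respectively; they are not taken from the paper.

```latex
% Assumed notation: C_{\mathrm{DC}} = cost of the DoubleClimb solution,
% C^{\star} = cost of the optimum, I = set of information nodes.
\[
  C_{\mathrm{DC}} \;\le\; \left(1 + \frac{1}{|I|}\right) C^{\star}
\]
```

Under this reading, with |I| = 10 information nodes the bound would be 1.1 times the optimal cost, and it tightens as |I| grows.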
Pages: 264-278
Number of pages: 15