Network Support for High-Performance Distributed Machine Learning

Cited by: 6

Authors:

Malandrino, Francesco ^{[1,2]}

Chiasserini, Carla Fabiana ^{[1,3]}

Molner, Nuria ^{[4,5]}

de la Oliva, Antonio ^{[6]}

Affiliations:

[1] CNR, IEIIT, I-10129 Turin, Italy

[2] CNIT, I-43124 Parma, Italy

[3] Politecn Torino, Dept Elect & Telecommun, I-10129 Turin, Italy

[4] Univ Carlos III Madrid, IMDEA Networks Inst, Madrid 28903, Spain

[5] Univ Politecn Valencia iTEAM UPV, Inst Univ Telecomunicac & Aplicac Multimedia, Valencia 46022, Spain

[6] Univ Carlos III Madrid, Dept Telemat Engn, Madrid 28903, Spain

Source:

IEEE-ACM TRANSACTIONS ON NETWORKING | 2023 / Vol. 31 / No. 01

Keywords:

Task analysis; Topology; Network topology; Data models; Costs; Machine learning; Training; Network orchestration; machine learning; edge computing; EDGE;

DOI:

10.1109/TNET.2022.3189077

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812 ;

Abstract:

The traditional approach to distributed machine learning is to adapt learning algorithms to the network, e.g., reducing updates to curb overhead. Networks based on intelligent edge, instead, make it possible to follow the opposite approach, i.e., to define the logical network topology around the learning task to perform, so as to meet the desired learning performance. In this paper, we propose a system model that captures such aspects in the context of supervised machine learning, accounting for both learning nodes (that perform computations) and information nodes (that provide data). We then formulate the problem of selecting (i) which learning and information nodes should cooperate to complete the learning task, and (ii) the number of epochs to run, in order to minimize the learning cost while meeting the target prediction error and execution time. After proving important properties of the above problem, we devise an algorithm, named DoubleClimb, that can find a (1 + 1/|I|)-competitive solution (with I being the set of information nodes), with cubic worst-case complexity. Our performance evaluation, leveraging a real-world network topology and considering both classification and regression tasks, also shows that DoubleClimb closely matches the optimum, outperforming state-of-the-art alternatives.

Pages: 264 - 278

Number of pages: 15
