Asynchronous Distributed ADMM for Learning with Large-Scale and High-Dimensional Sparse Data Set

Cited by: 2
Authors
Wang, Dongxia [1]
Lei, Yongmei [1]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, 333 Nanchen Rd, Shanghai 200436, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
GA-ADMM; General form consensus; Bounded asynchronous; Non-convex
DOI
10.1007/978-3-030-36405-2_27
Chinese Library Classification number
TP301 [Theory, methods]
Subject classification code
081202
Abstract
The distributed alternating direction method of multipliers (ADMM) is an effective method for solving large-scale machine learning problems. At present, most distributed ADMM algorithms transfer the entire model parameter vector in each communication, which leads to high communication cost, especially when the model has a very large number of features. In this paper, an asynchronous distributed ADMM algorithm (GA-ADMM) based on general form consensus is proposed. First, GA-ADMM filters the information transmitted between nodes by analyzing the characteristics of high-dimensional sparse data sets: only the associated features, rather than all features of the model, need to be transmitted between the workers and the master, which greatly reduces the communication cost. Second, a bounded asynchronous communication protocol is used to further improve the performance of the algorithm. The convergence of the algorithm is also analyzed theoretically for the case where the objective function is non-convex. Finally, the algorithm is tested on the cluster supercomputer "Ziqiang 4000". The experiments show that GA-ADMM converges when appropriate parameters are selected, that it requires less system time to reach convergence than the AD-ADMM algorithm, and that the two algorithms achieve approximately the same accuracy.
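The feature-filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`active_features`, `pack`, `master_average`, `within_staleness_bound`) are hypothetical, the master update is a plain per-feature average that omits the dual variables and penalty parameter of a full ADMM step, and the staleness check is only one common reading of "bounded asynchronous", assumed here for illustration.

```python
import numpy as np

def active_features(X_local):
    """Indices of columns with at least one non-zero entry in a worker's shard."""
    return np.flatnonzero(np.any(X_local != 0, axis=0))

def pack(x_full, idx):
    """Worker side: send only the components its shard actually touches."""
    return idx, x_full[idx]

def master_average(messages, z_prev):
    """Master side (simplified): average each feature over the workers that
    sent it. Features that no worker sent keep their previous value."""
    total = np.zeros_like(z_prev, dtype=float)
    count = np.zeros_like(z_prev, dtype=float)
    for idx, vals in messages:
        np.add.at(total, idx, vals)   # accumulate the received values
        np.add.at(count, idx, 1)      # count the senders per feature
    z = z_prev.astype(float).copy()
    seen = count > 0
    z[seen] = total[seen] / count[seen]
    return z

def within_staleness_bound(worker_iters, bound):
    """Assumed rule: the fastest worker may lead the slowest one by at most
    `bound` iterations before it has to wait."""
    return max(worker_iters) - min(worker_iters) <= bound

# Two workers holding sparse shards over a 4-feature model.
X1 = np.array([[1.0, 0.0, 0.0, 2.0], [0.0, 0.0, 0.0, 3.0]])
X2 = np.array([[0.0, 5.0, 0.0, 1.0]])
msg1 = pack(np.array([1.0, 9.0, 9.0, 4.0]), active_features(X1))
msg2 = pack(np.array([9.0, 2.0, 9.0, 6.0]), active_features(X2))
z = master_average([msg1, msg2], z_prev=np.array([0.0, 0.0, 7.0, 0.0]))
```

In this toy setup each worker sends two values instead of four. Only feature 3 is shared by both workers, so it is the only one that is actually averaged. The saving grows with the sparsity of the shards.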
Pages: 259-274
Page count: 16
Related papers
50 in total
  • [41] Visualizing large-scale high-dimensional data via hierarchical embedding of KNN graphs
    Zhu, Haiyang
    Zhu, Minfeng
    Feng, Yingchaojie
    Cai, Deng
    Hu, Yuanzhe
    Wu, Shilong
    Wu, Xiangyang
    Chen, Wei
    VISUAL INFORMATICS, 2021, 5 (02) : 51 - 59
  • [42] Monitoring high-dimensional data for failure detection and localization in large-scale computing systems
    Chen, Haifeng
    Jiang, Guofei
    Yoshihira, Kenji
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2008, 20 (01) : 13 - 25
  • [43] Large-Scale Online Feature Selection for Ultra-High Dimensional Sparse Data
    Wu, Yue
    Hoi, Steven C. H.
    Mei, Tao
    Yu, Nenghai
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2017, 11 (04)
  • [44] AliGater: a framework for the development of bioinformatic pipelines for large-scale, high-dimensional cytometry data
    Ekdahl, Ludvig
    Arrizabalaga, Antton Lamarca
    Ali, Zain
    Cafaro, Caterina
    de Lapuente Portilla, Aitzkoa Lopez
    Nilsson, Bjorn
    NEURO-ONCOLOGY ADVANCES, 2023, 5 (01)
  • [45] Large-Scale Automatic Feature Selection for Biomarker Discovery in High-Dimensional OMICs Data
    Leclercq, Mickael
    Vittrant, Benjamin
    Martin-Magniette, Marie Laure
    Boyer, Marie Pier Scott
    Perin, Olivier
    Bergeron, Alain
    Fradet, Yves
    Droit, Arnaud
    FRONTIERS IN GENETICS, 2019, 10
  • [46] Grid-based indexing and search algorithms for large-scale and high-dimensional data
    Yang, Chuanfu
    Li, Zhiyang
    Qu, Wenyu
    Liu, Zhaobin
    Qi, Heng
    2017 14TH INTERNATIONAL SYMPOSIUM ON PERVASIVE SYSTEMS, ALGORITHMS AND NETWORKS & 2017 11TH INTERNATIONAL CONFERENCE ON FRONTIER OF COMPUTER SCIENCE AND TECHNOLOGY & 2017 THIRD INTERNATIONAL SYMPOSIUM OF CREATIVE COMPUTING (ISPAN-FCST-ISCC), 2017, : 46 - 51
  • [47] Efficient ML Lifecycle Transferring for Large-Scale and High-Dimensional Data via Core Set-Based Dataset Similarity
    Le, Van-Duc
    Bui, Tien-Cuong
    Li, Wen-Syan
    IEEE ACCESS, 2023, 11 : 73823 - 73838
  • [48] Batched Large-scale Bayesian Optimization in High-dimensional Spaces
    Wang, Zi
    Gehring, Clement
    Kohli, Pushmeet
    Jegelka, Stefanie
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [49] Parallel algorithms for clustering high-dimensional large-scale datasets
    Nagesh, H
    Goil, S
    Choudhary, A
    DATA MINING FOR SCIENTIFIC AND ENGINEERING APPLICATIONS, 2001, 2 : 335 - 356
  • [50] On the anonymization of sparse high-dimensional data
    Ghinita, Gabriel
    Tao, Yufei
    Kalnis, Panos
    2008 IEEE 24TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2008, : 715 - +