TRAINING SAMPLE SELECTION FOR DEEP LEARNING OF DISTRIBUTED DATA

Cited by: 0
Authors
Jiang, Zheng [1 ]
Zhu, Xiaoqing [1 ]
Tan, Wai-tian [1 ]
Liston, Rob [1 ]
Affiliations
[1] Cisco Syst, Chief Technol & Architecture Off, San Jose, CA 95134 USA
Keywords
Deep neural networks; training sample selection; bandwidth-constrained learning
DOI
Not available
Chinese Library Classification (CLC)
TB8 [Photographic technology]
Subject Classification Code
0804
Abstract
The success of deep learning in the form of multi-layer neural networks depends critically on the volume and variety of training data. Its potential is greatly compromised when training data originate in a geographically distributed manner and are subject to bandwidth constraints. This paper presents a data sampling approach to deep learning that carefully discriminates among locally available training samples based on their relative importance. Towards this end, we propose two metrics for prioritizing candidate training samples as functions of their test-trial outcomes: correctness and confidence. Bandwidth-constrained simulations show significant performance gains of our proposed training sample selection schemes over conventional uniform sampling: up to 15x bandwidth reduction for the MNIST dataset and a 25% reduction in learning time for the CIFAR-10 dataset.
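The selection idea sketched in the abstract can be illustrated with a minimal, assumption-laden Python example. It ranks candidate samples by a hypothetical priority rule derived from the two stated metrics (sending misclassified samples first, then correctly classified samples in order of ascending confidence) and keeps only as many samples as a bandwidth budget allows. The function name select_samples and the exact scoring rule are illustrative assumptions, not the authors' formulas.

# Illustrative sketch only: the paper prioritizes candidate training samples by
# the correctness and confidence of their test-trial outcomes; the specific
# scoring rule below is an assumption, not the proposed scheme.
import numpy as np

def select_samples(probs, labels, budget):
    """Pick `budget` locally available samples to upload for training.

    probs  : (N, C) array of predicted class probabilities per candidate sample
    labels : (N,)   array of true class indices
    budget : number of samples the bandwidth constraint allows to send
    """
    preds = probs.argmax(axis=1)
    confidence = probs.max(axis=1)      # confidence metric: top-1 probability
    correct = (preds == labels)         # correctness metric: test-trial outcome

    # Assumed priority rule: misclassified samples first, then correct but
    # low-confidence samples; high-confidence correct samples are sent last.
    priority = np.where(correct, confidence, -1.0)
    order = np.argsort(priority)        # lowest priority value = most useful
    return order[:budget]

# Example: 6 candidate samples, 3 classes, bandwidth budget of 2 samples.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=6)
labels = rng.integers(0, 3, size=6)
print(select_samples(probs, labels, budget=2))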
Pages: 2189-2193
Number of pages: 5