Software Abstractions for Large-Scale Deep Learning Models in Big Data Analytics

Authors
Khan, Ayaz H. [1]
Qamar, Ali Mustafa [2]
Yusuf, Aneeq [1]
Khan, Rehanullah [2]
Affiliations
[1] Karachi Inst Econ & Technol, Coll Comp & Informat Sci, Karachi, Pakistan
[2] Qassim Univ, Coll Comp, Mulaidah, Saudi Arabia
Keywords
Big data; deep learning; deep auto-encoders; Restricted Boltzmann Machines (RBM)
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The goal of big data analytics is to analyze datasets of high volume, velocity, and variety for large-scale business intelligence problems. These workloads are normally distributed across massively parallel analytical systems. Deep learning is part of a broader family of machine learning methods based on learning representations of data. It plays a significant role in information analysis by adding value to massive amounts of unsupervised data. A core research area is the development of deep learning algorithms that automatically extract complex data formats at a higher level of abstraction from massive volumes of data. In this paper, we present the latest research trends in the development of parallel algorithms, optimization techniques, tools, and libraries for big data analytics and deep learning on various parallel architectures. The basic building blocks of deep learning, such as Restricted Boltzmann Machines (RBM) and Deep Belief Networks (DBN), are identified and analyzed for the parallelization of deep learning models. We propose a parallel software API based on PyTorch, the Hadoop Distributed File System (HDFS), Apache Hadoop MapReduce, and MapReduce Job (MRJob) for developing large-scale deep learning models. We obtained a reduction of about 5-30% in the execution time of the deep auto-encoder model, even on a single-node Hadoop cluster. Furthermore, the code complexity of creating multi-layer deep learning models is significantly reduced.
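The abstract does not show the proposed API. As a rough illustration of the general idea of training an auto-encoder with map and reduce steps, the sketch below uses only the Python standard library. It is not the authors' API. It assumes a tied-weight linear auto-encoder in place of a PyTorch model and plain functions in place of MRJob, and every name in it (`mapper`, `reducer`, `train_step`, `make_data`, `split`) is hypothetical.

```python
import random
from collections import defaultdict

def forward(W, x):
    """Tied-weight linear auto-encoder: code h = W x, reconstruction W^T h."""
    h = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    x_hat = [sum(W[i][j] * h[i] for i in range(len(W))) for j in range(len(x))]
    return h, x_hat

def loss(W, data):
    """Mean of 0.5 * squared reconstruction error over the dataset."""
    total = 0.0
    for x in data:
        _, x_hat = forward(W, x)
        total += 0.5 * sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)

def mapper(W, shard):
    """Map step: yield ((i, j), partial gradient) summed over one data shard."""
    grad = defaultdict(float)
    for x in shard:
        h, x_hat = forward(W, x)
        r = [a - b for a, b in zip(x_hat, x)]                    # residual
        Wr = [sum(w * rj for w, rj in zip(row, r)) for row in W]
        for i in range(len(W)):
            for j in range(len(x)):
                # Gradient of 0.5 * ||W^T W x - x||^2 with respect to W[i][j].
                grad[(i, j)] += h[i] * r[j] + Wr[i] * x[j]
    yield from grad.items()

def reducer(pairs):
    """Reduce step: sum the partial gradients that share a key."""
    total = defaultdict(float)
    for key, value in pairs:
        total[key] += value
    return total

def train_step(W, shards, lr):
    """One gradient-descent update built from per-shard map outputs."""
    grad = reducer(kv for shard in shards for kv in mapper(W, shard))
    n = sum(len(s) for s in shards)
    return [[W[i][j] - lr * grad[(i, j)] / n for j in range(len(W[0]))]
            for i in range(len(W))]

def make_data(n):
    """Toy data: 4-dimensional points lying in a 2-dimensional subspace."""
    points = []
    for _ in range(n):
        a, b = random.uniform(-1, 1), random.uniform(-1, 1)
        points.append([a, b, a, b])
    return points

def split(data, k):
    """Split the dataset into k shards, standing in for HDFS blocks."""
    return [data[i::k] for i in range(k)]

if __name__ == "__main__":
    random.seed(0)
    data = make_data(40)
    W = [[random.uniform(-0.1, 0.1) for _ in range(4)] for _ in range(2)]
    print("initial loss:", loss(W, data))
    for _ in range(200):
        W = train_step(W, split(data, 4), lr=0.05)
    print("final loss:", loss(W, data))
```

Because the reducer only sums per-key values, the update is the same however the data is sharded, up to floating-point rounding. In a real Hadoop deployment the mapper and reducer would run as separate tasks over HDFS blocks rather than in a single process.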
Pages: 557-566
Number of pages: 10