Software Abstractions for Large-Scale Deep Learning Models in Big Data Analytics

Authors
Khan, Ayaz H. [1]
Qamar, Ali Mustafa [2]
Yusuf, Aneeq [1]
Khan, Rehanullah [2]
Affiliations
[1] Karachi Inst Econ & Technol, Coll Comp & Informat Sci, Karachi, Pakistan
[2] Qassim Univ, Coll Comp, Mulaidah, Saudi Arabia
Keywords
Big data; deep learning; deep auto-encoders; Restricted Boltzmann Machines (RBM)
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The goal of big data analytics is to analyze datasets of high volume, velocity, and variety for large-scale business intelligence problems. These workloads are normally distributed across massively parallel analytical systems. Deep learning is part of a broader family of machine learning methods based on learning representations of data. It plays a significant role in information analysis by adding value to massive amounts of unsupervised data. A core research area is the development of deep learning algorithms that automatically extract higher-level abstractions from complex data formats using massive volumes of data. In this paper, we present the latest research trends in the development of parallel algorithms, optimization techniques, tools, and libraries related to big data analytics and deep learning on various parallel architectures. The basic building blocks of deep learning, such as Restricted Boltzmann Machines (RBM) and Deep Belief Networks (DBN), are identified and analyzed for the parallelization of deep learning models. We propose a parallel software API based on PyTorch, the Hadoop Distributed File System (HDFS), Apache Hadoop MapReduce, and MapReduce Job (MRJob) for developing large-scale deep learning models. We obtained a reduction of about 5-30% in the execution time of the deep auto-encoder model, even on a single-node Hadoop cluster. Furthermore, the code complexity of creating multi-layer deep learning models is significantly reduced.
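As a rough illustration of the map/reduce style of training that the abstract alludes to, the sketch below trains a tiny linear auto-encoder with data-parallel gradient averaging. It is not the authors' API: it uses plain Python in place of PyTorch, MRJob, and HDFS, and the names `mapper`, `reducer`, and `train`, the single hidden unit, and the toy data are all assumptions made for this example.

```python
def mapper(shard, w, v):
    """Map phase: one worker turns its data shard into partial sums."""
    d = len(w)
    grad_w, grad_v, loss = [0.0] * d, [0.0] * d, 0.0
    for x in shard:
        h = sum(wi * xi for wi, xi in zip(w, x))       # encode to one hidden unit
        err = [vi * h - xi for vi, xi in zip(v, x)]    # reconstruction minus input
        back = sum(vi * ei for vi, ei in zip(v, err))  # dLoss/dh
        for i in range(d):
            grad_v[i] += err[i] * h
            grad_w[i] += back * x[i]
        loss += 0.5 * sum(e * e for e in err)
    yield "grad", (grad_w, grad_v, loss, len(shard))


def reducer(key, values):
    """Reduce phase: combine partial sums into mean gradients and mean loss."""
    values = list(values)
    n = sum(count for _, _, _, count in values)
    grad_w = [sum(col) / n for col in zip(*(gw for gw, _, _, _ in values))]
    grad_v = [sum(col) / n for col in zip(*(gv for _, gv, _, _ in values))]
    loss = sum(part for _, _, part, _ in values) / n
    return grad_w, grad_v, loss


def train(shards, dim, epochs=200, lr=0.5):
    """Driver: one map/reduce round per epoch, then a gradient-descent step."""
    w = [0.1] * dim  # encoder weights (hypothetical fixed initialisation)
    v = [0.1] * dim  # decoder weights
    history = []
    for _ in range(epochs):
        # Sequential stand-in for running the mappers in parallel.
        partials = [value for shard in shards for _, value in mapper(shard, w, v)]
        grad_w, grad_v, loss = reducer("grad", partials)
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        v = [vi - lr * g for vi, g in zip(v, grad_v)]
        history.append(loss)
    return w, v, history


# Toy rank-1 data: 40 points on the line through (0.6, 0.8), in 4 shards of 10.
points = [[0.6 * t, 0.8 * t] for t in (-1 + 2 * i / 39 for i in range(40))]
shards = [points[i:i + 10] for i in range(0, 40, 10)]
w, v, history = train(shards, dim=2)
print(f"mean loss: first epoch {history[0]:.4f}, last epoch {history[-1]:.6f}")
```

Because the reducer sums per-shard gradients and divides by the total sample count, the update matches what a single worker would compute on the full dataset, so this sketch only changes where the work is done, not the result. In a real Hadoop setting the shards would presumably be HDFS blocks and the mapper and reducer would run as separate tasks.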
Pages: 557-566
Page count: 10
Source: International Journal of Advanced Computer Science and Applications, 2019, 10 (04): 557-566