A Framework for Scheduling and Managing Big Data Applications in a Distributed Infrastructure

Cited by: 0
Authors
Govindarajan, Kannan [1]
Somasundaram, Thamarai Selvi [2]
Boulanger, David [1]
Kumar, Vivekanandan Suresh [1]
Kinshuk [1]
Affiliations
[1] Athabasca Univ, Edmonton, AB, Canada
[2] Anna Univ, Madras, Tamil Nadu, India
Keywords
big data; grid computing; cloud computing; cluster computing; software defined networking; distributed processing; Hadoop Distributed File System
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Big data has attracted attention from researchers, industry, education, and the scientific community. Big data analytics must deal with large-scale data comprising both structured and unstructured data. These data must be handled properly, that is, extracted, processed, and analyzed to obtain meaningful information within a limited time. To yield insightful information, big data analytics requires high-performance computing, storage, and network resources. Hence, it is essential to design a high-performance computing infrastructure with sufficient bandwidth that can handle big data processing efficiently. However, the current network architectures in such infrastructures, with their predefined network policies, do not allow the just-in-time reconfiguration of the networking infrastructure that big data analytics demands. To address these limitations, Software-Defined Networking (SDN) offers the means to configure network parameters dynamically, provision networks dynamically, and slice the network on demand. This research aims to characterize SDN with respect to the demands of big data analytics on Cluster, Grid, and Cloud Computing resources. The main motivation of this study is to design and develop an intelligent framework, named the Big Data Analytics Management System (BDAMS), for collectively managing the compute, storage, and network resources in Cluster, Grid, and Cloud infrastructures for big data analytics.
Pages: 6
Related papers
50 in total
  • [31] Scheduling in Big Data Heterogeneous Distributed System Using Hadoop
    Thakkar, Shraddha
    Patel, Sanjay
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ICT FOR SUSTAINABLE DEVELOPMENT ICT4SD 2015, VOL 2, 2016, 409 : 119 - 131
  • [32] Efficient jobs scheduling approach for big data applications
    Shao, Yanling
    Li, Chunlin
    Gu, Jinguang
    Zhang, Jing
    Luo, Youlong
    COMPUTERS & INDUSTRIAL ENGINEERING, 2018, 117 : 249 - 261
  • [33] Cascaded TCP: BIG Throughput for BIG DATA Applications in Distributed HPC
    Kalim, Umar
    Gardner, Mark
    Brown, Eric
    Feng, Wu-chun
    2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC), 2012, : 1419 - +
  • [34] Cascaded TCP: BIG Throughput for BIG DATA Applications in Distributed HPC
    Kalim, Umar
    Gardner, Mark
    Brown, Eric
    Feng, Wu-chun
    2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC), 2012, : 1421 - 1421
  • [35] Privacy Issues in Big Data Mining Infrastructure, Platforms, and Applications
    Zhang, Xuyun
    Jang-Jaccard, Julian
    Qi, Lianyong
    Bhuiyan, Md Z. A.
    Liu, Chang
    SECURITY AND COMMUNICATION NETWORKS, 2018,
  • [36] Distributed Big Data Driven Framework for Cellular Network Monitoring Data
    Suleykin, Alexander
    Panfilov, Peter
    PROCEEDINGS OF THE 24TH CONFERENCE OF OPEN INNOVATIONS ASSOCIATION (FRUCT), 2019, : 430 - 436
  • [37] FSBD: A Framework for Scheduling of Big Data Mining in Cloud Computing
    Ismail, Leila
    Masud, Mohammad M.
    Khan, Latifur
    2014 IEEE INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS), 2014, : 513 - 520
  • [38] Agile-Ant: Self-managing Distributed Cache Management for Cost Optimization of Big Data Applications
    Al-Sayeh, Hani
    Jibril, Muhammad Attahir
    Sattler, Kai-Uwe
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (11): : 3151 - 3164
  • [39] PaPar: A Parallel Data Partitioning Framework for Big Data Applications
    Wang, Hao
    Zhang, Jing
    Zhang, Da
    Pumma, Sarunya
    Feng, Wu-chun
    2017 31ST IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2017, : 605 - 614
  • [40] Task scheduling in Distributed Data Mining for medical applications
    Gantenbein, RE
    Sung, CO
    COMPUTER APPLICATIONS IN INDUSTRY AND ENGINEERING, 2003, : 250 - 253