Clustering-based data placement in cloud computing: a predictive approach

Cited by: 0
Authors
Mokhtar Sellami
Haithem Mezni
Mohand Said Hacid
Mohamed Moshen Gammoudi
Affiliations
[1] University of Jendouba
[2] Taibah University
[3] SMART Lab
[4] ISG de Tunis
[5] Univ. Lyon
[6] University Claude Bernard Lyon 1
[7] LIRIS
[8] Higher Institute of Multimedia Arts of Manouba
[9] RIADI
Source
Cluster Computing | 2021 / Volume 24
Keywords
Data placement; Resource usage; Intensive jobs; Prediction; Kernel Density Estimation; Fuzzy FCA; SOA; Autonomic computing
DOI
Not available
Abstract
Cloud computing environments have become a natural choice for hosting and processing huge volumes of data. Combining cloud computing with big data frameworks is an effective way to run data-intensive applications and tasks. An optimal arrangement of data partitions can also improve task execution, but most big data frameworks do not provide one. For example, the default distribution of data partitions in Hadoop-based clouds causes several problems, mainly related to load balancing and resource usage. In addition, most existing data placement solutions are static and lack precision in the placement of data partitions. To overcome these issues, we propose a data placement approach based on predicting future resource usage. We exploit Kernel Density Estimation (KDE) and Fuzzy Formal Concept Analysis (Fuzzy FCA) techniques to, first, forecast the future resource consumption of workers and tasks and, second, cluster data partitions and intensive jobs according to the estimated resource usage. Fuzzy FCA is also used to exclude partitions and jobs that require fewer resources, which reduces needless migrations. To allow monitoring and predicting the workers' states and the data partitions' consumption, we modeled the big data cluster as an autonomic service-based system. The obtained results show that our solution outperformed existing approaches in terms of migration rate and resource consumption.
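As a rough illustration of the KDE step mentioned in the abstract, the following minimal Python sketch estimates a density over past CPU-usage samples of one worker and takes the most likely usage level as a forecast. It is not the authors' implementation: the function names, the sample values, the Gaussian kernel, the normal-reference bandwidth rule, and the use of the density mode as the prediction are all assumptions made for this example.

```python
import math


def normal_reference_bandwidth(samples):
    """Rule-of-thumb bandwidth h = 1.06 * sigma * n^(-1/5) (assumed choice)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return 1.06 * math.sqrt(variance) * n ** (-0.2)


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates the Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def most_likely_usage(samples, grid_step=0.01):
    """Scan usage levels in [0, 1] and return the one with the highest density."""
    density = gaussian_kde(samples, normal_reference_bandwidth(samples))
    grid = [i * grid_step for i in range(int(round(1.0 / grid_step)) + 1)]
    return max(grid, key=density)


# Hypothetical CPU-usage history of one worker (fractions of capacity).
cpu_history = [0.62, 0.58, 0.65, 0.70, 0.61, 0.64, 0.20, 0.66]
predicted_usage = most_likely_usage(cpu_history)
print(f"most likely CPU usage: {predicted_usage:.2f}")
```

Using the mode of the density keeps the forecast close to the dominant usage regime even when the history contains an outlier such as the 0.20 sample; a production system would more likely rely on a library estimator and a tuned bandwidth.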
Pages: 3311 - 3336
Page count: 25
Related papers
50 in total
  • [1] Clustering-based data placement in cloud computing: a predictive approach
    Sellami, Mokhtar
    Mezni, Haithem
    Hacid, Mohand Said
    Gammoudi, Mohamed Moshen
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2021, 24 (04) : 3311 - 3336
  • [2] A data placement strategy based on clustering and consistent hashing algorithm in Cloud Computing
    Li, Qiang
    Wang, Kun
    Wei, Suwei
    Han, Xuefeng
    Xu, Lili
    Gao, Min
    2014 9TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND NETWORKING IN CHINA (CHINACOM), 2014 : 478 - 483
  • [3] Clustering-based approach for medical data classification
    Kodabagi, Mallikarjun M.
    Tikotikar, Ahelam
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (14)
  • [4] Clustering based virtual machines placement in distributed cloud computing
    Zhang, Jiangtao
    Wang, Xuan
    Huang, Hejiao
    Chen, Shi
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2017, 66 : 1 - 10
  • [5] Fine granularity clustering-based placement
    Hu, B
    Marek-Sadowska, M
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2004, 23 (04) : 527 - 536
  • [6] Clustering-Based Predictive Analytics to Improve Scientific Data Discovery
    Devarakonda, Ranjeet
    Kumar, Jitendra
    Prakash, Giri
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020 : 5658 - 5661
  • [7] A spectral clustering-based optimal deployment method for scientific application in cloud computing
    Fan, Pei
    Wang, Ji
    Chen, Zhenbang
    Zheng, Zibin
    Lyu, Michael R.
    INTERNATIONAL JOURNAL OF WEB AND GRID SERVICES, 2012, 8 (01) : 31 - 55
  • [8] A Genetics Clustering-based Approach for Weblog Data Cleaning
    Ganibardi, Amine
    Ali, Cherif Arab
    2018 SIXTH INTERNATIONAL CONFERENCE ON ENTERPRISE SYSTEMS (ES 2018), 2018 : 75 - 81
  • [9] Graph clustering-based discretization approach to microarray data
    Kittakorn Sriwanna
    Tossapon Boongoen
    Natthakan Iam-On
    Knowledge and Information Systems, 2019, 60 : 879 - 906
  • [10] Clustering-based and consistent hashing-aware data placement algorithm
    Chen T.
    Xiao N.
    Liu F.
    Fu C.-S.
    Ruan Jian Xue Bao/Journal of Software, 2010, 21 (12) : 3175 - 3185