Clustering-based data placement in cloud computing: a predictive approach

Authors
Mokhtar Sellami
Haithem Mezni
Mohand Said Hacid
Mohamed Mohsen Gammoudi
Affiliations
[1] University of Jendouba
[2] Taibah University
[3] SMART Lab
[4] ISG de Tunis
[5] Univ. Lyon
[6] University Claude Bernard Lyon 1
[7] LIRIS
[8] Higher Institute of Multimedia Arts of Manouba
[9] RIADI
Source
Cluster Computing | 2021 / Volume 24
Keywords
Data placement; Resource usage; Intensive jobs; Prediction; Kernel Density Estimation; Fuzzy FCA; SOA; Autonomic computing
DOI
Not available
Abstract
Nowadays, cloud computing environments have become a natural choice for hosting and processing huge volumes of data. Combining cloud computing with big data frameworks is an effective way to run data-intensive applications and tasks. An optimal arrangement of data partitions can also improve task execution, but most big data frameworks do not provide one. For example, the default distribution of data partitions in Hadoop-based clouds causes several problems, mainly related to load balancing and resource usage. In addition, most existing data placement solutions are static and place data partitions imprecisely. To overcome these issues, we propose a data placement approach based on predicting future resource usage. We use Kernel Density Estimation (KDE) and Fuzzy FCA techniques to, first, forecast the future resource consumption of workers and tasks and, second, cluster data partitions and intensive jobs according to the estimated resource usage. Fuzzy FCA is also used to exclude partitions and jobs that require fewer resources, which reduces needless migrations. To monitor and predict the workers' states and the data partitions' consumption, we modeled the big data cluster as an autonomic service-based system. The results show that our solution outperforms existing approaches in terms of migration rate and resource consumption.
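The prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional Gaussian kernel with a fixed bandwidth, takes the density mode as the forecast, and replaces the Fuzzy FCA clustering with a plain threshold split. The worker names, sample values, bandwidth, and threshold are all hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from samples with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def most_likely_usage(samples, bandwidth=5.0, grid_step=1.0):
    """Forecast usage as the grid point in [0, 100] with the highest density."""
    density = gaussian_kde(samples, bandwidth)
    grid = [i * grid_step for i in range(int(100 / grid_step) + 1)]
    return max(grid, key=density)


def split_by_threshold(predicted, threshold):
    """Stand-in for the clustering step (assumption: a simple cutoff, not Fuzzy FCA)."""
    heavy = {name for name, usage in predicted.items() if usage >= threshold}
    light = set(predicted) - heavy
    return heavy, light


# Hypothetical CPU usage history (percent) per worker.
cpu_history = {
    "worker-1": [82.0, 85.0, 88.0, 84.0, 86.0],
    "worker-2": [12.0, 15.0, 10.0, 14.0, 13.0],
}

predicted = {name: most_likely_usage(h) for name, h in cpu_history.items()}
heavy, light = split_by_threshold(predicted, threshold=50.0)
```

Here `heavy` holds the workers forecast to be resource-intensive and `light` holds those that a placement strategy could leave out of migration decisions, mirroring the exclusion of low-consumption partitions and jobs mentioned in the abstract.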
Pages: 3311 - 3336
Page count: 25