Fast Big Data Analysis in Geo-Distributed Cloud

Cited by: 2
Authors
Li, Yue [1]
Zhao, Laiping [2]
Cui, Chenzhou [3]
Yu, Ce [1]
Affiliations
[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin, Peoples R China
[2] Tianjin Univ, Sch Comp Software, Tianjin, Peoples R China
[3] CAS NAOC, Natl Astron Observ, Beijing, Peoples R China
DOI
10.1109/CLUSTER.2016.28
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As cloud services grow to span more and more globally distributed datacenters, there is an increasing need for scheduling algorithms that automatically place tasks across these datacenters. In a geo-distributed cloud, limited WAN bandwidth has become the major bottleneck for fast big data analytics. The scheduling algorithm needs to minimize the global completion time by jointly optimizing task scheduling and WAN data transfer. In this paper, we model task scheduling as a community detection problem over the dependency relations among tasks, data, and datacenters, and propose a Community Detection-based Scheduling (CDS) algorithm that minimizes the WAN data transfer volume. We evaluate the proposed algorithm on the real China-Astronomy-Cloud network. Experimental results show that CDS reduces the total data transfer volume by up to 40.7% and the global completion time by up to 35.8%, compared with the Hypergraph Partition-based scheduling algorithm and the greedy scheduling algorithm.
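The abstract does not give the details of CDS, so the following is only a minimal sketch of the general idea it describes: treat tasks and data as nodes of a weighted dependency graph, group them into communities anchored at datacenters, and count the edge weight that crosses communities as WAN traffic. It uses seeded label propagation, a generic community detection heuristic swapped in for illustration, not the authors' algorithm. All names (`seeded_label_propagation`, `wan_volume`, `t1`, `d1`, `dc_A`) and all weights are hypothetical.

```python
from collections import defaultdict


def seeded_label_propagation(edges, seeds, max_iters=20):
    """Group nodes into communities by weighted label propagation.

    edges: iterable of (u, v, weight) undirected dependency edges
           (assumed here to be data volumes in GB).
    seeds: dict mapping a node to a fixed label; in this sketch, each data
           item is pinned to the datacenter that stores it.
    Returns a dict mapping every reachable node to a label (datacenter).
    """
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0) + w
        adj[v][u] = adj[v].get(u, 0) + w

    labels = dict(seeds)
    for _ in range(max_iters):
        changed = False
        for node in sorted(adj):  # fixed visiting order keeps runs deterministic
            if node in seeds:
                continue  # pinned nodes never move
            score = defaultdict(float)
            for nbr, w in adj[node].items():
                if nbr in labels:
                    score[labels[nbr]] += w
            if not score:
                continue  # no labeled neighbor yet
            # Pick the label with the largest total weight; ties go to the
            # alphabetically smallest label.
            best = max(sorted(score), key=score.get)
            if labels.get(node) != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def wan_volume(edges, labels):
    """Total weight of edges whose endpoints sit in different datacenters."""
    return sum(w for u, v, w in edges if labels[u] != labels[v])


# Hypothetical workload: three tasks, three data items, two datacenters.
edges = [
    ("t1", "d1", 10), ("t1", "d2", 2),   # t1 reads d1 and d2
    ("t2", "d2", 8), ("t2", "d3", 5),    # t2 reads d2 and d3
    ("t3", "t1", 4), ("t3", "t2", 1),    # t3 consumes outputs of t1 and t2
]
seeds = {"d1": "dc_A", "d2": "dc_B", "d3": "dc_B"}

placement = seeded_label_propagation(edges, seeds)
tasks = {n: dc for n, dc in placement.items() if n.startswith("t")}
print(tasks)
print(wan_volume(edges, placement))
```

In this toy instance each task ends up in the datacenter holding most of the data it depends on, and only the edges that cross datacenters contribute to the WAN volume. A real scheduler would also have to account for datacenter capacity and link bandwidth, which this sketch ignores.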
Pages: 388-391
Page count: 4