Fast Big Data Analysis in Geo-Distributed Cloud

Cited by: 2
Authors
Li, Yue [1]
Zhao, Laiping [2]
Cui, Chenzhou [3]
Yu, Ce [1]
Affiliations
[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin, Peoples R China
[2] Tianjin Univ, Sch Comp Software, Tianjin, Peoples R China
[3] CAS NAOC, Natl Astron Observ, Beijing, Peoples R China
DOI
10.1109/CLUSTER.2016.28
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
As cloud services grow to span more and more globally distributed datacenters, there is an increasing need for scheduling algorithms that automatically place tasks across these datacenters. In a geo-distributed cloud, limited WAN bandwidth has become the major bottleneck for fast big data analytics. The scheduling algorithm needs to minimize the global completion time by jointly optimizing task scheduling and WAN data transfer. In this paper, we model task scheduling as a community detection problem over the dependency relations among tasks, data, and datacenters, and propose a Community Detection-based Scheduling (CDS) algorithm, which is able to minimize the WAN data transfer volume. We use the real China-Astronomy-Cloud network to evaluate the proposed algorithm. Experimental results show that it reduces the total data transfer volume by up to 40.7% and the global completion time by up to 35.8%, compared with the Hypergraph Partition-based scheduling algorithm and the greedy scheduling algorithm.
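To make the idea of scheduling as community detection more concrete, the following is a minimal sketch and not the CDS algorithm from the paper, whose details are not given in this record. It assumes a weighted dependency graph in which datasets are pinned to the datacenter that stores them, and it uses plain label propagation (a generic community detection heuristic) to pull each task toward the datacenter holding most of the data it exchanges. All names (`propagate_labels`, `wan_volume`, `dc_a`, `t1`, and so on) and the sizes are hypothetical, and datacenter capacity and bandwidth differences are ignored.

```python
from collections import defaultdict


def wan_volume(edges, placement):
    """Sum the weights of edges whose endpoints sit in different datacenters."""
    return sum(w for u, v, w in edges if placement[u] != placement[v])


def propagate_labels(edges, pinned, tasks, max_rounds=20):
    """Assign each task a datacenter label by weighted label propagation.

    edges  : list of (node, node, data_volume) dependencies
    pinned : dataset -> datacenter (fixed, since the data already lives there)
    tasks  : nodes whose datacenter is to be chosen
    This is an illustrative heuristic with no capacity constraints.
    """
    neighbors = defaultdict(list)
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    placement = dict(pinned)
    for _ in range(max_rounds):
        changed = False
        for task in sorted(tasks):  # fixed order keeps the result deterministic
            votes = defaultdict(float)
            for other, w in neighbors[task]:
                if other in placement:
                    votes[placement[other]] += w
            if not votes:
                continue
            # Heaviest label wins; ties are broken by datacenter name.
            best = min(votes, key=lambda dc: (-votes[dc], dc))
            if placement.get(task) != best:
                placement[task] = best
                changed = True
        if not changed:
            break

    # Tasks with no placed neighbor fall back to the first datacenter by name.
    fallback = sorted(set(pinned.values()))[0]
    for task in tasks:
        placement.setdefault(task, fallback)
    return placement


# Hypothetical workload: volumes in GB, two datacenters.
pinned = {"d1": "dc_a", "d2": "dc_b", "d3": "dc_b"}
tasks = ["t1", "t2", "t3"]
edges = [
    ("t1", "d1", 10), ("t1", "d2", 4),
    ("t2", "d2", 6), ("t2", "d3", 5),
    ("t3", "t1", 8), ("t3", "t2", 3),  # task-to-task intermediate data
]

placement = propagate_labels(edges, pinned, tasks)
everything_in_dc_b = {**pinned, **{t: "dc_b" for t in tasks}}
print(placement)
print(wan_volume(edges, placement), wan_volume(edges, everything_in_dc_b))
```

In this toy workload the propagated placement keeps `t1` and `t3` next to the large dataset `d1` and moves less data across the WAN than running every task in a single datacenter. A real scheduler would also need to account for compute capacity and per-link bandwidth.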
Pages: 388-391
Number of pages: 4