Fast Big Data Analysis in Geo-Distributed Cloud

Cited by: 2
Authors
Li, Yue [1]
Zhao, Laiping [2]
Cui, Chenzhou [3]
Yu, Ce [1]
Affiliations
[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin, Peoples R China
[2] Tianjin Univ, Sch Comp Software, Tianjin, Peoples R China
[3] CAS NAOC, Natl Astron Observ, Beijing, Peoples R China
DOI
10.1109/CLUSTER.2016.28
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
As cloud services grow to span more and more globally distributed datacenters, there is an increasing need for scheduling algorithms that automatically place tasks across these datacenters. In a geo-distributed cloud, limited WAN bandwidth has become the major bottleneck for fast big data analytics. The scheduling algorithm needs to minimize the global completion time by jointly optimizing task scheduling and WAN data transfer. In this paper, we model task scheduling as a community detection problem over the dependency relations among tasks, data, and datacenters, and propose a Community Detection-based Scheduling (CDS) algorithm that is able to minimize the WAN data transfer volume. We use the real China-Astronomy-Cloud network to evaluate the proposed algorithm. Experimental results show that CDS reduces the total data transfer volume by up to 40.7% and the global completion time by up to 35.8%, compared with the Hypergraph Partition-based scheduling algorithm and the greedy scheduling algorithm.
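To make the optimization target in the abstract concrete, the following minimal Python sketch computes the WAN data transfer volume of a task placement and compares two placements. It is an illustration only and is not the CDS algorithm from the paper: the function names, the toy data, the size units, and the simple locality-based greedy rule are all assumptions introduced here, and the model assumes every task has at least one input and that remote inputs are fetched once per task with no caching.

```python
from collections import defaultdict


def wan_transfer_volume(placement, task_inputs, data_location, data_size):
    """Sum the sizes of all inputs that a task must fetch from another datacenter.

    Assumption (not from the paper): a remote input is transferred once per
    task that reads it, with no caching or sharing between tasks.
    """
    total = 0
    for task, datacenter in placement.items():
        for item in task_inputs[task]:
            if data_location[item] != datacenter:
                total += data_size[item]
    return total


def greedy_placement(task_inputs, data_location, data_size):
    """Hypothetical baseline: run each task where most of its input bytes live.

    Assumes every task has at least one input item.
    """
    placement = {}
    for task, items in task_inputs.items():
        local_bytes = defaultdict(int)
        for item in items:
            local_bytes[data_location[item]] += data_size[item]
        # Sorting first makes ties resolve to the alphabetically first datacenter.
        placement[task] = max(sorted(local_bytes), key=local_bytes.get)
    return placement


# Toy instance (hypothetical): two datacenters "A" and "B", sizes in arbitrary units.
data_location = {"d1": "A", "d2": "A", "d3": "B"}
data_size = {"d1": 10, "d2": 5, "d3": 20}
task_inputs = {"t1": ["d1", "d2"], "t2": ["d2", "d3"], "t3": ["d1", "d3"]}

all_in_a = {task: "A" for task in task_inputs}
locality_aware = greedy_placement(task_inputs, data_location, data_size)

print(wan_transfer_volume(all_in_a, task_inputs, data_location, data_size))        # 40
print(wan_transfer_volume(locality_aware, task_inputs, data_location, data_size))  # 15
```

In this toy instance, placing tasks near their largest inputs cuts the cross-datacenter volume from 40 to 15 units. A community detection approach such as the one the abstract describes would instead group tasks, data, and datacenters jointly rather than deciding each task in isolation.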
Pages: 388-391
Number of pages: 4