A task distribution algorithm for energy consumption optimization of MapReduce system

Cited by: 0
Authors
Song J. [1]
Xu S. [1]
Guo C.-P. [1]
Bao Y.-B. [2]
Yu G. [2]
Affiliations
[1] Software College, Northeastern University, Shenyang
[2] School of Information Science and Engineering, Northeastern University, Shenyang
Source
Funding
National Natural Science Foundation of China
Keywords
Big data; Cloud computing; Energy consumption; Energy consumption optimization; MapReduce; Parallelism; Task distribution;
DOI
10.11897/SP.J.1016.2016.00323
CLC number
Subject classification number
Abstract
Since MapReduce was proposed as a typical distributed computing model, it has had a major impact and has been rapidly applied to big data processing. However, its energy consumption can still be optimized. In a distributed parallel computing system, task parallelism is the key to performance, and an approach for ensuring parallelism should consider energy consumption as well as time consumption. The traditional Map task distribution algorithm uses a “fine-grained task distribution strategy” to improve parallelism, but this wastes energy, and the Reduce task distribution algorithm cannot guarantee parallelism among Reduce tasks. In this paper, we optimize MapReduce by dynamically adjusting the sizes of Map tasks and Reduce tasks, which saves energy consumed by the MapReduce system. A series of experiments shows that the proposed algorithm is effective in reducing energy consumption. © 2016, Science Press. All rights reserved.
Pages: 323-338
Page count: 15