A task distribution algorithm for energy consumption optimization of MapReduce system

Authors
Song J. [1]
Xu S. [1]
Guo C.-P. [1]
Bao Y.-B. [2]
Yu G. [2]
Affiliations
[1] Software College, Northeastern University, Shenyang
[2] School of Information Science and Engineering, Northeastern University, Shenyang
Funding
National Natural Science Foundation of China;
Keywords
Big data; Cloud computing; Energy consumption; Energy consumption optimization; MapReduce; Parallelism; Task distribution;
DOI
10.11897/SP.J.1016.2016.00323
Abstract
Since MapReduce was proposed as a typical distributed computing model, it has attracted wide attention and been rapidly applied to big data processing. However, its energy consumption can still be optimized. In a distributed parallel computing system, task parallelism is the key to performance, and an approach that ensures parallelism should consider energy consumption as well as time consumption. The traditional Map task distribution algorithm uses a “fine-grained task distribution strategy” to improve parallelism, but this wastes energy, and the Reduce task distribution algorithm cannot guarantee parallelism among Reduce tasks. In this paper, we optimize MapReduce by dynamically adjusting the sizes of Map tasks and Reduce tasks, which saves energy consumed by the MapReduce system. A series of experiments shows that the proposed algorithm is effective in reducing energy consumption. © 2016, Science Press. All rights reserved.
Pages: 323-338
Page count: 15
Related papers
30 in total
  • [1] Huang S., Wang B.-T., Wang G.-R., et al., A survey on MapReduce optimization technologies, Journal of Frontiers of Computer Science and Technology, 7, 10, pp. 865-885, (2013)
  • [2] Liu Y., Jing N., Chen L., et al., Algorithm for processing k-nearest join based on R-tree in MapReduce, Journal of Software, 24, 8, pp. 1836-1851, (2013)
  • [3] Bu Y., Howe B., Balazinska M., et al., HaLoop: Efficient iterative data processing on large clusters, Proceedings of the VLDB Endowment, 3, 1-2, pp. 285-296, (2010)
  • [4] Koomey J., Growth in Data Center Electricity Use 2005 to 2010, (2011)
  • [5] Bianchini R., Rajamony R., Power and energy management for server systems, IEEE Computer, 37, 11, pp. 68-76, (2004)
  • [6] Zhang T.-T., Wu X., Li C.-D., Dong Y.-W., On energy-consumption analysis and evaluation for component-based embedded system with CSP, Chinese Journal of Computers, 32, 9, pp. 1876-1883, (2009)
  • [7] Chen L.-Q., Shao Z.-Q., Fan G.-S., Energy consumption modeling and analysis for distributed real-time and embedded systems, Journal of East China University of Science and Technology (Natural Science Edition), 35, 2, pp. 250-255, (2009)
  • [8] Song J., Li T.-T., Zhu Z.-L., et al., Benchmarking and analyzing the energy consumption of cloud data management system, Chinese Journal of Computers, 36, 7, pp. 1485-1499, (2013)
  • [9] Blanas S., Patel J.M., et al., A comparison of join algorithms for log processing in MapReduce, Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 975-986, (2010)
  • [10] Isard M., Prabhakaran V., Currey J., et al., Quincy: Fair scheduling for distributed computing clusters, Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, pp. 176-261, (2009)