An Enhanced Data-Locality-Aware Task Scheduling Algorithm for Hadoop Applications

Cited by: 13
Authors
Choi, Dongjoo [1]
Jeon, Myunghoon [1]
Kim, Namgi [1]
Lee, Byoung-Dai [1]
Affiliations
[1] Kyonggi Univ, Comp Sci Dept, Suwon 443760, South Korea
Source
IEEE SYSTEMS JOURNAL | 2018 / Vol. 12 / No. 04
Funding
National Research Foundation, Singapore;
Keywords
Data locality; Hadoop distributed file system (HDFS); MapReduce; task scheduling;
DOI
10.1109/JSYST.2017.2764481
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In general, Hadoop improves the task scheduling performance by determining data locality based on the location in which the input splits and MapTask are executed. However, if an input split consists of multiple data blocks that are distributed and stored in different nodes, this data location method fails to cope with the degradation in processing performance due to the increased frequency of data block copying. We propose a task scheduling algorithm that solves this issue by defining a method to classify data locality taking into account the location of all data blocks that comprise an input split, categorizing tasks based on the defined method, and sequentially assigning tasks according to a given priority. This study measures the performance of the proposed algorithm through a comparison of the total processing time, MapTask performance time, and data block copying frequency between the proposed algorithm and Hadoop's default task scheduling algorithm. The test results show that the proposed algorithm improved the total processing time by up to 25% and the data block copying frequency by up to 28%, when compared to the default algorithm.
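The idea in the abstract (classify each task by where all blocks of its input split are stored, then assign tasks by priority) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the paper's actual locality categories or priority rules, so the three classes, the tie-breaker, and the names `InputSplit`, `locality_class`, and `pick_task` are hypothetical and are not Hadoop APIs.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class InputSplit:
    """Hypothetical model: one entry per data block, holding the nodes that store a replica."""
    name: str
    block_locations: List[Set[str]]


def remote_block_count(split: InputSplit, node: str) -> int:
    """Number of blocks that would have to be copied to `node` to process the split."""
    return sum(1 for replicas in split.block_locations if node not in replicas)


def locality_class(split: InputSplit, node: str) -> int:
    """Assumed three-level classification (lower is better).

    0: every block of the split has a replica on `node`
    1: some, but not all, blocks have a replica on `node`
    2: no block has a replica on `node`
    """
    remote = remote_block_count(split, node)
    if remote == 0:
        return 0
    if remote < len(split.block_locations):
        return 1
    return 2


def pick_task(pending: List[InputSplit], node: str) -> Optional[InputSplit]:
    """Choose the pending split with the best locality class for the requesting node.

    Ties are broken by fewer remote blocks, then by queue order (min() keeps the first).
    """
    if not pending:
        return None
    return min(pending, key=lambda s: (locality_class(s, node), remote_block_count(s, node)))


if __name__ == "__main__":
    queue = [
        InputSplit("s1", [{"n1", "n2"}, {"n3"}]),
        InputSplit("s2", [{"n1"}, {"n1", "n3"}]),
        InputSplit("s3", [{"n2"}, {"n3"}]),
    ]
    chosen = pick_task(queue, "n1")
    print(chosen.name, remote_block_count(chosen, "n1"))
```

In this sketch, a scheduler that looked only at one block of a split could treat `s1` as local to `n1`, although one of its two blocks would still have to be copied. Counting every block is what separates `s1` from `s2` here.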
Pages: 3346 - 3357
Number of pages: 12
Related papers
50 in total
  • [1] An improved task scheduling algorithm based on cache locality and data locality in Hadoop
    Zhang, Peng
    Li, Chunlin
    Zhao, Yahui
    2016 17TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES (PDCAT), 2016, : 244 - 249
  • [2] Data-locality-aware mapreduce real-time scheduling framework
    Kao, Yu-Chon
    Chen, Ya-Shu
    JOURNAL OF SYSTEMS AND SOFTWARE, 2016, 112 : 65 - 77
  • [3] A data-locality-aware task scheduler for distributed social graph queries
    Jin, Jiahui
    Luo, Junzhou
    Du, Mingyang
    Dang, Yongcheng
    Li, Feng
    Zhang, Jinghui
    Song, Aibo
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 93 : 1010 - 1022
  • [4] Shareability and locality aware scheduling algorithm in Hadoop for mobile cloud computing
    Wei, Hsin-Wen
    Wu, Tin-Yu
    Lee, Wei-Tsong
    Hsu, Che-Wei
    Journal of Information Hiding and Multimedia Signal Processing, 2015, 6 (06): : 1215 - 1230
  • [5] Locality and Network-Aware Reduce Task Scheduling for Data-Intensive Applications
    Arslan, Engin
    Shekhar, Mrigank
    Kosar, Tevfik
    2014 5TH INTERNATIONAL WORKSHOP ON DATA-INTENSIVE COMPUTING IN THE CLOUDS (DATACLOUD), 2014, : 17 - 24
  • [6] Locality Aware Task Scheduling in Parallel Data Stream Processing
    Falt, Zbynek
    Krulis, Martin
    Bednarek, David
    Yaghob, Jakub
    Zavoral, Filip
    INTELLIGENT DISTRIBUTED COMPUTING VIII, 2015, 570 : 331 - 342
  • [7] RTSBL: Reduce Task Scheduling Based on the Load Balancing and the Data Locality in Hadoop
    Midoun, Khadidja
    Hidouci, Walid-Khaled
    Loudini, Malik
    Belayadi, Djahida
    ADVANCES IN COMPUTING SYSTEMS AND APPLICATIONS, 2019, 50 : 271 - 280
  • [8] LaSA: A Locality-aware Scheduling Algorithm for Hadoop-MapReduce Resource Assignment
    Chen, Tseng-Yi
    Wei, Hsin-Wen
    Wei, Ming-Feng
    Chen, Ying-Jie
    Hsu, Tsan-Sheng
    Shih, Wei-Kuan
    PROCEEDINGS OF THE 2013 INTERNATIONAL CONFERENCE ON COLLABORATION TECHNOLOGIES AND SYSTEMS (CTS), 2013, : 342 - 346
  • [9] An Optimal Locality-Aware Task Scheduling Algorithm Based on Bipartite Graph Modelling for Spark Applications
    Fu, Zhongming
    Tang, Zhuo
    Yang, Li
    Liu, Chubo
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (10) : 2406 - 2420
  • [10] Data-Locality-Aware User Grouping in Cloud Radio Access Networks
    Ao, Weng Chon
    Psounis, Konstantinos
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2018, 17 (11) : 7295 - 7308