Job schedulers for Big data processing in Hadoop environment: testing real-life schedulers using benchmark programs

Cited by: 0

Authors:

Mohd Usama

Mengchen Liu

Min Chen

Affiliation:

[1] Embedded and Pervasive Computing (EPIC) Lab, School of Computer Science and Technology, Huazhong University of Science and Technology

Source:

Digital Communications and Networks | 2017 / Vol. 3 / Issue 04

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP311.13

Discipline classification code:

1201

Abstract:

At present, big data is very popular because it has proved highly successful in many fields, such as social media and e-commerce transactions. Big data describes the tools and technologies needed to capture, manage, store, distribute, and analyze petabyte-scale or larger datasets with differing structures at high speed. Big data can be structured, unstructured, or semi-structured. Hadoop is an open-source framework used to process large amounts of data inexpensively and efficiently, and job scheduling is a key factor in achieving high performance in big data processing. This paper gives an overview of big data and highlights its problems and challenges. It then describes the Hadoop Distributed File System (HDFS), Hadoop MapReduce, and various parameters that affect the performance of job scheduling algorithms in big data, such as the JobTracker, TaskTracker, NameNode, and DataNode. The primary purpose of this paper is to present a comparative study of job scheduling algorithms, along with their experimental results, in a Hadoop environment. In addition, this paper describes the features, advantages, and drawbacks of various Hadoop job schedulers, such as FIFO, Fair, Capacity, Deadline Constraints, Delay, LATE, and Resource Aware, and provides a comparative study of these schedulers.


Pages: 260 - 273

Page count: 14

13 items in total

[1] Job schedulers for Big data processing in Hadoop environment: testing real-life schedulers using benchmark programs
Usama, Mohd
Liu, Mengchen
Chen, Min
DIGITAL COMMUNICATIONS AND NETWORKS, 2017, 3 (04) : 260 - 273
[2] Statistical analysis of multi job processing in Hadoop environment using schedulers
Prasad, M. S. Guru
Singh, Prabhdeep
Taneja, Harsh
Jain, Amith K.
Chandrappa, S.
JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2022, 43 (03) : 497 - 504
[3] Performance Analysis of Hadoop YARN Job Schedulers in a Multi-Tenant Environment on HiBench Benchmark Suite
Bawankule, Kamalakant Laxman
Dewang, Rupesh Kumar
Singh, Anil Kumar
INTERNATIONAL JOURNAL OF DISTRIBUTED SYSTEMS AND TECHNOLOGIES, 2021, 12 (03) : 64 - 82
[4] Processing Real World Datasets using Big Data Hadoop Tools
Deshai, N.
Sekhar, B. V. D. S.
Reddy, P. V. G. D. Prasad
Chakravarthy, V. V. S. S. S.
JOURNAL OF SCIENTIFIC & INDUSTRIAL RESEARCH, 2020, 79 (07) : 631 - 635
[5] Real-life insights on menstrual cycles and ovulation using big data
Soumpasis, I
Grace, B.
Johnson, S.
HUMAN REPRODUCTION OPEN, 2020, 2020 (02)
[6] Real-Time Big Data Stream Processing Using GPU with Spark Over Hadoop Ecosystem
Rathore, M. Mazhar
Son, Hojae
Ahmad, Awais
Paul, Anand
Jeon, Gwanggil
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2018, 46 (03) : 630 - 646
[7] Real-Time Big Data Stream Processing Using GPU with Spark Over Hadoop Ecosystem
M. Mazhar Rathore
Hojae Son
Awais Ahmad
Anand Paul
Gwanggil Jeon
International Journal of Parallel Programming, 2018, 46 : 630 - 646
[8] Real-time data processing scheme using big data analytics in internet of things based smart transportation environment
Babar, Muhammad
Arif, Fahim
JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2019, 10 (10) : 4167 - 4177
[9] Real-time data processing scheme using big data analytics in internet of things based smart transportation environment
Muhammad Babar
Fahim Arif
Journal of Ambient Intelligence and Humanized Computing, 2019, 10 : 4167 - 4177
[10] Neighborhood search-based job scheduling for IoT big data real-time processing in distributed edge-cloud computing environment
Chunlin Li
YiHan Zhang
Youlong Luo
The Journal of Supercomputing, 2021, 77 : 1853 - 1878
