Performance Challenges and Solutions in Big Data Platform Hadoop

Cited by: 0
Authors
Singh B. [1, 2]
Verma H.K. [1 ]
Madaan V. [2 ]
Affiliations
[1] Department of Computer Science and Engineering, Dr. B.R. Ambedkar NIT, Jalandhar
[2] School of Computer Science and Engineering, Lovely Professional University, Phagwara
Keywords
big data; Hadoop; load balancing; performance; scheduling; skew
DOI
10.2174/2666255816666230608165146
Abstract
Background: The present era demands continuous improvement in executing complex analytics on large-scale data and in working beyond traditional systems. Objective: The need to process diverse data types and to provide solutions for different industry domains is rising. Such needs increase the requirement for sophisticated techniques and methods to enhance existing platforms and mechanisms. This gives the research community an opportunity to investigate existing systems further, find potential issues, and propose new ways to improve them. Hadoop is a popular choice for managing and processing big data. It is an open-source platform and a front-runner in the batch processing of large-scale jobs. The cost of scaling a cluster is low compared with other platforms. However, this popularity by no means guarantees high performance in all scenarios. With the continuous evolution of data and of industrial requirements, it is imperative to investigate new methods and techniques that advance the existing system. Method: This paper presents a systematic review to give insight into the current progress in this field. Research publications from various sources are collected and analyzed. The performance of a cluster largely depends on the job processing mechanisms and policies associated with it. Conclusion: Although extensive studies and solutions have been proposed, performance bottlenecks in load balancing, resource utilization, content management, and efficient processing persist. Few scheduling solutions address the trade-off between different parameters; the process of content splitting and merging has not been explored to a large extent; and skew mitigation solutions focus mainly on the Reduce side of MapReduce, while the Map side is little used for load balancing. © 2023 Bentham Science Publishers.
Related papers
50 in total
  • [22] Optimizing Hadoop Performance for Big Data Analytics in Smart Grid
    Khan, Mukhtaj
    Huang, Zhengwen
    Li, Maozhen
    Taylor, Gareth A.
    Ashton, Phillip M.
    Khan, Mushtaq
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2017, 2017
  • [23] An Approach to Enhance the Performance of Hadoop MapReduce Framework for Big Data
    Chandra, Subhash
    Motwani, Deepak
    2016 INTERNATIONAL CONFERENCE ON MICRO-ELECTRONICS AND TELECOMMUNICATION ENGINEERING (ICMETE), 2016, : 178 - 182
  • [24] Research on adaptive recommendation algorithm for big data mining based on Hadoop platform
    Zhang, Jinming
    INTERNATIONAL JOURNAL OF INTERNET PROTOCOL TECHNOLOGY, 2019, 12 (04) : 213 - 220
  • [25] EMM: Extended matching market based scheduling for big data platform hadoop
    Balraj Singh
    Harsh K Verma
    Multimedia Tools and Applications, 2022, 81 : 34823 - 34847
  • [26] Big data analysis based on hadoop cluster and spark cluster on linux platform
    Liu, Kangxu
    Li, Guangming
    Journal of Computers (Taiwan), 2020, 31 (02) : 127 - 140
  • [27] EMM: Extended matching market based scheduling for big data platform hadoop
    Singh, Balraj
    Verma, Harsh K.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (24) : 34823 - 34847
  • [28] Novel Application of DaaS and Hadoop Technology in Big Data Cloud Computing Platform
    Xu, Hongsheng
    Fan, Ganglong
    Li, Ke
    PROCEEDINGS OF THE 2017 7TH INTERNATIONAL CONFERENCE ON MECHATRONICS, COMPUTER AND EDUCATION INFORMATIONIZATION (MCEI 2017), 2017, 75 : 373 - 377
  • [29] Big Data Analysis and Visualization: Challenges and Solutions
    Yoo, Kwan-Hee
    Leung, Carson K.
    Nasridinov, Aziz
    APPLIED SCIENCES-BASEL, 2022, 12 (16):
  • [30] Big Data Security Challenges and Preventive Solutions
    Gupta, Nirmal Kumar
    Rohil, Mukesh Kumar
    DATA MANAGEMENT, ANALYTICS AND INNOVATION, ICDMAI 2019, VOL 1, 2020, 1042 : 285 - 299