Job failure prediction in Hadoop based on log file analysis

Cited by: 1
Authors
Shirzad E. [1]
Saadatfar H. [1, 2]
Affiliations
[1] Faculty of Electrical and Computer Engineering, University of Birjand, Birjand
[2] Department of Computer Engineering, Faculty of Electrical and Computer Engineering, University of Birjand, Birjand
Keywords
cluster workload; data mining; failure prediction; Hadoop; log file; MapReduce job
DOI
10.1080/1206212X.2020.1732081
Abstract
Hadoop is a popular framework based on the MapReduce programming model that allows distributed processing of large datasets across clusters with varying numbers of computer nodes. Like any dynamic computational environment, Hadoop has its problems, one of which is the unsuccessful execution of MapReduce jobs. Job failures can cause significant resource waste, performance deterioration, and user dissatisfaction. Therefore, a proactive and predictive management approach could be very useful in Hadoop systems. In this paper, we try to predict the outcome of MapReduce jobs in the OpenCloud Hadoop cluster by using its log files. OpenCloud is a research cluster managed by CMU’s Parallel Data Lab that uses Hadoop to process big data. We first study the log files and analyze the relationship between failures and the job, resource, and workload characteristics in order to discover effective features for the prediction process. After recognizing the job failure patterns, we apply several popular machine learning algorithms to predict the success/failure status of jobs before they start to execute. Finally, we compare the learning methods and show that the C5.0 algorithm gives the best results, with an accuracy of 91.37%, a recall of 74.43%, and a precision of 80.31%. © 2020 Informa UK Limited, trading as Taylor & Francis Group.
Pages: 260-269
Number of pages: 9
Related papers
50 items in total
  • [41] Research of Cloud Storage Based on Hadoop Distributed File System
    Han, Yongqi
    Zhang, Yun
    Yu, Shui
    APPLIED SCIENCE, MATERIALS SCIENCE AND INFORMATION TECHNOLOGIES IN INDUSTRY, 2014, 513-517 : 2472 - 2475
  • [42] Job Failure Prediction in Grid Environment Based on Workload Characteristics
    Fadishei, Hamid
    Saadatfar, Hamid
    Deldari, Hossein
    2009 14TH INTERNATIONAL COMPUTER CONFERENCE, 2009, : 328 - 333
  • [43] LOG FILE ANALYSIS WITH CONTEXT-FREE GRAMMARS
    Bosman, Gregory
    Gruner, Stefan
    ADVANCES IN DIGITAL FORENSICS IX, 2013, 410 : 145 - 152
  • [44] Network Traffic Prediction Based on Hadoop
    Cui, Hongyan
    Yao, Yuan
    Zhang, Kuo
    Sun, Fangfang
    Liu, Yunjie
    2014 INTERNATIONAL SYMPOSIUM ON WIRELESS PERSONAL MULTIMEDIA COMMUNICATIONS (WPMC), 2014, : 29 - 33
  • [45] General test result checking with log file analysis
    Andrews, JH
    Zhang, YJ
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2003, 29 (07) : 634 - 648
  • [46] The use of log file analysis within VMAT audits
    McGarry, Conor K.
    Agnew, Christina E.
    Hussein, Mohammad
    Tsang, Yatman
    Hounsell, Alan R.
    Clark, Catharine H.
    BRITISH JOURNAL OF RADIOLOGY, 2016, 89 (1062):
  • [47] Complexity Based Test Cases for Log File Analyzers
    Heikkinen, Esa
    Hamalainen, Timo D.
    2017 IEEE 15TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), 2017, : 1007 - 1012
  • [48] An Efficient Log File Analysis Algorithm Using Binary-Based Data Structure
    Fageeri, Sallam Osman
    Ahmad, Rohiza
    2ND INTERNATIONAL CONFERENCE ON INNOVATION, MANAGEMENT AND TECHNOLOGY RESEARCH, 2014, 129 : 518 - 526
  • [49] Model-based failure analysis of journaling file systems
    Prabhakaran, V
    Arpaci-Dusseau, AC
    Arpaci-Dusseau, RH
    2005 INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, PROCEEDINGS, 2005, : 802 - 811
  • [50] Virtual log based file systems for a programmable disk
    Wang, RY
    Anderson, TE
    Patterson, DA
    USENIX ASSOCIATION PROCEEDINGS OF THE THIRD SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '99), 1999, : 29 - 43