Anomaly Detection for Big Log Data Using a Hadoop Ecosystem

Authors
Son, Siwoon [1]
Gil, Myeong-Seon [1]
Moon, Yang-Sae [1]
Affiliations
[1] Kangwon Natl Univ, Dept Comp Sci, Chunchon, Gangwon Do, South Korea
Source
2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP) | 2017
Keywords
Anomaly Detection; Big Data; Log Data; Apache Hadoop; Apache Hive; Moving Average; 3-Sigma
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In this paper, we propose a method to efficiently manage and analyze a large amount of log data. First, we present a new Apache Hive-based data storage and analysis architecture for processing the large volume of Hadoop log data that is generated rapidly across multiple nodes. Second, we design and implement three simple but efficient anomaly detection methods, which use moving average and 3-sigma techniques to detect anomalies in log data. Finally, we show that all three methods detect abnormal intervals properly, and that the weighted anomaly detection methods are more precise than the basic one. These results indicate that our approach is a simple and effective way to detect anomalies in log data on a Hadoop ecosystem.
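The abstract names the two building blocks (moving average and 3-sigma) but does not give the exact formulas, so the following is only a minimal sketch of how such a detector might look, not the authors' implementation. It assumes the log data has already been aggregated into per-interval counts; the function name `detect_anomalies`, the `window` and `k` parameters, and the linear weighting scheme in the usage note are all hypothetical choices made for illustration.

```python
import math


def detect_anomalies(counts, window=5, k=3.0, weights=None):
    """Return indices whose value lies more than k sigma from the
    (optionally weighted) moving average of the preceding `window` values.

    All names and defaults here are illustrative assumptions; the paper's
    abstract does not specify them.
    """
    if weights is None:
        # Basic variant: every point in the window counts equally.
        weights = [1.0] * window
    if len(weights) != window:
        raise ValueError("weights must have exactly `window` entries")

    total_weight = sum(weights)
    anomalies = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        # Weighted moving average of the preceding window.
        avg = sum(w * x for w, x in zip(weights, history)) / total_weight
        # Weighted standard deviation around that average.
        var = sum(w * (x - avg) ** 2 for w, x in zip(weights, history)) / total_weight
        sigma = math.sqrt(var)
        # k-sigma rule: flag values outside avg +/- k * sigma.
        if abs(counts[i] - avg) > k * sigma:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Made-up per-interval log counts with one obvious spike.
    counts = [100, 102, 98, 101, 99, 100, 500, 101, 99, 100]
    print(detect_anomalies(counts))
    # A weighted variant that emphasises recent intervals (assumed scheme).
    print(detect_anomalies(counts, weights=[1, 2, 3, 4, 5]))
```

Passing `weights` gives more influence to recent intervals, which is one plausible reading of the "weighted" methods mentioned in the abstract; the paper itself may weight differently.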
Pages: 377-380
Number of pages: 4