Anomaly Detection for Big Log Data Using a Hadoop Ecosystem

Cited by: 0
Authors
Son, Siwoon [1]
Gil, Myeong-Seon [1]
Moon, Yang-Sae [1]
Affiliations
[1] Kangwon Natl Univ, Dept Comp Sci, Chunchon, Gangwon Do, South Korea
Keywords
Anomaly Detection; Big Data; Log Data; Apache Hadoop; Apache Hive; Moving Average; 3-Sigma
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
In this paper, we propose a novel method for efficiently managing and analyzing large volumes of log data. First, we present a new Apache Hive-based data storage and analysis architecture for processing the large volume of Hadoop log data that is generated rapidly across multiple nodes. Second, we design and implement three simple but efficient anomaly detection methods, which use moving average and 3-sigma techniques to detect anomalies in log data. Finally, we show that all three methods detect abnormal intervals properly and that the weighted anomaly detection methods are more precise than the basic one. These results indicate that our approach is a simple and effective way to detect anomalies in log data on a Hadoop ecosystem.
Pages: 377-380
Page count: 4
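The abstract names moving average and 3-sigma techniques without giving details, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes the log data has already been aggregated into per-interval event counts; the function name `detect_anomalies`, the window size, and the sample data are hypothetical. The weighted variants mentioned in the abstract are not shown because the record does not describe them.

```python
from collections import deque
from math import sqrt


def detect_anomalies(counts, window=5, k=3.0):
    """Return indices of values that deviate from the moving average of the
    preceding `window` values by more than k standard deviations.

    Hypothetical helper for illustration; k=3.0 gives the 3-sigma rule.
    """
    recent = deque(maxlen=window)  # sliding window of the latest values
    anomalies = []
    for i, value in enumerate(counts):
        if len(recent) == window:
            # Moving average and population standard deviation of the window.
            mean = sum(recent) / window
            std = sqrt(sum((x - mean) ** 2 for x in recent) / window)
            if abs(value - mean) > k * std:
                anomalies.append(i)
        # The current value joins the window only after it has been checked.
        recent.append(value)
    return anomalies


# Made-up per-interval log counts with one obvious spike at index 7.
counts = [100, 102, 98, 101, 99, 100, 103, 250, 101, 99]
print(detect_anomalies(counts))
```

In this sketch a flagged value still enters the window, so a large spike inflates the standard deviation for the next few intervals; excluding flagged values from the window is one possible design alternative.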