Semantic Hierarchical Classification Applied to Anomaly Detection Using System Logs with a BERT Model

Cited by: 0
Authors
Corbelle, Clara [1]
Carneiro, Victor [1]
Cacheda, Fidel [1]
Affiliations
[1] Univ A Coruna, Dept Comp Sci & Informat Technol, La Coruna 15071, Spain
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 13
Keywords
system logs; anomaly detection; BERT model; hierarchical codes; semantic similarity
DOI
10.3390/app14135388
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
The compaction and structuring of system logs facilitate and expedite anomaly and cyberattack detection using machine-learning techniques, while also reducing the alert fatigue caused by false positives. In this work, we implemented an innovative algorithm that employs hierarchical codes based on the semantics of natural language, enabling the generation of a significantly reduced log that preserves the semantics of the original. This method uses codes that reflect the specificity of the topic and its position within a higher hierarchical structure. By applying this catalog to the analysis of logs from the Hadoop Distributed File System (HDFS), we achieved a concise summary with non-repetitive themes, significantly speeding up log analysis and substantially reducing log size while maintaining high semantic similarity. The resulting log has been validated for anomaly detection using the "bert-base-uncased" model and compared with six other methods: PCA, IM, LogCluster, SVM, DeepLog, and LogRobust. The reduced log achieved very similar precision, recall, and F1-score values while drastically reducing processing time.
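The abstract compares methods by precision, recall, and F1-score. As a reminder of how these three metrics follow from confusion counts, here is a minimal Python sketch. The function name `precision_recall_f1` and the example counts are hypothetical and are not taken from the paper; the sketch only illustrates the standard definitions, not the authors' evaluation code.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion counts.

    tp: anomalies correctly flagged
    fp: normal entries wrongly flagged (the source of alert fatigue)
    fn: anomalies that were missed
    Each metric falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only; not results from the paper.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

F1 is the harmonic mean of precision and recall, so it stays low unless both are high, which is why it is commonly reported alongside them for imbalanced tasks such as anomaly detection.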
Pages: 15
Related papers
50 in total
  • [1] BERT-Log: Anomaly Detection for System Logs Based on Pre-trained Language Model
    Chen, Song
    Liao, Hai
    APPLIED ARTIFICIAL INTELLIGENCE, 2022, 36 (01)
  • [2] Anomaly Based Intrusion Detection System Using Hierarchical Classification and Clustering Techniques
    Bahjat, Hala
    Mohammed, Suhaila N.
    Ahmed, Wafaa
    Hamad, Sumaya
    Mohammed, Shayma
    2020 13TH INTERNATIONAL CONFERENCE ON DEVELOPMENTS IN ESYSTEMS ENGINEERING (DESE 2020), 2020, : 257 - 262
  • [3] Anomaly Intrusion Detection System Using Hierarchical Gaussian Mixture Model
    Bahrololum, M.
    Khaleghi, M.
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2008, 8 (08): : 264 - 271
  • [4] Anomaly Detection Using System Logs: A Deep Learning Approach
    Sinha, Rohit
    Sur, Rittika
    Sharma, Ruchi
    Shrivastava, Avinash K.
    INTERNATIONAL JOURNAL OF INFORMATION SECURITY AND PRIVACY, 2022, 16 (01)
  • [5] A Survey of Deep Anomaly Detection for System Logs
    Zhao, Xiaoqing
    Jiang, Zhongyuan
    Ma, Jianfeng
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [6] System anomaly detection: Mining firewall logs
    Winding, Robert
    Wright, Timothy
    Chapple, Michael
    2006 SECURECOMM AND WORKSHOPS, 2006, : 389 - +
  • [7] LAnoBERT: System log anomaly detection based on BERT masked language model
    Lee, Yukyung
    Kim, Jina
    Kang, Pilsung
    APPLIED SOFT COMPUTING, 2023, 146
  • [8] xSemAD: Explainable Semantic Anomaly Detection in Event Logs Using Sequence-to-Sequence Models
    Busch, Kiran
    Kampik, Timotheus
    Leopold, Henrik
    BUSINESS PROCESS MANAGEMENT, BPM 2024, 2024, 14940 : 309 - 327
  • [9] Anomaly Detection in Logs Using Deep Learning
    Aziz, Ayesha
    Munir, Kashif
    IEEE ACCESS, 2024, 12 : 176124 - 176135
  • [10] Valid Probabilistic Anomaly Detection Models for System Logs
    Liu, Chunbo
    Pan, Lanlan
    Gu, Zhaojun
    Wang, Jialiang
    Ren, Yitong
    Wang, Zhi
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2020, 2020