Semantic Hierarchical Classification Applied to Anomaly Detection Using System Logs with a BERT Model

Cited by: 0
Authors
Corbelle, Clara [1]
Carneiro, Victor [1]
Cacheda, Fidel [1]
Affiliations
[1] Univ A Coruna, Dept Comp Sci & Informat Technol, La Coruna 15071, Spain
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 13
Keywords
system logs; anomaly detection; BERT model; hierarchical codes; semantic similarity
DOI
10.3390/app14135388
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
The compaction and structuring of system logs facilitate and expedite anomaly and cyberattack detection processes using machine-learning techniques, while simultaneously reducing alert fatigue caused by false positives. In this work, we implemented an innovative algorithm that employs hierarchical codes based on the semantics of natural language, enabling the generation of a significantly reduced log that preserves the semantics of the original. This method uses codes that reflect the specificity of the topic and its position within a higher hierarchical structure. By applying this catalog to the analysis of logs from the Hadoop Distributed File System (HDFS), we achieved a concise summary with non-repetitive themes, significantly speeding up log analysis and resulting in a substantial reduction in log size while maintaining high semantic similarity. The resulting log has been validated for anomaly detection using the "bert-base-uncased" model and compared with six other methods: PCA, IM, LogCluster, SVM, DeepLog, and LogRobust. The reduced log achieved very similar values in precision, recall, and F1-score metrics, but drastically reduced processing time.
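The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea of hierarchical codes: each log line is mapped to a code whose prefix names a broad topic and whose suffix names a specific one, and consecutive lines with the same code are collapsed into one summary entry. The catalog, the code scheme, the helper names (`encode`, `parent`, `compact`), and the sample lines are all invented for illustration and are not taken from the paper; simple phrase matching stands in here for the semantic matching the authors describe.

```python
from itertools import groupby

# Hypothetical catalog (an assumption, not the paper's): phrase -> hierarchical code.
# The part before the dot is the broad topic, the part after it the specific topic.
CATALOG = {
    "receiving block": "1.1",
    "received block": "1.2",
    "deleting block": "2.1",
}
UNKNOWN = "0.0"  # code assigned to lines that match no catalog phrase


def encode(line: str) -> str:
    """Return the hierarchical code of the first catalog phrase found in the line."""
    lowered = line.lower()
    for phrase, code in CATALOG.items():
        if phrase in lowered:
            return code
    return UNKNOWN


def parent(code: str) -> str:
    """Return the broad-topic part of a code, e.g. '1.2' -> '1'."""
    return code.split(".")[0]


def compact(lines):
    """Collapse runs of consecutive lines sharing a code into (code, count) pairs."""
    codes = [encode(line) for line in lines]
    return [(code, len(list(run))) for code, run in groupby(codes)]


# Invented, simplified log lines used only to exercise the sketch.
sample = [
    "Receiving block blk_1",
    "Receiving block blk_2",
    "Received block blk_1",
    "Deleting block blk_1",
    "Deleting block blk_2",
    "Deleting block blk_3",
]
summary = compact(sample)
print(summary)
```

In this sketch six input lines shrink to three summary entries, and `parent` lets a reader group entries such as `1.1` and `1.2` under the same broad topic. A real system would need a far richer catalog and a semantic matcher rather than substring checks.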
Pages: 15
Related papers
50 in total
  • [21] Android Anomaly Detection System Using Machine Learning Classification
    Kurniawan, Harry
    Rosmansyah, Yusep
    Dabarsyah, Budiman
    5TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATICS 2015, 2015, : 288 - 293
  • [22] ConAnomaly: Content-Based Anomaly Detection for System Logs
    Lv, Dan
    Luktarhan, Nurbol
    Chen, Yiyong
    SENSORS, 2021, 21 (18)
  • [23] System Logs Anomaly Detection. Are we on the right path?
    Albert, Ramona-Georgiana
    APPLIED ARTIFICIAL INTELLIGENCE, 2025, 39 (01)
  • [24] Anomaly Detection on System Generated Logs-A Survey Study
    Jose, Jisha M.
    Reeja, S. R.
    MOBILE COMPUTING AND SUSTAINABLE INFORMATICS, 2022, 68 : 779 - 793
  • [25] Adanomaly: Adaptive Anomaly Detection for System Logs with Adversarial Learning
    Qi, Jiaxing
    Luan, Zhongzhi
    Huang, Shaohan
    Wang, Yukun
    Fung, Carol
    Yang, Hailong
    Qian, Depei
    PROCEEDINGS OF THE IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM 2022, 2022,
  • [26] Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs
    Nedelkoski, Sasho
    Bogatinovski, Jasmin
    Acker, Alexander
    Cardoso, Jorge
    Kao, Odej
    20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2020), 2020, : 1196 - 1201
  • [27] Anomaly Detection from Network Logs Using Diffusion Maps
    Sipola, Tuomo
    Juvonen, Antti
    Lehtonen, Joel
    ENGINEERING APPLICATIONS OF NEURAL NETWORKS, PT I, 2011, 363 : 172 - 181
  • [28] Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection
    Sun, Shengyang
    Gong, Xiaojin
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 22846 - 22856
  • [29] EHR-BERT: A BERT-based model for effective anomaly detection in electronic health records
    Niu, Haoran
    Omitaomu, Olufemi A.
    Langston, Michael A.
    Olama, Mohammad
    Ozmen, Ozgur
    Klasky, Hilda B.
    Laurio, Angela
    Ward, Merry
    Nebeker, Jonathan
    JOURNAL OF BIOMEDICAL INFORMATICS, 2024, 150
  • [30] ELSV: An Effective Anomaly Detection System from Web Access Logs
    Wan, Wei
    Shi, Xin
    Wei, Jinxia
    Zhao, Jing
    Long, Chun
    2021 IEEE INTERNATIONAL PERFORMANCE, COMPUTING, AND COMMUNICATIONS CONFERENCE (IPCCC), 2021,