On the effectiveness of log representation for log-based anomaly detection

Cited by: 5

Authors:

Wu, Xingfang [1]

Li, Heng [1]

Khomh, Foutse [1]

Affiliations:

[1] Polytech Montreal, Dept Comp Engn & Software Engn, Montreal, PQ, Canada

Source:

EMPIRICAL SOFTWARE ENGINEERING | 2023 / Vol. 28 / Issue 06

Keywords:

Log representation; Anomaly detection; Automated log analysis;

DOI:

10.1007/s10664-023-10364-1

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Subject classification code:

081202 ; 0835 ;

Abstract:

Logs are an essential source of information for people to understand the running status of a software system. Due to the evolving modern software architecture and maintenance methods, more research efforts have been devoted to automated log analysis. In particular, machine learning (ML) has been widely used in log analysis tasks. In ML-based log analysis tasks, converting textual log data into numerical feature vectors is a critical and indispensable step. However, the impact of using different log representation techniques on the performance of the downstream models is not clear, which limits researchers' and practitioners' ability to choose the optimal log representation techniques for their automated log analysis workflows. Therefore, this work investigates and compares the commonly adopted log representation techniques from previous log analysis research. Specifically, we select six log representation techniques and evaluate them with seven ML models and four public log datasets (i.e., HDFS, BGL, Spirit and Thunderbird) in the context of log-based anomaly detection. We also examine the impacts of the log parsing process and the different feature aggregation approaches when they are employed with log representation techniques. From the experiments, we provide some heuristic guidelines for future researchers and developers to follow when designing an automated log analysis workflow. We believe our comprehensive comparison of log representation techniques can help researchers and practitioners better understand the characteristics of different log representation techniques and provide them with guidance for selecting the most suitable ones for their ML-based log analysis workflow.


Pages: 39
