No More Data Silos: Unified Microservice Failure Diagnosis With Temporal Knowledge Graph

Citations: 1
Authors
Zhang, Shenglin [1]
Zhao, Yongxin [2]
Xia, Sibo [2]
Wei, Shirui [3]
Sun, Yongqian [4]
Zhao, Chenyu [5]
Ma, Shiyu [2]
Kuang, Junhua [2]
Zhu, Bolin [6]
Pan, Lemeng [7]
Guo, Yicheng [7]
Pei, Dan [8]
Affiliations
[1] Nankai Univ, Coll Software, Haihe Lab Informat Technol Applicat Innovat HL IT, Tianjin 300071, Peoples R China
[2] Nankai Univ, Tianjin 300192, Peoples R China
[3] Univ Chinese Acad Sci, Beijing 101408, Peoples R China
[4] Nankai Univ, Coll Software, Tianjin Key Lab Software Experience & Human Comp I, Tianjin 300192, Peoples R China
[5] Alibaba Grp, Beijing 100020, Peoples R China
[6] Nanjing Univ, Nanjing 210093, Peoples R China
[7] Huawei Technol Co Ltd, Shenzhen 518129, Peoples R China
[8] Tsinghua Univ, Dept Comp Sci, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Microservice architectures; Measurement; Electronic mail; Prevention and mitigation; Fuses; Monitoring; Anomaly detection; Accuracy; Time factors; Microservice; failure diagnosis; multimodal data; knowledge graph;
DOI
10.1109/TSC.2024.3489444
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Microservices improve the scalability and flexibility of monolithic architectures to accommodate the evolution of software systems, but the complexity and dynamics of microservices challenge system reliability. Ensuring microservice quality requires efficient failure diagnosis, including detection and triage. Failure detection involves identifying anomalous behavior within the system, while triage entails classifying the failure type and directing it to the engineering team for resolution. Unfortunately, current approaches reliant on single-modal monitoring data, such as metrics, logs, or traces, cannot capture all failures and neglect interconnections among multimodal data, leading to erroneous diagnoses. Recent multimodal data fusion studies struggle to achieve deep integration, limiting diagnostic accuracy due to insufficiently captured interdependencies. Therefore, we propose UniDiag, which leverages temporal knowledge graphs to fuse multimodal data for effective failure diagnosis. UniDiag applies a simple yet effective stream-based anomaly detection method to reduce computational cost and a novel microservice-oriented graph embedding method to represent the state of systems comprehensively. To assess the performance of UniDiag, we conduct extensive evaluation experiments using datasets from two benchmark microservice systems, demonstrating its superiority over existing methods and affirming the efficacy of multimodal data fusion. Additionally, we have made the code and data publicly available to facilitate further research.
Pages: 4013-4026
Number of pages: 14