Assessing Data Usefulness for Failure Analysis in Anonymized System Logs

Cited by: 4
Authors
Ghiasvand, Siavash [1]
Ciorba, Florina M. [2]
Affiliations
[1] Tech Univ Dresden, Dresden, Germany
[2] Univ Basel, Basel, Switzerland
DOI
10.1109/ISPDC2018.2018.00031
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
System logs are a valuable source of information for analyzing and understanding system behavior with the aim of improving performance. Such logs contain various types of information, including sensitive information. Information deemed sensitive can either be extracted directly from system log entries by correlating several of them, or be inferred by combining the (non-sensitive) information contained in system logs with other logs and/or additional datasets. Analyzing system logs that contain sensitive information compromises data privacy. Therefore, data and computing centers have employed various anonymization techniques over the years, such as generalization and suppression, to protect the privacy of their users, their data, and the system as a whole. Privacy-preserving data produced by generalization and suppression may be significantly less useful, thus hindering the analysis intended to explain the system behavior. Maintaining a balance between data usefulness and privacy preservation therefore remains an open and important challenge. Irreversible encoding of system logs using collision-resistant hashing algorithms, such as SHAKE-128, is a novel approach previously introduced by the authors to mitigate data privacy concerns. The present work studies the applicability of that encoding approach to the system logs of a production high-performance computing system. Moreover, a metric is introduced to assess how useful the anonymized system logs remain for detecting and identifying the failures encountered in the system.
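The idea of irreversibly encoding log fields with SHAKE-128 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the pattern for what counts as sensitive (a `user=` value and IPv4 addresses), the function names, and the 8-byte digest length are all assumptions chosen for illustration.

```python
import hashlib
import re

# Assumption: only the value after "user=" and IPv4 addresses are sensitive.
# A real deployment would need a far more complete set of patterns.
SENSITIVE = re.compile(r"(?<=user=)\w+|\b\d{1,3}(?:\.\d{1,3}){3}\b")


def encode_token(token: str, digest_bytes: int = 8) -> str:
    """Return a hex SHAKE-128 digest of the token (hypothetical helper).

    digest_bytes is an illustrative choice; shorter digests are more
    compact but raise the chance of two tokens colliding.
    """
    return hashlib.shake_128(token.encode("utf-8")).hexdigest(digest_bytes)


def anonymize_line(line: str) -> str:
    """Replace every sensitive token in a log line with its digest."""
    return SENSITIVE.sub(lambda m: encode_token(m.group(0)), line)


if __name__ == "__main__":
    print(anonymize_line("sshd: failed login user=alice from 10.0.0.7"))
```

Because the hash is deterministic, the same token always maps to the same code, so entries that refer to the same user or host can still be correlated after encoding, while the original values cannot be read back from the digest.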
Pages: 164-171
Page count: 8