Assessing Data Usefulness for Failure Analysis in Anonymized System Logs

Cited by: 4
Authors
Ghiasvand, Siavash [1]
Ciorba, Florina M. [2]
Affiliations
[1] Tech Univ Dresden, Dresden, Germany
[2] Univ Basel, Basel, Switzerland
DOI
10.1109/ISPDC2018.2018.00031
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
System logs are a valuable source of information for analyzing and understanding system behavior with the aim of improving performance. Such logs contain various types of information, including sensitive information. Information deemed sensitive can either be extracted directly from system log entries by correlating several log entries, or be inferred by combining the (non-sensitive) information contained in system logs with other logs and/or additional datasets. The analysis of system logs containing sensitive information compromises data privacy. Therefore, various anonymization techniques, such as generalization and suppression, have been employed over the years by data and computing centers to protect the privacy of their users, their data, and the system as a whole. Privacy-preserving data resulting from anonymization via generalization and suppression may have significantly decreased usefulness, thus hindering the intended analysis for understanding the system behavior. Maintaining a balance between data usefulness and privacy preservation therefore remains an open and important challenge. Irreversible encoding of system logs using collision-resistant hashing algorithms, such as SHAKE-128, is a novel approach previously introduced by the authors to mitigate data privacy concerns. The present work studies the applicability of this encoding approach to the system logs of a production high performance computing system. Moreover, a metric is introduced to assess the usefulness of the anonymized system logs for detecting and identifying the failures encountered in the system.
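The abstract does not give the details of the authors' encoding scheme, so the following is only a minimal sketch of the general idea of irreversibly encoding log content with SHAKE-128. The helper names `encode_token` and `anonymize_line`, the whitespace tokenization, the 8-byte digest length, and the sample log lines are all assumptions made for illustration; only `hashlib.shake_128` is a real standard-library call.

```python
import hashlib


def encode_token(token: str, digest_bytes: int = 8) -> str:
    """Hypothetical helper: hash one log token with SHAKE-128.

    SHAKE-128 is an extendable-output function, so the digest length
    is chosen by the caller (8 bytes here is an arbitrary assumption).
    """
    return hashlib.shake_128(token.encode("utf-8")).hexdigest(digest_bytes)


def anonymize_line(line: str, digest_bytes: int = 8) -> str:
    """Hypothetical helper: encode every whitespace-separated token."""
    return " ".join(encode_token(tok, digest_bytes) for tok in line.split())


# Made-up log lines, not taken from the paper's dataset.
line_a = "sshd: session opened for user alice"
line_b = "sshd: session opened for user bob"

encoded_a = anonymize_line(line_a)
encoded_b = anonymize_line(line_b)

print(encoded_a)
print(encoded_b)
```

In this sketch identical tokens always map to the same digest, so recurring message patterns can still be compared across lines, while the original strings are not stored in the output. Shorter digests make collisions more likely, so the digest length is a trade-off a real deployment would need to evaluate.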
Pages: 164-171
Number of pages: 8
Related papers
50 in total
  • [41] Analysis of Requirement-errors-caused Failure of On-board Subsystem of CTCS-3 Train Control System Based on Failure Logs
    Han, Xiao
    Tang, Tao
    Lü, Jidong
    Shang, Linyu
    Tiedao Xuebao/Journal of the China Railway Society, 2017, 39 (03): 59 - 70
  • [42] Design and Analysis of Interoperable Data Logs for Augmentative Communication Practice
    Chen, Szu-Han Kay
    Wadhwa, Soumya
    Nyberg, Eric
    ASSETS'19: THE 21ST INTERNATIONAL ACM SIGACCESS CONFERENCE ON COMPUTERS AND ACCESSIBILITY, 2019: 533 - 535
  • [43] Big Data Analysis of Cloud Storage Logs using Spark
    Garion, Shelly
    Kolodner, Hillel
    Adir, Allon
    Aharoni, Ehud
    Greenberg, Lev
    SYSTOR'17: PROCEEDINGS OF THE 10TH ACM INTERNATIONAL SYSTEMS AND STORAGE CONFERENCE, 2017
  • [44] Digging Deeper into Cluster System Logs for Failure Prediction and Root Cause Diagnosis
    Fu, Xiaoyu
    Ren, Rui
    Mckee, Sally A.
    Zhan, Jianfeng
    Sun, Ninghui
    2014 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2014: 103 - 112
  • [45] Usefulness of imputation for the analysis of incomplete otoneurologic data
    Laurikkala, J
    Kentala, E
    Juhola, M
    Pyykkö, I
    Lammi, S
    INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2000, 58 : 235 - 242
  • [46] Usefulness Analysis of a Clinical Data Repository Design
    Ong, Daphne E.
    Frize, Monique
    Gilchrist, Jeff
    Bariciak, Erika
    Ennett, Colleen M.
    2013 IEEE INTERNATIONAL SYMPOSIUM ON MEDICAL MEASUREMENTS AND APPLICATIONS PROCEEDINGS (MEMEA), 2013: 86 - 90
  • [47] Assessing the Usefulness and Acceptance of HERMES MyFuture System in Two European Countries
    Buiza, Cristina
    Belan Navarro, Ana
    Feli Gonzalez, Mari
    Geven, Arjan
    Tscheligi, Manfred
    Prost, Sebastian
    AMBIENT INTELLIGENCE AND FUTURE TRENDS - INTERNATIONAL SYMPOSIUM ON AMBIENT INTELLIGENCE (ISAML 2010), 2010, 72 : 205 - +
  • [48] Using location, bearing and motion data to filter video and system logs
    Morrison, Alistair
    Tennent, Paul
    Williamson, John
    Chalmers, Matthew
    PERVASIVE COMPUTING, PROCEEDINGS, 2007, 4480 : 109 - +
  • [49] Detecting Overlapping Data in System Logs Based on Ensemble Learning Method
    Liu, Chunbo
    Ren, Yitong
    Liang, Mengmeng
    Gu, Zhaojun
    Wang, Jialiang
    Pan, Lanlan
    Wang, Zhi
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2020, 2020
  • [50] Big-Data Analysis of Multi-Source Logs for Anomaly Detection on Network-based System
    Jia Zhanpei
    Shen Chao
    Yi Xiao
    Chen Yufei
    Yu Tianwen
    Guan Xiaohong
    2017 13TH IEEE CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2017: 1136 - 1141