ViSRE: A Unified Visual Analysis Dashboard for Proactive Cloud Outage Management

Cited by: 2
Authors
Kayongo, Paula [1]
Hoffswell, Jane [2]
Saini, Shiv [3]
Garg, Shaddy [3]
Koh, Eunyee [4]
Wang, Haoliang [4]
Jacobs, Tom [4]
Affiliations
[1] Northwestern Univ, Evanston, IL 60208 USA
[2] Adobe Res, Washington, DC USA
[3] Adobe Res, Bangalore, Karnataka, India
[4] Adobe Res, Santa Clara, CA USA
Keywords
Cloud Outage Prediction; Root Cause Analysis; Software Visualization
DOI
10.1109/VISSOFT55257.2022.00010
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Efficient outage detection and remediation are crucial for operating cloud computing systems effectively. To remediate outages, system engineers must quickly identify the causal relationships between metrics and correlate events across multiple monitoring tools. In practice, this process remains largely reactive due to the complexity and general lack of interpretability of such monitoring environments. This work presents ViSRE, a visual analytics system that integrates causal and predictive models with interactive visualizations to aid proactive cloud outage management. We develop enhanced node representations for our causal graph to support system engineers in performing root cause analysis and reasoning about causality chains in multi-dimensional temporal data. We report the results of a quantitative assessment of the proposed predictive models, which indicate good performance. To evaluate and refine our system, we conduct a study with six cloud system engineers, who verify that our proposed techniques can support proactive cloud maintenance by intuitively displaying temporal relationships between predicted and raw data. By correlating and presenting data from disparate sources, ViSRE also reduces context-switching costs and the time spent manually correlating events during the remediation of time-critical outages.
Pages: 5-16
Page count: 12