Optimizing I/O Performance Through Effective vCPU Scheduling Interference Management

Cited by: 0

Authors:

Wang, Liang [1]

Yang, Jinzhe [2]

Zhai, Jidong [1]

Yang, Guangwen [1,3]

Affiliations:

[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China

[2] Imperial Coll London, TC Technol, London SW7 2BX, England

[3] Zhejiang Lab, Hangzhou 311121, Zhejiang, Peoples R China

Source:

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS | 2024, Vol. 35, No. 12

Funding:

National Natural Science Foundation of China

Keywords:

Interference; Cloud computing; Dynamic scheduling; Production; Task analysis; Processor scheduling; Performance evaluation; Virtualization; cloud computing; vCPU scheduling; I/O performance; interference diagnosis;

DOI:

10.1109/TPDS.2023.3329298

Chinese Library Classification (CLC) number:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

Virtual machines (VMs) rely heavily on virtual CPU (vCPU) scheduling to achieve efficient I/O performance. vCPU scheduling interference can cause inconsistent scheduling latency and degraded I/O performance, potentially compromising the services provided by affected VMs. Existing solutions have limitations, such as inefficiency in diagnosing interference issues or undesired side effects on cloud systems. To address these challenges, we present Otter, a holistic technique for optimizing I/O performance in the presence of vCPU scheduling interference. Otter employs innovative methods to enhance interference diagnosis efficiency. First, we propose lightweight methods to measure the dynamic changes in scheduling latencies for co-running vCPUs, ensuring both flexibility and accuracy. Second, we propose fine-grained quantification methods to detect interference promptly, with low false positive and false negative rates. Third, we identify interference patterns that aid in analyzing the root causes of interference and preventing similar issues from recurring. Otter has been operational for one year in the production cloud at the National Supercomputing Center (Wuxi). It has diagnosed and helped fix more than 470 vCPU scheduling interference-related issues, resulting in a 19.6% improvement in cloud service I/O performance with negligible overhead in production.

Pages: 2315-2330

Page count: 16
