Replica-aware data recovery performance improvement for Hadoop system with NVM

Authors
Li, Xin [1]
Li, Huijie [1]
Lu, Youyou [2]
Zhao, Yanchao [1]
Qin, Xiaolin [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Data recovery; HDFS; MapReduce; Non-volatile memory; Performance tuning; CLUSTER; MEMORY
DOI
10.1007/s42514-021-00066-9
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Non-volatile memory (NVM) is a promising device for storing data and accelerating big data analysis because of its excellent I/O performance. However, we find that simply replacing hard disk drives (HDDs) with NVM does not bring the expected performance improvement. In this paper, we take data recovery in the Hadoop Distributed File System (HDFS) as a case study to investigate how to take advantage of the performance of NVM. We analyze the data recovery mechanism in HDFS and find that the configuration of replication tasks in the DataNode can significantly affect data recovery. We conduct extensive analysis and experiments on tuning this configuration and report some interesting findings. With the new configuration, we improve data recovery performance by 17% to 71%. The optimized configuration also improves the execution performance of MapReduce jobs by 28% to 59%. We also find that sudden data recovery causes disordered competition for network resources, which reduces the performance of MapReduce jobs. Hence, we present a priority-aware multi-stage data recovery method, which improves the performance of MapReduce jobs by a further 32.5%.
Pages: 144-156
Page count: 13
Related papers
50 in total
  • [41] Development and composition of a data center heat recovery system and evaluation of annual operation performance
    Huang, Qionghai
    Shao, Shuangquan
    Zhang, Hainan
    Tian, Changqing
    ENERGY, 2019, 189
  • [42] Performance of a multi-user OCDMA system demonstrator with full clock and data recovery
    Faucher, J
    Adams, R
    Chen, LR
    Plant, DV
    2005 CONFERENCE ON LASERS & ELECTRO-OPTICS (CLEO), VOLS 1-3, 2005, : 1204 - 1206
  • [43] Marcher: A Heterogeneous System Supporting Energy-Aware High Performance Computing and Big Data Analytics
    Zong, Ziliang
    Ge, Rong
    Gu, Qijun
    BIG DATA RESEARCH, 2017, 8 : 27 - 38
  • [44] Performance of a Hybrid ED-NF Membrane System for Water Recovery Improvement via NOM Fouling Control
    Kum, Soyoon
    Landsman, Matthew R.
    Su, Gregory M.
    Freychet, Guillaume
    Lawler, Desmond F.
    Katz, Lynn E.
    ACS ES&T ENGINEERING, 2021, 1 (10) : 1420 - 1431
  • [45] Performance improvement of a heat recovery system combined with fuel cell and thermoelectric generator: 4E analysis
    Musharavati, Farayi
    Khanmohammadi, Shoaib
    INTERNATIONAL JOURNAL OF HYDROGEN ENERGY, 2022, 47 (62) : 26701 - 26714
  • [46] Performance Assessment of Monthly Ensemble Prediction Data Based on Improvement of Climate Prediction System at KMA
    Ham, Hyunjun
    Lee, Sang-Min
    Hyun, Yu-Kyug
    Kim, Yoonjae
    ATMOSPHERE-KOREA, 2019, 29 (02) : 149 - 164
  • [47] System identification based on quantized I/O data corrupted with noises and its performance improvement
    Suzuki, Hiromi
    Sugie, Toshiharu
    PROCEEDINGS OF THE 45TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-14, 2006, : 3684 - 3689
  • [48] Performance Analysis of Lake Water Cooling Coupled with a Waste Heat Recovery System in the Data Center
    Yin, Peng
    Guo, Yang
    Zhang, Man
    Wang, Jiaqiang
    Zhang, Linfeng
    Feng, Da
    Ding, Weike
    SUSTAINABILITY, 2024, 16 (15)
  • [49] Improving Large-scale Storage System Performance via Topology-aware and Balanced Data Placement
    Wang, Feiyi
    Oral, Sarp
    Gupta, Saurabh
    Tiwari, Devesh
    Vazhkudai, Sudharshan S.
    2014 20TH IEEE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2014, : 656 - 663
  • [50] Energy Performance Study of a Data Center Combined Cooling System Integrated with Heat Storage and Waste Heat Recovery System
    Zhou, Chaohui
    Hu, Yue
    Liu, Rujie
    Liu, Yuce
    Wang, Meng
    Luo, Huiheng
    Tian, Zhiyong
    BUILDINGS, 2025, 15 (03)