Estimating Lossy Compressibility of Scientific Data Using Deep Neural Networks

Cited by: 4
Authors
Qin, Zhenlu [1 ]
Wang, Jinzhen [1 ]
Liu, Qing [1 ]
Chen, Jieyang [2 ]
Pugmire, Dave [2 ]
Podhorszki, Norbert [2 ]
Klasky, Scott [2 ]
Affiliations
[1] New Jersey Institute of Technology, Newark,NJ,07102, United States
[2] Oak Ridge National Laboratory, Oak Ridge,TN,37830, United States
Source
IEEE Letters of the Computer Society | 2020, Vol. 3, No. 1
Keywords
Compressors; Data reduction; Compressibility
DOI
10.1109/LOCS.2020.2971940
Abstract
Simulation-based scientific applications generate increasingly large amounts of data on high-performance computing (HPC) systems. To allow data to be stored and analyzed efficiently, data compression is often utilized to reduce the volume and velocity of data. However, a question often raised by domain scientists is the level of compression that can be expected, so that they can make more informed decisions balancing accuracy against performance. In this letter, we propose a deep neural network based approach for estimating the compressibility of scientific data. To train the neural network, we build both general features and compressor-specific features so that the characteristics of both the data and the lossy compressors are captured in training. Our approach is demonstrated to outperform a prior analytical model as well as a sampling-based approach in the case of a biased estimation, i.e., for SZ. However, for the unbiased estimation (i.e., ZFP), the sampling-based approach yields the best accuracy, despite the high overhead involved in sampling the target dataset.
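The abstract does not specify the features or network architecture, but the general idea it describes can be sketched as follows. This is an illustrative assumption, not the authors' implementation: the "general features" below (simple block statistics, including a local-smoothness proxy) and the tiny fully connected regressor are hypothetical stand-ins for the feature vectors and DNN used in the paper.

```python
import numpy as np

def general_features(block: np.ndarray) -> np.ndarray:
    """Illustrative 'general' features of a data block: simple
    statistics that correlate with compressibility (smooth,
    low-variance data tends to compress better under SZ/ZFP)."""
    diffs = np.diff(block.ravel())
    return np.array([
        block.mean(),
        block.std(),
        block.max() - block.min(),  # value range
        np.abs(diffs).mean(),       # local smoothness proxy
    ])

def mlp_estimate(features: np.ndarray, w1, b1, w2, b2) -> float:
    """Tiny fully connected regressor (one hidden ReLU layer)
    mapping a feature vector to a predicted compression ratio."""
    h = np.maximum(0.0, features @ w1 + b1)  # hidden layer, ReLU
    return float(h @ w2 + b2)                # scalar output

rng = np.random.default_rng(0)
smooth = np.sin(np.linspace(0, 4 * np.pi, 4096))  # highly compressible
noisy = rng.standard_normal(4096)                 # hard to compress
f_smooth = general_features(smooth)
f_noisy = general_features(noisy)
# The smoothness feature alone separates the two blocks clearly.
print(f_smooth[3] < f_noisy[3])  # True

# Untrained random weights: shown only to illustrate the shape of
# the regression step, not to produce a meaningful estimate.
w1, b1 = rng.standard_normal((4, 8)), np.zeros(8)
w2, b2 = rng.standard_normal(8), 0.0
print(mlp_estimate(f_smooth, w1, b1, w2, b2))
```

In the paper, such general features would be combined with compressor-specific ones (e.g., features tied to SZ's predictor or ZFP's block transform) before training the network on observed compression ratios.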
Pages: 5-8
Related Papers
50 records
  • [11] Using Neural Networks for Two Dimensional Scientific Data Compression
    Hayne, Lucas
    Clyne, John
    Li, Shaomeng
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 2956 - 2965
  • [12] Estimating Information Flow in Deep Neural Networks
    Goldfeld, Ziv
    van den Berg, Ewout
    Greenewald, Kristjan
    Melnyk, Igor
    Nguyen, Nam
    Kingsbury, Brian
    Polyanskiy, Yury
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [13] Exploring Lossy Compressibility through Statistical Correlations of Scientific Datasets
    Krasowska, David
    Bessac, Julie
    Underwood, Robert
    Calhoun, Jon C.
    Di, Sheng
    Cappello, Franck
    PROCEEDINGS OF THE 7TH INTERNATIONAL WORKSHOP ON DATA ANALYSIS AND REDUCTION FOR BIG SCIENTIFIC DATA (DRBSD-7), 2021, : 47 - 53
  • [14] Estimating the all-terminal signatures for networks by using deep neural network
    Da, Gaofeng
    Zhang, Xin
    He, Zhenwen
    Ding, Weiyong
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2025, 253
  • [15] DAVINZ: Data Valuation using Deep Neural Networks at Initialization
    Wu, Zhaoxuan
    Shu, Yao
    Low, Bryan Kian Hsiang
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [16] Hyperspectral Data Classification using Deep Convolutional Neural Networks
    Salman, Mesut
    Yuksel, Seniha Esen
    2016 24TH SIGNAL PROCESSING AND COMMUNICATION APPLICATION CONFERENCE (SIU), 2016, : 2129 - 2132
  • [17] Traffic Data Imputation Using Deep Convolutional Neural Networks
    Benkraouda, Ouafa
    Thodi, Bilal Thonnam
    Yeo, Hwasoo
    Menendez, Monica
    Jabari, Saif Eddin
    IEEE ACCESS, 2020, 8 (08): : 104740 - 104752
  • [18] DEEP NEURAL NETWORKS FOR ESTIMATING SPEECH MODEL ACTIVATIONS
    Williamson, Donald S.
    Wang, Yuxuan
    Wang, DeLiang
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5113 - 5117
  • [19] DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression
    Jin, Sian
    Di, Sheng
    Liang, Xin
    Tian, Jiannan
    Tao, Dingwen
    Cappello, Franck
    HPDC'19: PROCEEDINGS OF THE 28TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, 2019, : 159 - 170
  • [20] Scientific Visualization Using Neural Networks
    Shen, Han-Wei
    COMPUTER, 2022, 55 (07) : 4 - 6