Estimating Lossy Compressibility of Scientific Data Using Deep Neural Networks

Cited by: 4
Authors
Qin, Zhenlu [1 ]
Wang, Jinzhen [1 ]
Liu, Qing [1 ]
Chen, Jieyang [2 ]
Pugmire, Dave [2 ]
Podhorszki, Norbert [2 ]
Klasky, Scott [2 ]
Affiliations
[1] New Jersey Institute of Technology, Newark,NJ,07102, United States
[2] Oak Ridge National Laboratory, Oak Ridge,TN,37830, United States
Source
IEEE Letters of the Computer Society | 2020, Vol. 3, No. 1
Keywords
Compressors; Data reduction; Compressibility
DOI
10.1109/LOCS.2020.2971940
Abstract
Simulation-based scientific applications generate increasingly large amounts of data on high-performance computing (HPC) systems. To allow data to be stored and analyzed efficiently, data compression is often utilized to reduce the volume and velocity of data. However, a question often raised by domain scientists is the level of compression that can be expected, so that they can make more informed decisions balancing accuracy against performance. In this letter, we propose a deep neural network based approach for estimating the compressibility of scientific data. To train the neural network, we build both general features and compressor-specific features so that the characteristics of both the data and the lossy compressors are captured in training. Our approach is demonstrated to outperform a prior analytical model as well as a sampling-based approach in the case of a biased estimation, i.e., for SZ. However, for the unbiased estimation (i.e., ZFP), the sampling-based approach yields the best accuracy, despite the high overhead involved in sampling the target dataset.
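The abstract does not specify the features or network architecture, but the general idea it describes can be sketched as follows. This is an illustrative assumption, not the authors' implementation: the "general features" below (simple block statistics, including a local-smoothness proxy) and the tiny fully connected regressor are hypothetical stand-ins for the feature vectors and DNN used in the paper.

```python
import numpy as np

def general_features(block: np.ndarray) -> np.ndarray:
    """Illustrative 'general' features of a data block: simple
    statistics that correlate with compressibility (smooth,
    low-variance data tends to compress better under SZ/ZFP)."""
    diffs = np.diff(block.ravel())
    return np.array([
        block.mean(),
        block.std(),
        block.max() - block.min(),  # value range
        np.abs(diffs).mean(),       # local smoothness proxy
    ])

def mlp_estimate(features: np.ndarray, w1, b1, w2, b2) -> float:
    """Tiny fully connected regressor (one hidden ReLU layer)
    mapping a feature vector to a predicted compression ratio."""
    h = np.maximum(0.0, features @ w1 + b1)  # hidden layer, ReLU
    return float(h @ w2 + b2)                # scalar output

rng = np.random.default_rng(0)
smooth = np.sin(np.linspace(0, 4 * np.pi, 4096))  # highly compressible
noisy = rng.standard_normal(4096)                 # hard to compress
f_smooth = general_features(smooth)
f_noisy = general_features(noisy)
# The smoothness feature alone separates the two blocks clearly.
print(f_smooth[3] < f_noisy[3])  # True

# Untrained random weights: shown only to illustrate the shape of
# the regression step, not to produce a meaningful estimate.
w1, b1 = rng.standard_normal((4, 8)), np.zeros(8)
w2, b2 = rng.standard_normal(8), 0.0
print(mlp_estimate(f_smooth, w1, b1, w2, b2))
```

In the paper, such general features would be combined with compressor-specific ones (e.g., features tied to SZ's predictor or ZFP's block transform) before training the network on observed compression ratios.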
Pages: 5-8
Related Papers
50 records
  • [11] Using Neural Networks for Two Dimensional Scientific Data Compression
    Hayne, Lucas
    Clyne, John
    Li, Shaomeng
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 2956 - 2965
  • [12] Estimating Information Flow in Deep Neural Networks
    Goldfeld, Ziv
    van den Berg, Ewout
    Greenewald, Kristjan
    Melnyk, Igor
    Nguyen, Nam
    Kingsbury, Brian
    Polyanskiy, Yury
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [13] Exploring Lossy Compressibility through Statistical Correlations of Scientific Datasets
    Krasowska, David
    Bessac, Julie
    Underwood, Robert
    Calhoun, Jon C.
    Di, Sheng
    Cappello, Franck
    PROCEEDINGS OF THE 7TH INTERNATIONAL WORKSHOP ON DATA ANALYSIS AND REDUCTION FOR BIG SCIENTIFIC DATA (DRBSD-7), 2021, : 47 - 53
  • [14] Estimating the all-terminal signatures for networks by using deep neural network
    Da, Gaofeng
    Zhang, Xin
    He, Zhenwen
    Ding, Weiyong
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2025, 253
  • [15] DAVINZ: Data Valuation using Deep Neural Networks at Initialization
    Wu, Zhaoxuan
    Shu, Yao
    Low, Bryan Kian Hsiang
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [16] Hyperspectral Data Classification using Deep Convolutional Neural Networks
    Salman, Mesut
    Yuksel, Seniha Esen
    2016 24TH SIGNAL PROCESSING AND COMMUNICATION APPLICATION CONFERENCE (SIU), 2016, : 2129 - 2132
  • [17] Traffic Data Imputation Using Deep Convolutional Neural Networks
    Benkraouda, Ouafa
    Thodi, Bilal Thonnam
    Yeo, Hwasoo
    Menendez, Monica
    Jabari, Saif Eddin
    IEEE ACCESS, 2020, 8 (08): : 104740 - 104752
  • [18] DEEP NEURAL NETWORKS FOR ESTIMATING SPEECH MODEL ACTIVATIONS
    Williamson, Donald S.
    Wang, Yuxuan
    Wang, DeLiang
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5113 - 5117
  • [19] DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression
    Jin, Sian
    Di, Sheng
    Liang, Xin
    Tian, Jiannan
    Tao, Dingwen
    Cappello, Franck
    HPDC'19: PROCEEDINGS OF THE 28TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, 2019, : 159 - 170
  • [20] Scientific Visualization Using Neural Networks
    Shen, Han-Wei
    COMPUTER, 2022, 55 (07) : 4 - 6