Deplump for Streaming Data

Cited by: 0
Authors
Bartlett, Nicholas [1]
Wood, Frank [1]
Affiliation
[1] Columbia Univ, Dept Stat, New York, NY 10027 USA
DOI
10.1109/DCC.2011.43
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
We present a general-purpose, lossless compressor for streaming data. This compressor is based on the deplump probabilistic compressor for batch data. Approximations to the inference procedure used in the probabilistic model underpinning deplump are introduced that yield the computational asymptotics necessary for stream compression. We demonstrate the performance of this streaming deplump variant relative to the batch compressor on a benchmark corpus and find that it performs equivalently well despite these approximations. We also explore the performance of the streaming variant on corpora that are too large to be compressed by batch deplump and demonstrate excellent compression performance.
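As a rough illustration of what a probabilistic compressor with stream-friendly asymptotics looks like, the sketch below computes the ideal code length (the sum of -log2 p over symbols) of a byte stream in a single pass with a capped amount of memory. It is not deplump and does not use deplump's model or its approximations: the fixed-order context model, the add-one smoothing, the function name `streaming_code_length`, and the `order` and `max_contexts` parameters are all assumptions made for this example.

```python
import math


def streaming_code_length(data: bytes, order: int = 2,
                          max_contexts: int = 1 << 16) -> float:
    """Ideal code length in bits of `data` under a simple adaptive byte model.

    Hypothetical stand-in for a probabilistic compressor: each byte is
    predicted from the previous `order` bytes, charged -log2(p) bits, and
    only then added to the counts, so a decoder could mirror every step.
    """
    counts = {}      # context bytes -> [total count, {symbol: count}]
    bits = 0.0
    context = b""
    for sym in data:
        entry = counts.get(context)
        if entry is None:
            if len(counts) >= max_contexts:
                counts.clear()   # crude memory bound: forget everything
            entry = counts[context] = [0, {}]
        total, table = entry
        # Add-one smoothing over the 256 possible byte values.
        p = (table.get(sym, 0) + 1) / (total + 256)
        bits += -math.log2(p)
        # Update the model after coding the symbol.
        table[sym] = table.get(sym, 0) + 1
        entry[0] = total + 1
        # Slide the context window; order 0 means no context at all.
        context = (context + bytes([sym]))[-order:] if order > 0 else b""
    return bits
```

Each byte costs a constant amount of work and the table never holds more than `max_contexts` contexts, which is the kind of time and memory behaviour a stream compressor needs. An arithmetic coder driven by the same probabilities would emit approximately this many bits.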
Pages: 363-372
Page count: 10
Related papers
50 in total
  • [21] Adaptive Preprocessing for Streaming Data
    Zliobaite, Indre
    Gabrys, Bogdan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (02) : 309 - 321
  • [22] Aggregatably Verifiable Data Streaming
    Miao, Meixia
    Zhao, Siqi
    Li, Jiawei
    Wei, Jianghong
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (13): : 24109 - 24122
  • [23] Architecture for Analysis of Streaming Data
    Hoque, Sheik
    Miranskyy, Andriy
    2018 IEEE INTERNATIONAL CONFERENCE ON CLOUD ENGINEERING (IC2E 2018), 2018, : 263 - 269
  • [24] Streaming Methods in Data Analysis
    Cormode, Graham
    DATA SCIENCE, 2015, 9147 : 3 - 6
  • [25] Streaming PCA for Markovian Data
    Kumar, Syamantak
    Sarkar, Purnamrita
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [26] Unsupervised clustering in streaming data
    Tasoulis, Dimitris K.
    Adams, Niall M.
    Hand, David J.
    ICDM 2006: SIXTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, WORKSHOPS, 2006, : 638 - +
  • [27] Seeing CAD with streaming data
    2001, Cahner Publishing Co. (56):
  • [28] Enhancement of Data Streaming in Clustering for Uncertain Data
    Ganatra, Jeny
    Thacker, Chintan
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND SIGNAL PROCESSING, 2018, 671 : 155 - 162
  • [29] Data Streaming with Affinity Propagation
    Zhang, Xiangliang
    Furtlehner, Cyril
    Sebag, Michele
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PART II, PROCEEDINGS, 2008, 5212 : 628 - 643
  • [30] The streaming data management challenge
    Juniper, SK
    Shepherd, K
    Wallace, K
    OCEANS 2001 MTS/IEEE: AN OCEAN ODYSSEY, VOLS 1-4, CONFERENCE PROCEEDINGS, 2001, : 2297 - 2301