Deplump for Streaming Data

Authors
Bartlett, Nicholas [1]
Wood, Frank [1]
Affiliations
[1] Columbia Univ, Dept Stat, New York, NY 10027 USA
DOI
10.1109/DCC.2011.43
Chinese Library Classification (CLC) number
TP301 [Theory and Methods]
Discipline classification code
081202
Abstract
We present a general-purpose, lossless compressor for streaming data. This compressor is based on the deplump probabilistic compressor for batch data. Approximations to the inference procedure used in the probabilistic model underpinning deplump are introduced that yield the computational asymptotics necessary for stream compression. We demonstrate the performance of this streaming deplump variant relative to the batch compressor on a benchmark corpus and find that it performs equivalently well despite these approximations. We also explore the performance of the streaming variant on corpora that are too large to be compressed by batch deplump and demonstrate excellent compression performance.
Pages: 363-372
Page count: 10