Towards a Signature Based Compression Technique for Big Data Storage

Cited by: 0
Authors
Costa, Constantinos [1]
Chrysanthis, Panos K. [1]
Costa, Marios [1]
Stavrakis, Efstathios [2]
Nicolaou, Nicolas [2]
Affiliations
[1] Rinnoco Ltd, Limassol, Cyprus
[2] Algolysis Ltd, Limassol, Cyprus
Source
2023 IEEE 39TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS, ICDEW | 2023
Keywords
signature based; compression; column stores; hybrid store
DOI
10.1109/ICDEW58674.2023.00022
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although the volume of stored data doubles every year, storage capacity costs decline only at a rate of less than 1/5 per year. At the same time, data is stored in multiple physical locations and remotely retrieved from multiple sites. Thus, minimizing data storage costs while maintaining data fidelity and efficient retrieval is still a key challenge in database systems. In addition to the raw big data, its associated metadata and indexes equally demand tremendous storage that impacts the I/O footprint of data centers. In this vision paper, we propose a new signature-based compression (SIBACO) technique that is able to: (i) incrementally store big data in an efficient way; and (ii) improve the retrieval time for data-intensive applications. SIBACO achieves higher compression ratios by combining and compressing columns differently based on the type and distribution of data and can be easily integrated with column and hybrid stores. We evaluate our proposed tool using real datasets showing that SIBACO outperforms "monolithic" compression schemes in terms of storage cost.
Pages: 100 - 104
Page count: 5
Related papers
50 results in total
  • [21] Review of Big Data Storage based on DNA Computing
    Hakami, Hanadi Ahmed
    Chaczko, Zenon
    Kale, Anup
    2015 ASIA-PACIFIC CONFERENCE ON COMPUTER-AIDED SYSTEM ENGINEERING - APCASE 2015, 2015, : 113 - 117
  • [22] Study on Cloud Storage based on the MapReduce for Big Data
    Huang Yi
    Ma Xinqiang
    Zhang Yongdan
    Liu Youyuan
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON MECHATRONICS, ELECTRONIC, INDUSTRIAL AND CONTROL ENGINEERING, 2015, 8 : 1601 - 1605
  • [23] Data compression and storage into an EEG multichannel system using wavelets technique
    Ponomaryov, V
    Badillo, L
    Juarez, C
    Sanchez, JL
    Igartua, L
    Sanchez, JC
    MEDICAL IMAGING 2002: PACS AND INTEGRATED MEDICAL INFORMATION SYSTEMS: DESIGN AND EVALUATION, 2002, 4685 : 430 - 437
  • [24] Query of Marine Big Data Based on Graph Compression and Views
    Zhao, Danfeng
    Zhang, Yeyi
    Lin, Junchen
    Song, Wei
    Liotta, Antonio
    Huang, Dongmei
    2018 18TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2018, : 252 - 257
  • [25] Content-Based Textual Big Data Analysis and Compression
    Gao, Fei
    Dutta, Ananya
    Liu, Jiangjiang
    2018 INTERNATIONAL CONFERENCE ON COMPUTING AND BIG DATA (ICCBD 2018), 2018, : 7 - 12
  • [26] Identity-Based Dynamic Data Auditing for Big Data Storage
    Shang, Tao
    Zhang, Feng
    Chen, Xingyue
    Liu, Jianwei
    Lu, Xinxi
    IEEE TRANSACTIONS ON BIG DATA, 2021, 7 (06) : 913 - 921
  • [27] A Novel Scalable Signature Based Subspace Clustering Approach for Big Data
    Gayathri, T.
    Bhaskari, D. Lalitha
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY AND WEB ENGINEERING, 2019, 14 (02) : 41 - 51
  • [28] Towards Lightweight and Swift Storage Resource Management in Big Data Cloud Era
    Zhou, Ruijin
    Chen, Huixiang
    Li, Tao
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'15), 2015, : 133 - 142
  • [29] Fast Anti-noise Compression Storage Algorithm for Big Data Video Images
    Lei, Tao
    ADVANCED HYBRID INFORMATION PROCESSING, ADHIP 2019, PT II, 2019, 302 : 355 - 362
  • [30] Abnormal Behavior Detection Technique Based on Big Data
    Kim, Hyunjoo
    Kim, Ikkyun
    Chung, Tai-Myoung
    FRONTIER AND INNOVATION IN FUTURE COMPUTING AND COMMUNICATIONS, 2014, 301 : 553 - 563