Towards a Signature Based Compression Technique for Big Data Storage

Cited by: 0
Authors
Costa, Constantinos [1]
Chrysanthis, Panos K. [1]
Costa, Marios [1]
Stavrakis, Efstathios [2]
Nicolaou, Nicolas [2]
Affiliations
[1] Rinnoco Ltd, Limassol, Cyprus
[2] Algolysis Ltd, Limassol, Cyprus
Source
2023 IEEE 39TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS, ICDEW | 2023
Keywords
signature based; compression; column stores; hybrid store
DOI
10.1109/ICDEW58674.2023.00022
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although the volume of stored data doubles every year, storage capacity costs decline only at a rate of less than 1/5 per year. At the same time, data is stored in multiple physical locations and remotely retrieved from multiple sites. Thus, minimizing data storage costs while maintaining data fidelity and efficient retrieval is still a key challenge in database systems. In addition to the raw big data, its associated metadata and indexes equally demand tremendous storage that impacts the I/O footprint of data centers. In this vision paper, we propose a new signature-based compression (SIBACO) technique that is able to: (i) incrementally store big data in an efficient way; and (ii) improve the retrieval time for data-intensive applications. SIBACO achieves higher compression ratios by combining and compressing columns differently based on the type and distribution of data and can be easily integrated with column and hybrid stores. We evaluate our proposed tool using real datasets showing that SIBACO outperforms "monolithic" compression schemes in terms of storage cost.
Pages: 100 - 104
Number of pages: 5
Related papers
50 in total
  • [41] Research and Implementation of Distributed Storage System Based on Big Data
    Ma, Ke
    PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA ANALYSIS (ICBDA), 2016, : 168 - 171
  • [42] Research on Big Data Security Storage Based on Compressed Sensing
    Lv, Denglong
    Zhu, Shibing
    Liu, Ran
    IEEE ACCESS, 2019, 7 : 3810 - 3825
  • [43] Analytics towards big data
    State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
    Beijing Youdian Daxue Xuebao, 3 (1-12):
  • [44] Data Compression in Big Graph Warehouse
    Polyakov I.V.
    Chepovskiy A.A.
    Chepovskiy A.M.
    Journal of Mathematical Sciences, 2020, 245 (2) : 197 - 201
  • [45] Differential Evolution based bucket indexed data deduplication for big data storage
    Kumar, Naresh
    Antwal, Shobha
    Jain, S. C.
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2018, 34 (01) : 491 - 505
  • [46] Research on Key Technology for Data Storage in Smart Community Based on Big Data
    Yan, Hui
    Long, Duo
    2015 INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION, BIG DATA AND SMART CITY (ICITBS), 2016, : 653 - 656
  • [47] Big data compression processing and verification based on Hive for smart substation
    Qu, Zhijian
    Chen, Ge
    JOURNAL OF MODERN POWER SYSTEMS AND CLEAN ENERGY, 2015, 3 (03) : 440 - 446
  • [48] A spatiotemporal compression based approach for efficient big data processing on Cloud
    Yang, Chi
    Zhang, Xuyun
    Zhong, Changmin
    Liu, Chang
    Pei, Jian
    Ramamohanarao, Kotagiri
    Chen, Jinjun
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2014, 80 (08) : 1563 - 1583
  • [49] Towards Optimal Sensitivity-Based Anonymization for Big Data
    Al-Zobbi, Mohammed
    Shahrestani, Seyed
    Ruan, Chun
    2017 27TH INTERNATIONAL TELECOMMUNICATION NETWORKS AND APPLICATIONS CONFERENCE (ITNAC), 2017, : 331 - 336
  • [50] Routing Optimization Algorithms Based on Node Compression in Big Data Environment
    Yang, Lifeng
    Chen, Liangming
    Wang, Ningwei
    Liao, Zhifang
    SCIENTIFIC PROGRAMMING, 2017, 2017