Towards a Signature Based Compression Technique for Big Data Storage

Authors
Costa, Constantinos [1]
Chrysanthis, Panos K. [1]
Costa, Marios [1]
Stavrakis, Efstathios [2]
Nicolaou, Nicolas [2]
Affiliations
[1] Rinnoco Ltd, Limassol, Cyprus
[2] Algolysis Ltd, Limassol, Cyprus
Source
2023 IEEE 39TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS, ICDEW | 2023
Keywords
signature based; compression; column stores; hybrid store;
DOI
10.1109/ICDEW58674.2023.00022
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although the volume of stored data doubles every year, storage capacity costs decline only at a rate of less than 1/5 per year. At the same time, data is stored in multiple physical locations and remotely retrieved from multiple sites. Thus, minimizing data storage costs while maintaining data fidelity and efficient retrieval is still a key challenge in database systems. In addition to the raw big data, its associated metadata and indexes equally demand tremendous storage that impacts the I/O footprint of data centers. In this vision paper, we propose a new signature-based compression (SIBACO) technique that is able to: (i) incrementally store big data in an efficient way; and (ii) improve the retrieval time for data-intensive applications. SIBACO achieves higher compression ratios by combining and compressing columns differently based on the type and distribution of data and can be easily integrated with column and hybrid stores. We evaluate our proposed tool using real datasets showing that SIBACO outperforms "monolithic" compression schemes in terms of storage cost.
Pages: 100 - 104
Number of pages: 5
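
The abstract's idea of compressing columns differently based on the type and distribution of their data can be illustrated with a small sketch. This is not the SIBACO algorithm, which the abstract does not specify. It is a minimal illustration that assumes a hypothetical "signature" made of a column's value type and its distinct-value ratio, and uses that signature to pick between delta, dictionary, and plain encoding before applying zlib. All function names, the 0.5 threshold, and the sample data are assumptions made for illustration.

```python
import json
import zlib


def column_signature(values):
    """Hypothetical signature: (type name, distinct-value ratio).

    Assumes a non-empty column holding either plain ints or strings.
    """
    kind = "int" if all(isinstance(v, int) for v in values) else "str"
    distinct_ratio = len(set(values)) / len(values)
    return kind, distinct_ratio


def encode_column(values):
    """Pick an encoding from the signature, then zlib-compress the result."""
    kind, distinct_ratio = column_signature(values)
    if kind == "int":
        # Delta encoding: keep the first value and successive differences.
        payload = [values[0]] + [b - a for a, b in zip(values, values[1:])]
        scheme = "delta"
    elif distinct_ratio < 0.5:  # threshold chosen arbitrarily for the sketch
        # Dictionary encoding: store distinct values once, then small codes.
        dictionary = sorted(set(values))
        index = {v: i for i, v in enumerate(dictionary)}
        payload = {"dict": dictionary, "codes": [index[v] for v in values]}
        scheme = "dict"
    else:
        payload = values
        scheme = "plain"
    return scheme, zlib.compress(json.dumps(payload).encode("utf-8"))


def decode_column(scheme, blob):
    """Invert encode_column so the stored data stays lossless."""
    payload = json.loads(zlib.decompress(blob).decode("utf-8"))
    if scheme == "delta":
        out, total = [], 0
        for delta in payload:
            total += delta
            out.append(total)
        return out
    if scheme == "dict":
        return [payload["dict"][code] for code in payload["codes"]]
    return payload


# Made-up sample table with two columns of different type and distribution.
columns = {
    "timestamp": list(range(1_000, 2_000)),
    "city": ["Limassol", "Nicosia"] * 500,
}
encoded = {name: encode_column(vals) for name, vals in columns.items()}
decoded = {name: decode_column(*enc) for name, enc in encoded.items()}
assert decoded == columns  # round trip preserves every value
```

In this sketch each column carries its own scheme tag, so a reader can decode one column without touching the others, which is the property that makes per-column choices a natural fit for column and hybrid stores.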