Design and implementation of a Bloom filter-based data deduplication algorithm for efficient data management

Cited by: 2
Authors
Jang Y.-H. [1]
Lee N.-U. [1]
Kim H.-J. [1]
Park S.-C. [1]
Affiliations
[1] Department of IT Convergence Engineering, Gachon University, Seongnam
Keywords
Backup data; Bloom filters; Fast identification; Hash value; Removing duplicate data; Source-based deduplication
DOI
10.1007/s12652-018-0893-1
Abstract
The amount of stored data has increased dramatically in recent years, and the amount of data backed up on servers grows every year. The share of duplicate data in those backups is also growing, so the time spent processing duplicate data is rising sharply. In this paper, we therefore design a Bloom filter-based data deduplication algorithm for fast identification and removal of duplicate data. Evaluation of the implemented algorithm shows that its execution time is 17% lower than that of an existing deduplication algorithm. © Springer-Verlag GmbH Germany, part of Springer Nature 2018.
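The abstract does not describe the algorithm's internals, so the following is only a minimal sketch of the general technique it names, not the authors' implementation. It assumes chunks are fingerprinted with SHA-256, that the Bloom filter derives its bit positions by double hashing, and that an in-memory dictionary stands in for the backup index; the names `BloomFilter`, `might_contain`, and `deduplicate` and the default sizes are hypothetical.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; bit positions come from double hashing (an assumption)."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Split one SHA-256 digest into two 64-bit values and combine them.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False means "definitely never added"; True means "possibly added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def deduplicate(chunks):
    """Return (unique_chunks, duplicate_count) for an iterable of byte chunks."""
    bloom = BloomFilter()
    stored = {}  # fingerprint -> chunk; hypothetical stand-in for the backup index
    duplicates = 0
    for chunk in chunks:
        fingerprint = hashlib.sha256(chunk).digest()
        # The cheap filter check runs first; the exact index lookup only
        # happens on a "possibly added" answer, which rules out false positives.
        if bloom.might_contain(fingerprint) and fingerprint in stored:
            duplicates += 1
            continue
        bloom.add(fingerprint)
        stored[fingerprint] = chunk
    return list(stored.values()), duplicates
```

Because a Bloom filter never reports a false negative, a negative answer lets new data skip the exact lookup entirely, which is the usual reason such a filter speeds up duplicate identification. Whether the paper's algorithm confirms positive hits in this way is not stated in the abstract.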
Pages: 1387-1393
Number of pages: 6