A cluster-based data deduplication technology

Cited by: 1

Authors:

Tseng, Chuan-Mu [1]

Ciou, Jheng-Rong [2]

Liu, Tzong-Jye [2]

Affiliations:

[1] Jeh Teh Jr Coll Med Nursing & Management, Dept Appl Digital Media, Miaoli, Taiwan

[2] Feng Chia Univ, Dept Informat Engn & Comp Sci, Taichung, Taiwan

Source:

2014 SECOND INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR) | 2014

Keywords:

Bloom filter; cluster; data deduplication

DOI:

10.1109/CANDAR.2014.22

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Subject classification code:

0812

Abstract:

Data deduplication technology usually relies on a Bloom filter to identify redundant data quickly and correctly. A Bloom filter can determine whether a chunk may be redundant, but it can return false positives. To rule out a false positive, a new chunk must be compared with the chunks that have already been stored. To reduce the time spent ruling out Bloom filter false positives, current research stores chunk IDs in many small index tables. However, a target chunk ID is stored in only one index table, so searching the other index tables for it wastes a great deal of time. In this paper, we cluster the stored chunks to reduce the time needed to rule out the false positives induced by the Bloom filter.
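The lookup flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe how chunks are clustered, so the `_home_table` grouping rule (first byte of the chunk ID modulo the table count), the class names, the table count, and the filter parameters are all hypothetical choices made for illustration. The sketch only shows why a Bloom filter "maybe" must be verified against exact chunk IDs, and how knowing which index table to probe avoids scanning the others.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item: bytes):
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: bytes):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


def chunk_id(chunk: bytes) -> bytes:
    """Chunk ID taken as the SHA-1 digest of the chunk (an assumption)."""
    return hashlib.sha1(chunk).digest()


class DedupIndex:
    """Bloom filter in front of several small exact index tables."""

    def __init__(self, num_tables=8):
        self.bloom = BloomFilter()
        self.tables = [set() for _ in range(num_tables)]

    def _home_table(self, cid: bytes) -> int:
        # Hypothetical grouping rule, NOT the paper's clustering scheme.
        return cid[0] % len(self.tables)

    def store(self, chunk: bytes):
        cid = chunk_id(chunk)
        self.bloom.add(cid)
        self.tables[self._home_table(cid)].add(cid)

    def is_duplicate_scan(self, chunk: bytes):
        """Verify a Bloom hit by scanning the tables one by one.

        Returns (is_duplicate, tables_probed).
        """
        cid = chunk_id(chunk)
        if not self.bloom.might_contain(cid):
            return False, 0
        probed = 0
        for table in self.tables:
            probed += 1
            if cid in table:
                return True, probed
        return False, probed  # Bloom false positive: every table was searched

    def is_duplicate_routed(self, chunk: bytes):
        """Verify a Bloom hit by probing only the chunk's home table."""
        cid = chunk_id(chunk)
        if not self.bloom.might_contain(cid):
            return False, 0
        return cid in self.tables[self._home_table(cid)], 1


index = DedupIndex()
for i in range(20):
    index.store(f"chunk-{i}".encode())

print(index.is_duplicate_routed(b"chunk-3"))  # (True, 1)
print(index.is_duplicate_scan(b"chunk-3"))    # True, after one or more probes
```

In this sketch both lookups give the same answer. The difference is the number of tables probed: the scan may touch every table before it confirms a false positive, while the routed lookup touches at most one.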


Pages: 226-230

Page count: 5
