Clustering-based and consistent hashing-aware data placement algorithm

Cited by: 4
Authors
Chen T. [1]
Xiao N. [1]
Liu F. [1]
Fu C.-S. [1]
Affiliation
[1] School of Computer, National University of Defense Technology
Source
Ruan Jian Xue Bao/Journal of Software | 2010, Vol. 21, No. 12
Keywords
Clustering algorithm; Consistent hashing; Data placement; Fair; Self-adaptive
DOI
10.3724/SP.J.1001.2010.03706
Abstract
Large-scale network storage systems face the challenge of distributing data efficiently among storage devices, which calls for a data placement algorithm that is efficient, fair, and adaptive. This paper presents CCHDP (clustering-based and consistent hashing-aware data placement), an algorithm that distributes data over heterogeneous devices in such systems. CCHDP combines a clustering algorithm with consistent hashing and saves considerable memory by avoiding extra virtual devices. Analysis and experiments show that CCHDP not only assigns data evenly among devices but also adapts well to device additions and departures: the amount of data moved when the device set changes is nearly equal to the optimal amount. Moreover, CCHDP is time efficient and has little memory overhead. © by Institute of Software, the Chinese Academy of Sciences. All rights reserved.
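The abstract does not describe how CCHDP works internally, so the sketch below is not CCHDP. It shows only the baseline that the abstract contrasts against: weighted consistent hashing in which a heterogeneous device is represented by many virtual devices on the ring. All names here (`ring_position`, `VirtualNodeRing`, `vnodes_per_unit`, the device names and weights) are hypothetical and chosen for illustration. The sketch makes two points from the abstract concrete: the ring table grows with the number of virtual devices, and adding a device moves only the keys that the new device takes over.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a 64-bit position on the hash ring (stable across runs)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class VirtualNodeRing:
    """Baseline weighted consistent hashing (illustrative, not CCHDP).

    A device of weight w is placed on the ring w * vnodes_per_unit times,
    so memory use grows with the total number of virtual devices.
    """

    def __init__(self, vnodes_per_unit: int = 100):
        self.vnodes_per_unit = vnodes_per_unit
        self.points = []  # sorted list of (ring position, device name)

    def add_device(self, device: str, weight: int) -> None:
        for i in range(weight * self.vnodes_per_unit):
            bisect.insort(self.points, (ring_position(f"{device}#{i}"), device))

    def locate(self, key: str) -> str:
        # First ring point at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self.points, (ring_position(key), ""))
        return self.points[idx % len(self.points)][1]


# Four heterogeneous devices with assumed relative capacities 1, 2, 3, 4.
ring = VirtualNodeRing(vnodes_per_unit=100)
for name, weight in [("d0", 1), ("d1", 2), ("d2", 3), ("d3", 4)]:
    ring.add_device(name, weight)

keys = [f"object-{i}" for i in range(20000)]
before = {k: ring.locate(k) for k in keys}

# Add a fifth device of weight 2 and see which keys change owner.
ring.add_device("d4", 2)
after = {k: ring.locate(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

print("ring entries held in memory:", len(ring.points))
print("fraction of keys moved:", len(moved) / len(keys))
```

In this baseline, every moved key lands on the new device, and the moved fraction should be close to the new device's share of total weight (2 of 12 here), which is the sense in which data movement is near optimal. The cost is the ring table itself, which holds one entry per virtual device; the abstract states that CCHDP avoids this extra memory.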
Pages: 3175-3185
Page count: 10