CUDA-enabled hierarchical ward clustering of protein structures based on the nearest neighbour chain algorithm

Cited by: 5
Authors
Dang, Hoang-Vu [1]
Schmidt, Bertil [1]
Hildebrandt, Andreas [1]
Tran, Tuan Tu [1]
Hildebrandt, Anna Katharina [2]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Inst Informat, Mainz, Germany
[2] Max Planck Inst Informat, Saarbrucken, Germany
Source
INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS | 2016 / Vol. 30 / No. 02
Keywords
CUDA; protein structures; clustering; bioinformatics; protein docking; TOOL;
DOI
10.1177/1094342015597988
Chinese Library Classification
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Clustering of molecular systems according to their three-dimensional structure is an important step in many bioinformatics workflows. In applications such as docking or structure prediction, many algorithms initially generate large numbers of candidate poses (or decoys), which are then clustered to allow for subsequent computationally expensive evaluations of reasonable representatives. Since the number of such candidates can easily range from thousands to millions, performing the clustering on standard central processing units (CPUs) is highly time consuming. In this paper, we analyse and evaluate different approaches to parallelize the nearest neighbour chain algorithm to perform hierarchical Ward clustering of protein structures, using both atom-based root mean square deviation (RMSD) and rigid-body RMSD molecular distances on a graphics processing unit (GPU). This leads to a speedup of around one order of magnitude of our CUDA implementation on a GeForce Titan GPU compared to a multi-threaded CPU implementation on a Core-i7 2700. Furthermore, the runtimes compare favourably with ClusCo, another state-of-the-art CUDA-enabled protein structure clustering method, while achieving similar accuracy on the iTasser benchmark dataset. Our implementation has also been incorporated into the Biochemical Algorithms library to allow easy integration into biologists' workflows.
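The abstract's core technique, the nearest neighbour chain algorithm driving Ward clustering over RMSD distances, can be illustrated with a minimal sequential sketch. This is not the authors' CUDA implementation and does not attempt the GPU parallelization the paper evaluates. The function names `rmsd` and `ward_nn_chain` are hypothetical, poses are assumed to be equal-length lists of `(x, y, z)` tuples already in a common coordinate frame (no superposition is performed), and merge heights follow the common convention of taking the square root of the Lance-Williams squared distance.

```python
import math

def rmsd(pose_a, pose_b):
    """Atom-based RMSD between two poses (lists of (x, y, z) tuples).
    Assumption: both poses share one coordinate frame."""
    sq = sum((p - q) ** 2 for u, v in zip(pose_a, pose_b) for p, q in zip(u, v))
    return math.sqrt(sq / len(pose_a))

def ward_nn_chain(poses):
    """Ward clustering with a nearest neighbour chain.
    Returns a list of (sorted member indices, merge height), ordered by height."""
    n = len(poses)
    # Ward's Lance-Williams update operates on squared distances.
    d2 = [[rmsd(poses[i], poses[j]) ** 2 for j in range(n)] for i in range(n)]
    size = [1] * n
    members = [[i] for i in range(n)]
    active = set(range(n))
    chain, merges = [], []
    while len(active) > 1:
        if not chain:
            chain.append(min(active))
        # Grow the chain until its last two entries are reciprocal nearest neighbours.
        while True:
            top = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None
            best = prev
            best_d = d2[top][prev] if prev is not None else math.inf
            for k in active:
                # Strict '<' keeps the predecessor on ties, so the chain cannot cycle.
                if k != top and d2[top][k] < best_d:
                    best, best_d = k, d2[top][k]
            if best == prev:
                break
            chain.append(best)
        b, a = chain.pop(), chain.pop()
        # Merge a into slot b and update squared distances to every other cluster.
        for k in active - {a, b}:
            total = size[a] + size[b] + size[k]
            d2[b][k] = d2[k][b] = (
                (size[a] + size[k]) * d2[a][k]
                + (size[b] + size[k]) * d2[b][k]
                - size[k] * d2[a][b]
            ) / total
        merged = members[a] + members[b]
        merges.append((sorted(merged), math.sqrt(d2[a][b])))
        members[b] = merged
        size[b] += size[a]
        active.remove(a)
    # The chain finds merges out of height order; sorting restores dendrogram order.
    return sorted(merges, key=lambda m: m[1])

# Four single-atom "poses" on a line: two tight pairs, far apart.
poses = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(5.0, 0.0, 0.0)], [(6.0, 0.0, 0.0)]]
for cluster, height in ward_nn_chain(poses):
    print(cluster, round(height, 3))
```

The sketch stores the full squared-distance matrix, which takes quadratic memory in the number of poses. The rest of the chain is kept after each merge, which is valid because Ward's criterion is reducible: merging two clusters never brings the result closer to a third cluster than both of its parts were.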
Pages: 200-211
Page count: 12
Related papers
21 in total
  • [1] CUDA-enabled hierarchical ward clustering of protein structures based on the nearest neighbour chain algorithm (Reprinted from vol 30, pg 200-211, 2016)
    Hoang-Vu Dang
    Schmidt, Bertil
    Hildebrandt, Andreas
    Tuan Tu Tran
    Hildebrandt, Anna Katharina
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2017, 31 (03): : 181 - +
  • [2] Parallelized Clustering of Protein Structures on CUDA-enabled GPUs
    Hoang-Vu Dang
    Schmidt, Bertil
    Hildebrandt, Andreas
    Hildebrandt, Anna Katharina
    2014 22ND EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2014), 2014, : 1 - 8
  • [3] IMPLEMENTATION OF A COVARIANCE-BASED PRINCIPAL COMPONENT ANALYSIS ALGORITHM WITH A CUDA-ENABLED GRAPHICS PROCESSING UNIT
    Zhang, Jian
    Lim, Kim Hwa
    2011 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2011, : 1759 - 1762
  • [4] A novel approach to text clustering using genetic algorithm based on the nearest neighbour heuristic
    Mustafi D.
    Mustafi A.
    Sahoo G.
    International Journal of Computers and Applications, 2022, 44 (03) : 291 - 303
  • [5] Nearest Neighbor-Clustering Algorithm Based on Hierarchical Optimization Strategy
    Wang Jie
    Jiang Guoqiang
    2008 WORKSHOP ON POWER ELECTRONICS AND INTELLIGENT TRANSPORTATION SYSTEM, PROCEEDINGS, 2008, : 233 - 236
  • [6] Evaluation of journal subject stability based on ward hierarchical clustering algorithm
    Li, X., 1600, Editorial Board of Medical Journal of Wuhan University (37):
  • [7] A multilevel k-nearest neighbour learning algorithm based on k-means clustering
    Ying, Xu
    2007 International Symposium on Computer Science & Technology, Proceedings, 2007, : 250 - 253
  • [8] A Novel Local Density Hierarchical Clustering Algorithm Based on Reverse Nearest Neighbors
    Liu, Yaohui
    Liu, Dong
    Yu, Fang
    Ma, Zhengming
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2019, 2019
  • [9] A multi-relational hierarchical clustering algorithm based on shared nearest neighbor similarity
    Guo, Jing-Feng
    Zhao, Yu-Yan
    Li, Jing
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 3951 - 3955
  • [10] CUDASW++2.0: Enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions
    Liu Y.
    Schmidt B.
    Maskell D.L.
    BMC Research Notes, 3 (1)