A novel hierarchical clustering algorithm for gene sequences

Cited by: 73
Authors
Wei, Dan [1 ,2 ,3 ]
Jiang, Qingshan [1 ]
Wei, Yanjie [1 ]
Wang, Shengrui [4 ]
Affiliations
[1] Chinese Acad Sci, Shenzhen Key Lab High Performance Data Min, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
[2] Xiamen Univ, Dept Cognit Sci, Xiamen, Peoples R China
[3] Xiamen Univ, Fujian Key Lab Brain Intelligent Syst, Xiamen, Peoples R China
[4] Univ Sherbrooke, Dept Comp Sci, Sherbrooke, PQ J1K 2R1, Canada
Source
BMC BIOINFORMATICS | 2012 / Volume 13
Funding
National Natural Science Foundation of China;
Keywords
ALIGNMENT; DNA; PROTEIN; DISSIMILARITY; CLASSIFICATION; FREQUENCIES; SIMILARITY; DISTANCE; ENTROPY; SETS;
DOI
10.1186/1471-2105-13-174
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: Clustering DNA sequences into functional groups is an important problem in bioinformatics. We propose a new alignment-free algorithm, mBKM, based on a new distance measure, DMk, for clustering gene sequences. The method transforms DNA sequences into feature vectors that contain the occurrence, location and order relation of k-tuples in each DNA sequence. A hierarchical procedure is then applied to cluster the DNA sequences based on these feature vectors. Results: The proposed distance measure and clustering method are evaluated by clustering functionally related genes and by phylogenetic analysis. The method is also compared with BlastClust, CD-HIT-EST and several other methods. The experimental results show that our method is effective in classifying DNA sequences with similar biological characteristics and in discovering the underlying relationships among the sequences. Conclusions: We introduced a novel clustering algorithm that is based on a new sequence similarity measure. It is effective in classifying DNA sequences with similar biological characteristics and in discovering the relationships among the sequences.
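As a rough illustration of the alignment-free idea in the abstract, the sketch below turns a DNA sequence into a vector of k-tuple relative frequencies and compares two such vectors. It is only a minimal sketch under stated assumptions: it captures k-tuple occurrence alone, whereas the abstract says the feature vectors also encode location and order relation, and it uses plain Euclidean distance as a stand-in rather than the DMk measure. The names ktuple_profile and euclidean are hypothetical and do not come from the paper.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def ktuple_profile(seq, k=2):
    """Relative frequencies of all 4**k k-tuples, in lexicographic order.

    Assumption: occurrence counts only; the location and order information
    mentioned in the abstract is NOT modelled here.
    """
    keys = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(keys, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases such as N
            counts[word] += 1
    total = sum(counts.values())
    return [counts[key] / total if total else 0.0 for key in keys]


def euclidean(u, v):
    """Plain Euclidean distance, used here as a stand-in (not DMk)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Profiles built this way could be passed to any hierarchical clustering routine; the paper's own procedure, mBKM, is not reproduced here.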
Pages: 15
Related papers
50 in total
  • [31] A Novel Local Density Hierarchical Clustering Algorithm Based on Reverse Nearest Neighbors
    Liu, Yaohui
    Liu, Dong
    Yu, Fang
    Ma, Zhengming
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2019, 2019
  • [32] Hierarchical link clustering algorithm in networks
    Bodlaj, Jernej
    Batagelj, Vladimir
    PHYSICAL REVIEW E, 2015, 91 (06)
  • [33] A novel split-and-merge algorithm for hierarchical clustering of Gaussian mixture models
    Popovic, Branislav
    Janev, Marko
    Pekar, Darko
    Jakovljevic, Niksa
    Gnjatovic, Milan
    Secujski, Milan
    Delic, Vlado
    APPLIED INTELLIGENCE, 2012, 37 (03) : 377 - 389
  • [34] A novel split-and-merge algorithm for hierarchical clustering of Gaussian mixture models
    Branislav Popović
    Marko Janev
    Darko Pekar
    Nikša Jakovljević
    Milan Gnjatović
    Milan Sečujski
    Vlado Delić
    Applied Intelligence, 2012, 37 : 377 - 389
  • [35] Hierarchical clustering algorithm based on granularity
    Liang, Jiuzhen
    Li, Guangbin
    GRC: 2007 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, PROCEEDINGS, 2007, : 429 - 432
  • [36] An adaptive parallel hierarchical clustering algorithm
    Li, Zhaopeng
    Li, Kenli
    Xiao, Degui
    Yang, Lei
    HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, PROCEEDINGS, 2007, 4782 : 97 - 107
  • [37] A TRANSFER ALGORITHM FOR HIERARCHICAL-CLUSTERING
    SCHADER, M
    MATHEMATICAL SOCIAL SCIENCES, 1982, 2 (02) : 189 - 197
  • [38] Dynamic hierarchical compact clustering algorithm
    Gil-García, R
    Badía-Contelles, JM
    Pons-Porrata, A
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS AND APPLICATIONS, PROCEEDINGS, 2005, 3773 : 302 - 310
  • [39] A hierarchical clustering algorithm based on GiST
    Zhou, Bing
    Wang, He-xing
    Wang, Cui-rong
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS: WITH ASPECTS OF CONTEMPORARY INTELLIGENT COMPUTING TECHNIQUES, 2007, 2 : 125 - +
  • [40] A scalable hierarchical algorithm for unsupervised clustering
    Boley, D
    DATA MINING FOR SCIENTIFIC AND ENGINEERING APPLICATIONS, 2001, 2 : 383 - 400