A novel hierarchical clustering algorithm for gene sequences

Cited by: 73
Authors
Wei, Dan [1, 2, 3]
Jiang, Qingshan [1]
Wei, Yanjie [1]
Wang, Shengrui [4]
Affiliations
[1] Chinese Acad Sci, Shenzhen Key Lab High Performance Data Min, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
[2] Xiamen Univ, Dept Cognit Sci, Xiamen, Peoples R China
[3] Xiamen Univ, Fujian Key Lab Brain Intelligent Syst, Xiamen, Peoples R China
[4] Univ Sherbrooke, Dept Comp Sci, Sherbrooke, PQ J1K 2R1, Canada
Source
BMC BIOINFORMATICS | 2012 / Vol. 13
Funding
National Natural Science Foundation of China;
Keywords
ALIGNMENT; DNA; PROTEIN; DISSIMILARITY; CLASSIFICATION; FREQUENCIES; SIMILARITY; DISTANCE; ENTROPY; SETS;
DOI
10.1186/1471-2105-13-174
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Background: Clustering DNA sequences into functional groups is an important problem in bioinformatics. We propose a new alignment-free algorithm, mBKM, based on a new distance measure, DMk, for clustering gene sequences. The method transforms DNA sequences into feature vectors that capture the occurrence, location, and order relation of k-tuples in each sequence. A hierarchical procedure is then applied to cluster the DNA sequences based on these feature vectors. Results: The proposed distance measure and clustering method are evaluated by clustering functionally related genes and by phylogenetic analysis. The method is also compared with BlastClust, CD-HIT-EST, and other methods. The experimental results show that our method is effective in classifying DNA sequences with similar biological characteristics and in discovering the underlying relationships among the sequences. Conclusions: We introduced a novel clustering algorithm based on a new sequence similarity measure. It is effective in classifying DNA sequences with similar biological characteristics and in discovering the relationships among the sequences.
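The general idea in the abstract (alignment-free k-tuple feature vectors followed by a hierarchical grouping step) can be illustrated with a minimal sketch. This is not the authors' method: it uses only k-tuple occurrence frequencies (not location or order relation), plain Euclidean distance as a stand-in for DMk, and ordinary average-linkage agglomeration as a stand-in for mBKM. All function names here are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import product
import math


def ktuple_vector(seq, k=2, alphabet="ACGT"):
    """Relative frequency of every k-tuple over the alphabet (occurrence only)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    tuples = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[t] for t in tuples) or 1  # avoid division by zero
    return [counts[t] / total for t in tuples]


def euclidean(u, v):
    """Euclidean distance; an assumed stand-in, not the paper's DMk measure."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        pair_sum = sum(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)
        return pair_sum / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def cluster_sequences(seqs, k=2, n_clusters=2):
    """Vectorize each sequence by k-tuple frequency, then cluster the vectors."""
    vectors = [ktuple_vector(s, k) for s in seqs]
    return agglomerative(vectors, n_clusters)


if __name__ == "__main__":
    toy = ["ATATATATAT", "TATATATATA", "GCGCGCGCGC", "CGCGCGCGCG"]
    print(cluster_sequences(toy, k=2, n_clusters=2))
```

On the toy input, the AT-rich and GC-rich sequences have very different 2-tuple profiles, so they fall into separate groups. A faithful implementation would need the location and order features and the distance definition given in the paper itself.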
Pages: 15