Alignment-free distance measure based on return time distribution for sequence analysis: Applications to clustering, molecular phylogeny and subtyping

Cited by: 41
Authors
Kolekar, Pandurang [1]
Kale, Mohan [2]
Kulkarni-Kale, Urmila [1]
Affiliations
[1] Univ Pune, Bioinformat Ctr, Pune 411007, Maharashtra, India
[2] Univ Pune, Dept Stat, Pune 411007, Maharashtra, India
Keywords
Return time distribution; Alignment-free method; Molecular phylogeny; Dengue subtyping; Sequence analysis; Bioinformatics; DENGUE VIRUS TYPE-1; NATURAL-POPULATIONS; PROTEIN SEQUENCES; DNA-SEQUENCES; INFERENCE; EVOLUTION; RECOMBINATION; CONSTRUCTION; FLAVIVIRIDAE; SENSITIVITY;
DOI
10.1016/j.ympev.2012.07.003
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
The data deluge in the post-genomic era demands the development of novel data mining tools. Existing molecular phylogeny analyses (MPAs) developed for individual gene/protein sequences are alignment-based. However, the size of genomic data and the uncertainties associated with alignments necessitate the development of alignment-free methods for MPA. Derivation of distances between sequences is an important step in both alignment-dependent and alignment-free methods. Various alignment-free distance measures based on oligonucleotide frequencies, information content, compression techniques, etc. have been proposed. However, these distance measures do not account for the relative order of components, viz. nucleotides or amino acids. A new distance measure based on the concept of the 'return time distribution' (RTD) of k-mers is proposed, which accounts for both the sequence composition and the relative order of its components. Statistical parameters of the RTDs are used to derive a distance function. The resultant distance matrix is used for clustering and phylogeny using Neighbor-joining. Its performance for MPA and subtyping was evaluated using simulated data generated by block-bootstrap, receiver operating characteristics and leave-one-out cross-validation methods. The proposed method was successfully applied to MPA of the family Flaviviridae and subtyping of Dengue viruses. It is observed that the method retains resolution for classification and subtyping of viruses at varying levels of sequence similarity and taxonomic hierarchy. © 2012 Elsevier Inc. All rights reserved.
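The idea of an RTD-based distance can be illustrated with a minimal Python sketch. The abstract does not state which statistical parameters or which distance function are used, so everything concrete here is an assumption made for illustration: the return time is taken as the gap between start positions of successive occurrences of a k-mer, each RTD is summarised by its mean and population standard deviation, k-mers with fewer than two occurrences contribute zeros, and sequences are compared by Euclidean distance. The function names (`return_times`, `rtd_vector`, `rtd_distance`, `distance_matrix`) are hypothetical and not taken from the paper.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev


def return_times(seq, kmer):
    """Gaps between start positions of successive occurrences of kmer in seq."""
    k = len(kmer)
    positions = [i for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer]
    return [b - a for a, b in zip(positions, positions[1:])]


def rtd_vector(seq, k=1, alphabet="ACGT"):
    """Summarise the RTD of every possible k-mer as (mean, standard deviation).

    Assumption: a k-mer that occurs fewer than twice has no return times
    and contributes (0.0, 0.0).
    """
    vec = []
    for letters in product(alphabet, repeat=k):
        rts = return_times(seq, "".join(letters))
        if rts:
            vec.extend([mean(rts), pstdev(rts)])
        else:
            vec.extend([0.0, 0.0])
    return vec


def rtd_distance(seq_a, seq_b, k=1, alphabet="ACGT"):
    """Euclidean distance between RTD summary vectors (an assumed choice)."""
    va = rtd_vector(seq_a, k, alphabet)
    vb = rtd_vector(seq_b, k, alphabet)
    return sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))


def distance_matrix(seqs, k=1, alphabet="ACGT"):
    """Symmetric pairwise distance matrix as a list of lists."""
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = rtd_distance(seqs[i], seqs[j], k, alphabet)
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

A matrix produced this way could then be passed to any Neighbor-joining implementation to build a tree, which is the step the abstract describes after the distance computation. Because the vector is built from gaps between occurrences rather than from counts alone, two sequences with identical composition but different ordering can still receive a non-zero distance.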
Pages: 510-522
Number of pages: 13
Related papers
13 in total
  • [1] An Alignment-free Heuristic for Fast Sequence Comparisons with Applications to Phylogeny Reconstruction
    Pannu, Jodh
    Chockalingam, Sriram P.
    Thankachan, Sharma V.
    Aluru, Srinivas
    ACM-BCB'18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS, 2018, : 540 - 540
  • [2] An alignment-free heuristic for fast sequence comparisons with applications to phylogeny reconstruction
    Chockalingam, Sriram P.
    Pannu, Jodh
    Hooshmand, Sahar
    Thankachan, Sharma V.
    Aluru, Srinivas
    BMC BIOINFORMATICS, 2020, 21 (Suppl 6)
  • [3] An alignment-free heuristic for fast sequence comparisons with applications to phylogeny reconstruction
    Sriram P. Chockalingam
    Jodh Pannu
    Sahar Hooshmand
    Sharma V. Thankachan
    Srinivas Aluru
    BMC Bioinformatics, 21
  • [4] Application of Sequence Alignment-Free Comparison-Based SeqDistK to Microbial Flora Clustering
    Liu X.
    Huang G.
    Huang T.
    Huanan Ligong Daxue Xuebao/Journal of South China University of Technology (Natural Science), 2019, 47 (11): : 71 - 77
  • [5] An alignment-free measure based on physicochemical properties of amino acids for protein sequence comparison
    Zhao, Yunxiu
    Xue, Xiaolong
    Xie, Xiaoli
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2019, 80 : 10 - 15
  • [6] Alignment-free sequence comparison method based on whole genomes and its application to virus phylogeny
    College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
    Unknown
    Tien Tzu Hsueh Pao, 2006, 2 (277-281):
  • [7] WNV Typer: A server for genotyping of West Nile viruses using an alignment-free method based on a return time distribution
    Kolekar, Pandurang
    Hake, Nilesh
    Kale, Mohan
    Kulkarni-Kale, Urmila
    JOURNAL OF VIROLOGICAL METHODS, 2014, 198 : 41 - 55
  • [8] A phylogenetic analysis of the Brassicales clade based on an alignment-free sequence comparison method
    Hatje, Klas
    Kollmar, Martin
    FRONTIERS IN PLANT SCIENCE, 2012, 3
  • [9] Comparative analysis of alignment-free genome clustering and whole genome alignment-based phylogenomic relationship of coronaviruses
    Kirichenko, Anastasiya D.
    Poroshina, Anastasiya A.
    Sherbakov, Dmitry Yu
    Sadovsky, Michael G.
    Krutovsky, Konstantin V.
    PLOS ONE, 2022, 17 (03):
  • [10] Network Subgraph-based Method: Alignment-free Technique for Molecular Network Analysis
    Zaenudin, Efendi
    Wijaya, Ezra B.
    Mekala, Venugopal Reddy
    Ng, Ka-Lok
    CURRENT BIOINFORMATICS, 2024, 19 (08) : 777 - 792