Algorithms for matching partially labelled sequence graphs

Cited by: 0
Authors
Taylor, William R. [1]
Affiliations
[1] Francis Crick Inst, 1 Midland Rd, London NW1 1AT, England
Funding
Wellcome Trust (UK);
Keywords
Phylogenetic tree matching; Correlated substitution analysis; Bipartite graph matching; PROTEIN; CONTACTS; COEVOLUTION; PREDICTION;
DOI
10.1186/s13015-017-0115-y
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: To find correlated pairs of positions between proteins, which are useful in predicting interactions, two large multiple sequence alignments must be concatenated so that each joined pair of sequences corresponds to proteins that interact in their species of origin. When each protein is unique within a species, the species name is sufficient to guide this match. When there are multiple related sequences (paralogs) in each species, the pairing is more difficult. In bacteria, genome co-location is a good guide because interacting proteins tend to lie in a common operon, but in eukaryotes this simple principle is not sufficient.
Results: The methods developed in this paper take sets of paralogs for different proteins found in the same species and pair them according to their evolutionary distance relative to a set of other proteins that are unique and so have a known relationship (singletons). The paralogs constitute a set of unlabelled nodes in a graph, while the singletons are labelled. Two variants were tested: one based on a phylogenetic tree of the sequences (the topology-based method) and a simpler, faster variant based only on the inter-sequence distances (the distance-based method). Over a set of test proteins, both gave good results, with the topology-based method performing slightly better.
Conclusions: The methods developed here still need refinement and augmentation from constraints other than the sequence data alone, such as known interactions from annotation and databases, or non-trivial relationships in genome location. With the ever-growing number of eukaryotic genomes, it is hoped that the methods described here will open a route to using these data with a success equal to that currently attained with bacterial sequences.
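The distance-based idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it assumes, for illustration only, that each paralog is summarised by its distances to the same ordered set of labelled singleton sequences, and that the best pairing is the one-to-one assignment minimising the total absolute difference between these distance profiles. The function names, the cost function, and all numbers are hypothetical.

```python
from itertools import permutations


def profile_cost(profile_a, profile_b):
    """Sum of absolute differences between two distance profiles."""
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


def pair_paralogs(profiles_a, profiles_b):
    """Return (pairing, cost) for the cheapest one-to-one pairing.

    profiles_a, profiles_b: dict mapping a paralog name to its list of
    distances to the same ordered set of singleton (labelled) sequences.
    Exhaustive search over permutations, so only usable for small sets.
    """
    names_a = sorted(profiles_a)
    names_b = sorted(profiles_b)
    if len(names_a) > len(names_b):
        raise ValueError("profiles_a must not be larger than profiles_b")
    best_pairing, best_cost = None, None
    for perm in permutations(names_b, len(names_a)):
        cost = sum(
            profile_cost(profiles_a[a], profiles_b[b])
            for a, b in zip(names_a, perm)
        )
        if best_cost is None or cost < best_cost:
            best_pairing, best_cost = dict(zip(names_a, perm)), cost
    return best_pairing, best_cost


# Made-up distances from each paralog to three singleton sequences.
family_a = {"A1": [0.10, 0.40, 0.90], "A2": [0.70, 0.30, 0.20]}
family_b = {"B1": [0.65, 0.35, 0.25], "B2": [0.15, 0.45, 0.80]}

pairing, cost = pair_paralogs(family_a, family_b)
print(pairing, cost)
```

With these invented numbers the sketch pairs A1 with B2 and A2 with B1, because those profiles are the most similar. For larger paralog sets the exhaustive search would need to be replaced by a polynomial-time bipartite assignment solver.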
Pages: 22