Motif-Aware PRALINE: Improving the alignment of motif regions

Authors
Dijkstra, Maurits [1]
Bawono, Punto [1]
Abeln, Sanne [1]
Feenstra, K. Anton [1]
Fokkink, Wan [1]
Heringa, Jaap [1]
Affiliations
[1] Vrije Univ Amsterdam, Dept Comp Sci, Amsterdam, Netherlands
Keywords
SEQUENCE ALIGNMENT; PARACOCCUS-DENITRIFICANS; MULTIPLE; DATABASE; REDUCTASE;
DOI
10.1371/journal.pcbi.1006547
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Protein or DNA motifs are sequence regions which possess biological importance. These regions are often highly conserved among homologous sequences. The generation of multiple sequence alignments (MSAs) with a correct alignment of the conserved sequence motifs is still difficult to achieve, because the contribution of these typically short fragments is overshadowed by the rest of the sequence. Here we extended the PRALINE multiple sequence alignment program with a novel motif-aware MSA algorithm in order to address this shortcoming. This method can incorporate explicit information about the presence of externally provided sequence motifs, which is then used in the dynamic programming step by boosting the amino acid substitution matrix towards the motif. The strength of the boost is controlled by a parameter, α. Using a benchmark set of alignments we confirm that a good compromise can be found that improves the matching of motif regions while not significantly reducing the overall alignment quality. By estimating α on an unrelated set of reference alignments we find there is indeed a strong conservation signal for motifs. A number of typical but difficult MSA use cases are explored to exemplify the problems in correctly aligning functional sequence motifs and how the motif-aware alignment method can be employed to alleviate these problems.
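
The abstract describes boosting substitution scores at motif positions during dynamic programming, with a single parameter setting the strength of the boost. The sketch below illustrates that general idea for a pairwise global alignment. It is not PRALINE's scoring function: the additive form of the boost, the function name `motif_aware_score`, the boolean motif masks, and all default scores are assumptions chosen for illustration.

```python
def motif_aware_score(seq_a, seq_b, motif_a, motif_b, alpha=0.0,
                      match=1.0, mismatch=-1.0, gap=-2.0, motif_bonus=1.0):
    """Global (Needleman-Wunsch) alignment score with a motif boost.

    motif_a / motif_b are per-position booleans marking externally
    provided motif annotations (hypothetical input format).
    alpha scales the extra score given when two motif positions pair up;
    alpha = 0 reduces to an ordinary alignment.
    """
    n, m = len(seq_a), len(seq_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]

    # Leading gaps: aligning a prefix against nothing costs one gap per residue.
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Base substitution score (a toy match/mismatch scheme here,
            # standing in for a real substitution matrix).
            s = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            # Assumed additive boost when both positions lie in a motif.
            if motif_a[i - 1] and motif_b[j - 1]:
                s += alpha * motif_bonus
            dp[i][j] = max(dp[i - 1][j - 1] + s,   # pair the two residues
                           dp[i - 1][j] + gap,     # gap in seq_b
                           dp[i][j - 1] + gap)     # gap in seq_a
    return dp[n][m]


if __name__ == "__main__":
    mask = [True, True, True, True]
    print(motif_aware_score("ACGT", "ACGT", mask, mask, alpha=0.0))
    print(motif_aware_score("ACGT", "ACGT", mask, mask, alpha=2.0))
```

With `alpha = 0` the motif annotation has no effect; raising `alpha` makes pairings between annotated positions progressively more attractive relative to the rest of the sequence, which is the trade-off the abstract says must be balanced against overall alignment quality.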
Pages: 19