Motif models for RNA-binding proteins
Cited by: 14
Authors:
Sasse, Alexander [1]
Laverty, Kaitlin U. [1]
Hughes, Timothy R. [1,2,3]
Morris, Quaid D. [1,2,4]
Affiliations:
[1] Univ Toronto, Dept Mol Genet, 100 Coll St, Toronto, ON M5S 1A8, Canada
[2] Univ Toronto, Donnelly Ctr, Toronto, ON M5S 3E1, Canada
[3] MaRS Ctr, Canadian Inst Adv Res, West Tower,661 Univ Ave,Suite 505, Toronto, ON M5G 1M1, Canada
[4] Univ Toronto, Dept Comp Sci, Toronto, ON M5T 3A1, Canada
Keywords:
SEQUENCE-STRUCTURE MOTIFS;
REGULATORY SEQUENCE;
WEB SERVER;
IDENTIFICATION;
PREFERENCES;
SEQ;
SPECIFICITIES;
PATTERNS;
DOI:
10.1016/j.sbi.2018.08.001
Chinese Library Classification (CLC) number:
Q5 [Biochemistry];
Q7 [Molecular Biology];
Subject classification code:
071010 ;
081704 ;
Abstract:
Identifying the binding preferences of RNA-binding proteins (RBPs) is important for understanding their contribution to post-transcriptional regulation. Here, we review the current state of the art of RNA motif identification tools for RBPs. New in vivo and in vitro data sets provide sufficient statistical power to enable detection of relatively long and complex sequence and sequence-structure binding preferences, and recent computational methods are geared towards quantitative identification of these patterns. We classify methods by their motif model's representational power and describe the underlying considerations for RNA-protein interactions. All classical motif identification algorithms apply physically motivated architectures consisting of a motif and an occupancy model; we call these explicit motif models. Recent methods, such as convolutional neural networks and support vector machines, abandon the classical architecture and implicitly model RNA binding without defining a motif model. Although they achieve high accuracy on held-out data, they may be unsuitable for the ultimate goal of the field: using motifs trained on in vitro data to predict in vivo binding sites. For this task, methods need to separate intrinsic binding preferences from cellular effects arising from protein and RNA concentrations, cooperativity, and competition. To tackle this problem, we advocate for the use of a 'three-layer' architecture, consisting of a motif model, an occupancy model, and an extrinsic factor model, which enables separation of these effects and adjustment to cellular conditions.
Pages: 115-123
Number of pages: 9