Prediction of Motifs Based on a Repeated-Measures Model for Integrating Cross-Species Sequence and Expression Data

Cited by: 3
Authors
Siewert, Elizabeth A. [1]
Kechris, Katerina J. [1]
Affiliations
[1] Univ Colorado, Denver, CO 80202 USA
Keywords
GENE-EXPRESSION; REGULATORY ELEMENTS; DISCOVERY; NETWORKS; CONSERVATION; COEXPRESSION; EVOLUTION; SELECTION; PROFILES; PATTERNS;
DOI
10.2202/1544-6115.1464
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
De novo identification of transcription factor binding sites (TFBSs) is a challenging computational problem because TFBSs are relatively short sequences buried in long genomic regions. Earlier methods incorporated genome-wide expression data and promoter sequences into a linear-model framework, regressing expression on counts of putative TFBSs in promoters for a single species. More recently, it has been shown that examining sequence data across multiple species improves the prediction of TFBSs. In this work, we describe an extension of the single-species, linear-model framework for the analysis of paired cross-species sequence and expression data. A repeated-measures model for gene-expression measurements across species is used, accounting for phylogenetic relationships among species through the error covariance structure. This multiple-species algorithm is applied to a data set of four yeast species grown under heat-shock conditions, and comparisons are made to the single-species algorithm. Using evaluations based on transcription factor binding strength and an independent source of expression data, we find that the multiple-species results show an improvement in the prediction of TFBSs.
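The regression idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `count_motif` and `gls_fit`, the single-motif model with one intercept and one slope, and the species covariance matrix passed in by the caller are all assumptions made for illustration. Expression for each gene is regressed on the motif count in its promoter, with the measurements of one gene across species treated as correlated repeated measures and fitted by generalized least squares.

```python
import numpy as np


def count_motif(seq, motif):
    """Count occurrences of `motif` in `seq`, allowing overlaps."""
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


def gls_fit(counts, expr, species_cov):
    """Fit expr[g, s] = b0 + b1 * counts[g, s] + e[g, s] by generalized least squares.

    counts, expr : arrays of shape (n_genes, n_species)
    species_cov  : (n_species, n_species) covariance of the errors of one gene
                   across species (hypothetical; in the paper's setting it would
                   reflect the phylogeny). Genes are assumed independent.
    Returns the estimate [b0, b1].
    """
    counts = np.asarray(counts, dtype=float)
    expr = np.asarray(expr, dtype=float)
    n_genes, n_species = counts.shape
    weight = np.linalg.inv(species_cov)  # precision matrix shared by all genes

    xtwx = np.zeros((2, 2))
    xtwy = np.zeros(2)
    for g in range(n_genes):
        # Design matrix for gene g: intercept column plus motif counts per species.
        x = np.column_stack([np.ones(n_species), counts[g]])
        xtwx += x.T @ weight @ x
        xtwy += x.T @ weight @ expr[g]
    return np.linalg.solve(xtwx, xtwy)
```

With `species_cov` set to the identity matrix, the fit reduces to ordinary least squares on the pooled data, which corresponds to ignoring the relationships among species.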
Pages: 36
Related papers
50 in total
  • [21] Using a random forest model for cross-species prediction of crop arsenic contamination
    Zhu, Qiaohui
    Luo, Jie
    Cao, Min
    Feng, Siyao
    Jia, Heran
    ENVIRONMENTAL AND ECOLOGICAL STATISTICS, 2025,
  • [22] Cross-species gene expression analysis for gene-based immunoprofiling
    Banno, Eri
    Kura, Yurie
    Sakai, Kazuko
    Fujita, Yoshihiko
    Nozawa, Masahiro
    Yoshikawa, Kazuhiro
    Nishio, Kazuto
    De Velasco, Marco A.
    Uemura, Hirotsugu
    CANCER SCIENCE, 2021, 112 : 521 - 521
  • [23] Time Series Data Prediction Based on Sequence to Sequence Model
    Yang, Chao
    Guo, Zhongwen
    Xian, Lintao
    2019 5TH INTERNATIONAL CONFERENCE ON MECHANICAL ENGINEERING AND AUTOMATION SCIENCE (ICMEAS 2019), 2019, 692
  • [24] Cross-species protein sequence and gene structure prediction with fine-tuned Webscipio 2.0 and Scipio
    Hatje K.
    Keller O.
    Hammesfahr B.
    Pillmann H.
    Waack S.
    Kollmar M.
    BMC Research Notes, 4 (1)
  • [25] A deep RNA sequencing study of mammalian sperm RNA: identifying common cross-species expression motifs indicating functionality
    Nadj, S.
    Miller, D.
    HUMAN REPRODUCTION, 2014, 29 : 287 - 287
  • [26] ConSite: web-based prediction of regulatory elements using cross-species comparison
    Sandelin, A
    Wasserman, WW
    Lenhard, B
    NUCLEIC ACIDS RESEARCH, 2004, 32 : W249 - W252
  • [27] Endomebase: A model cross-species tissue-specific gene and protein expression database
    Bradshaw, PC
    Springer, GK
    Young, SL
    AMIA 2002 SYMPOSIUM, PROCEEDINGS: BIOMEDICAL INFORMATICS: ONE DISCIPLINE, 2002, : 982 - 982
  • [28] ConFindr: rapid detection of intraspecies and cross-species contamination in bacterial whole-genome sequence data
    Low, Andrew J.
    Koziol, Adam G.
    Manninger, Paul A.
    Blais, Burton
    Carrillo, Catherine D.
    PEERJ, 2019, 7
  • [29] CycleGAN based confusion model for cross-species plant disease image migration
    Cui, Xiaohui
    Ying, Yongzhi
    Chen, Zhibo
    Chen, Zhibo (zhibo@bjfu.edu.cn), 1600, IOS Press BV (41): : 6685 - 6696
  • [30] Estimation of Cross-Species Introgression Rates Using Genomic Data Despite Model Unidentifiability
    Yang, Ziheng
    Flouri, Tomas
    MOLECULAR BIOLOGY AND EVOLUTION, 2022, 39 (05)