Data structures for genome annotation, alternative splicing, and validation

Authors
Mielordt, Sven [1]
Grosse, Ivo
Kleffe, Juergen
Affiliations
[1] Leibniz Inst Plant Genet & Crop Plant Res, IPK, D-06466 Gatersleben, Germany
[2] Charite Univ Med Berlin, UND Bioinformat, Inst Mol Biol, D-14195 Berlin, Germany
Keywords
gene and genome annotation; alternative splicing; data integration; splice template; validation and confirmation; quality control; Fasta-XML format
DOI
Not available
Chinese Library Classification number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
To establish a clean basis for studying alternative splicing and gene regulation in life science projects, a powerful data model and a strict validation procedure for assigning levels of reliability to given gene models are essential. One common problem of public genome databases is insufficiently organized and linked descriptive data, which makes it difficult to study the relations among the alternative isoforms of a gene that are relevant for medicine and plant genome research. This is a severe obstacle to the integration of biological data and motivated us to establish a new modeling instance that we call a splice template, or sTMP. Every sTMP has a unique splicing pattern, but the lengths of its first and last exons remain undefined. This makes it possible to model different gene isoforms with the same splicing pattern. By utilizing this more fine-grained data structure, many cases of plurivalent mRNA-CDS relations are uncovered. There are more than 3,000 extra CDSs in the human genome compatible with the categories sTMP, mRNA and CDS, which exceed the classical one-to-one relations of mRNAs and CDSs. In one case, 11 extra CDSs are compatible with one mRNA. Crosslinks between mRNAs that are derived from different sTMPs but lead to the same CDS are now accessible, as are disease-related ruptures in UTR regions. This allows discovering and validating disease-specific and tissue-specific differences in alternative splicing, gene expression and regulation. Another problem in public databases is an overly relaxed standard for labeling genes as "confirmed by ESTs and full-length cDNAs." We provide a pipeline that handles gene annotations from different sources, integrates them into complex gene models and assigns strict validation tags, constrained by a local low-error model for the alignments of genome annotation and transcripts.
The data structures are being implemented and made publicly available at the Plant Data Warehouse of the Bioinformatics Center Gatersleben-Halle (http://portal.bic-gh.de/sTMP).
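The splice template idea from the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a splicing pattern can be represented by the ordered intron boundaries of a transcript, so that the outer ends of the first and last exons play no role. The names `splice_template` and `group_by_template`, the coordinate convention, and the example transcripts are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Assumption: an exon is a (start, end) pair of inclusive coordinates on one strand.
Exon = Tuple[int, int]
# Assumption: a splice template is the ordered tuple of (donor, acceptor) boundaries,
# i.e. (end of exon i, start of exon i+1) for each pair of consecutive exons.
SpliceTemplate = Tuple[Tuple[int, int], ...]


def splice_template(exons: Sequence[Exon]) -> SpliceTemplate:
    """Reduce a transcript to its splicing pattern.

    The start of the first exon and the end of the last exon are dropped,
    so their lengths stay undefined, as described for the sTMP.
    A single-exon transcript yields an empty template.
    """
    ordered = sorted(exons)
    return tuple((left[1], right[0]) for left, right in zip(ordered, ordered[1:]))


def group_by_template(
    transcripts: Dict[str, Sequence[Exon]]
) -> Dict[SpliceTemplate, List[str]]:
    """Group transcript names that share the same splicing pattern."""
    groups: Dict[SpliceTemplate, List[str]] = defaultdict(list)
    for name, exons in transcripts.items():
        groups[splice_template(exons)].append(name)
    return dict(groups)


# Hypothetical example data, not taken from any database.
transcripts = {
    "mRNA_a": [(100, 200), (300, 400), (500, 650)],
    "mRNA_b": [(150, 200), (300, 400), (500, 600)],  # same introns, different outer ends
    "mRNA_c": [(100, 200), (500, 650)],              # skips the middle exon
}
groups = group_by_template(transcripts)
```

In this sketch `mRNA_a` and `mRNA_b` fall under the same template because they differ only in the outer ends of their terminal exons, while `mRNA_c` gets a template of its own. Linking each mRNA to one or more CDSs, as the abstract describes, would need an additional mapping that is left out here.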
Pages: 114-123
Number of pages: 10