Data structures for genome annotation, alternative splicing, and validation

Cited by: 0
Authors
Mielordt, Sven [1]
Grosse, Ivo
Kleffe, Juergen
Affiliations
[1] Leibniz Inst Plant Genet & Crop Plant Res, IPK, D-06466 Gatersleben, Germany
[2] Charite Univ Med Berlin, UND Bioinformat, Inst Mol Biol, D-14195 Berlin, Germany
Keywords
gene and genome annotation; alternative splicing; data integration; splice template; validation and confirmation; quality control; Fasta-XML format
DOI
Not available
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
To establish a clean basis for studying alternative splicing and gene regulation in life science projects, powerful data modeling and a strict validation procedure for assigning levels of reliability to given gene models are essential. One common problem of public genome databases is insufficiently organized and linked description data, which makes it difficult to study relations among the alternative isoforms of a gene that are relevant for medicine and plant genome research. This is a severe obstacle to the integration of biological data and motivated us to establish a new modeling instance that we call a splice template, or sTMP. Every sTMP has a unique splicing pattern, but the lengths of its first and last exons remain undefined. This makes it possible to model different gene isoforms with the same splicing pattern. By utilizing this more fine-grained data structure, many cases of plurivalent mRNA-CDS relations are uncovered. There are more than 3,000 extra CDSs in the human genome compatible with the categories sTMP, mRNA and CDS, which exceed the classical one-to-one relations of mRNAs and CDSs. In one case, 11 extra CDSs are compatible with one mRNA. Crosslinks between mRNAs derived from different sTMPs leading to the same CDS are now accessible, as are disease-related ruptures in UTR regions. This allows discovering and validating disease- and tissue-specific differences in alternative splicing, gene expression and regulation. Another problem in public databases is an overly relaxed standard for labeling genes "confirmed by ESTs and full-length-cDNAs." We provide a pipeline that handles gene annotations from different sources, integrates them into complex gene models and assigns strict validation tags, constrained by a local low-error model for the alignments of genome annotation and transcripts.
The data structures are being implemented and made publicly available at the Plant Data Warehouse of the Bioinformatics Center Gatersleben-Halle (http://portal.bic-gh.de/sTMP).
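The splice template idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `SpliceTemplate` and `Transcript`, the helper functions, the 1-based inclusive coordinates, and the example transcripts are all assumptions made for illustration. The sketch represents a splicing pattern as the ordered intron boundaries of a transcript, which leaves the outer ends of the first and last exon out of the key, so transcripts that differ only in those ends map to the same template.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# (start, end) in 1-based inclusive genomic coordinates (an assumed convention).
Exon = Tuple[int, int]


@dataclass(frozen=True)
class SpliceTemplate:
    """Hypothetical sTMP key: intron boundaries only, outer exon ends omitted."""
    seqid: str
    strand: str
    introns: Tuple[Tuple[int, int], ...]


@dataclass(frozen=True)
class Transcript:
    """Hypothetical mRNA record with its exon coordinates."""
    name: str
    seqid: str
    strand: str
    exons: Tuple[Exon, ...]


def splice_template(tx: Transcript) -> SpliceTemplate:
    """Derive the splicing pattern from the gaps between consecutive exons."""
    exons = sorted(tx.exons)
    introns = tuple(
        (left[1] + 1, right[0] - 1) for left, right in zip(exons, exons[1:])
    )
    return SpliceTemplate(tx.seqid, tx.strand, introns)


def group_by_template(
    transcripts: Iterable[Transcript],
) -> Dict[SpliceTemplate, List[str]]:
    """Collect transcript names under the template they share."""
    groups: Dict[SpliceTemplate, List[str]] = defaultdict(list)
    for tx in transcripts:
        groups[splice_template(tx)].append(tx.name)
    return dict(groups)


# Invented example data: a and b differ only in their outer exon ends,
# while c uses a different acceptor site for its second exon.
a = Transcript("mRNA_a", "chr1", "+", ((100, 200), (300, 400), (500, 650)))
b = Transcript("mRNA_b", "chr1", "+", ((150, 200), (300, 400), (500, 600)))
c = Transcript("mRNA_c", "chr1", "+", ((100, 200), (320, 400), (500, 650)))
groups = group_by_template([a, b, c])
```

In this sketch `mRNA_a` and `mRNA_b` fall under one template and `mRNA_c` under another. A single-exon transcript would get an empty intron tuple, which is a choice of this sketch; the abstract does not say how such cases are handled.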
Pages: 114-123
Number of pages: 10