YeastHub: a semantic web use case for integrating data in the life sciences domain

Cited by: 59
Authors
Cheung, KH [1]
Yip, KY
Smith, A
deKnikker, R
Masiar, A
Gerstein, M
Affiliations
[1] Yale Univ, Ctr Med Informat, New Haven, CT 06520 USA
[2] Yale Univ, Dept Anesthesiol, New Haven, CT 06520 USA
[3] Yale Univ, Dept Genet, New Haven, CT 06520 USA
[4] Yale Univ, Dept Comp Sci, New Haven, CT 06520 USA
[5] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
DOI
10.1093/bioinformatics/bti1026
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Motivation: As semantic web technology matures and the need for life sciences data integration over the web grows, it is important to explore how the semantic web can address data integration needs. The main problem in data integration is the lack of widely accepted standards for expressing the syntax and semantics of the data. We address this problem by exploring the use of semantic web technologies, including the Resource Description Framework (RDF), RDF Site Summary (RSS), relational-database-to-RDF mapping (D2RQ) and a native RDF data repository, to represent, store and query both metadata and data across life sciences datasets.
Results: As many biological datasets are presently available in tabular format, we introduce an RDF structure into which they can be converted. We also develop a prototype web-based application called YeastHub that demonstrates how a life sciences data warehouse can be built using a native RDF data store (Sesame). This data warehouse allows integration of different types of yeast genome data provided by different resources in different formats, including tabular and RDF formats. Once the data are loaded into the data warehouse, RDF-based queries can be formulated to retrieve and query the data in an integrated fashion.
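The tabular-to-RDF idea in the abstract can be illustrated with a minimal sketch. It is not the RDF structure defined in the paper, and it uses neither Sesame nor D2RQ: it only shows, with the Python standard library, how rows from two tables could become subject-predicate-object triples and then be queried together. The namespace `http://example.org/yeast/`, the column names, and the `ORF1`..`ORF3` rows are all hypothetical toy values.

```python
import csv
import io

# Hypothetical namespace, chosen only for this illustration.
BASE = "http://example.org/yeast/"


def table_to_triples(tsv_text, id_column, base=BASE):
    """Convert a tab-delimited table into (subject, predicate, object) triples.

    The value in `id_column` names the subject of each row; every other
    non-empty column becomes one predicate/object pair.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    triples = []
    for row in reader:
        subject = base + "resource/" + row[id_column]
        for column, value in row.items():
            if column is None or column == id_column or not value:
                continue
            triples.append((subject, base + "property/" + column, value))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples lines with plain string literals."""
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


# Toy tables standing in for two independently provided datasets.
ESSENTIALITY_TSV = "orf\tessential\nORF1\tyes\nORF2\tno\nORF3\tyes\n"
LOCALIZATION_TSV = "orf\tlocation\nORF1\tnucleus\nORF2\tnucleus\nORF3\tcytoplasm\n"

# Loading both tables into one triple list plays the role of the warehouse.
store = (table_to_triples(ESSENTIALITY_TSV, "orf")
         + table_to_triples(LOCALIZATION_TSV, "orf"))

# Integrated query: ORFs that are essential AND located in the nucleus.
essential = {s for s, _, _ in match(store, p=BASE + "property/essential", o="yes")}
nuclear = {s for s, _, _ in match(store, p=BASE + "property/location", o="nucleus")}
result = sorted(essential & nuclear)
```

Because both tables use the same identifier to build subject URIs, their triples line up on the same resource. This is what lets a single query span the two sources. A real deployment would load the serialized RDF into a dedicated RDF store and use its query language instead of the in-memory `match` helper.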
Pages: I85-I96
Page count: 12