Haplotype-based linkage disequilibrium mapping via direct data mining

Cited by: 33
Authors
Li, J [1]
Jiang, T
Affiliations
[1] Case Western Reserve Univ, Dept Elect Engn & Comp Sci, Cleveland, OH 44106 USA
[2] Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92521 USA
[3] Tsinghua Univ, Ctr Adv Study, Beijing 100084, Peoples R China
[4] Shanghai Ctr Bioinformat Technol, Shanghai, Peoples R China
Funding
US National Science Foundation
Keywords
DOI
10.1093/bioinformatics/bti732
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: With the availability of large-scale, high-density single-nucleotide polymorphism markers and information on haplotype structures and frequencies, a great challenge is how to take advantage of haplotype information in the association mapping of complex diseases in case-control studies.

Results: We present a novel approach for association mapping based on directly mining haplotypes (i.e. phased genotype pairs) produced from case-control data or case-parent data via a density-based clustering algorithm, which can be applied to whole-genome screens as well as candidate-gene studies in small genomic regions. The method directly explores the sharing of haplotype segments in affected individuals that are rarely present in normal individuals. The measure of sharing between two haplotypes is defined by a new similarity metric that combines the length of the shared segments and the number of common alleles around any marker position of the haplotypes, which is robust against recent mutations/genotype errors and recombination events. The effectiveness of the approach is demonstrated by using both simulated datasets and real datasets. The results show that the algorithm is accurate for different population models and for different disease models, even for genes with small effects, and it outperforms some recently developed methods.
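
The abstract describes the similarity metric only in words, so the following is a minimal illustrative sketch and not the authors' published definition. It assumes haplotypes are equal-length strings of allele codes. The function names, the `window` size, and the linear `weight` combination are all hypothetical choices made for illustration.

```python
def shared_segment_length(h1: str, h2: str, pos: int) -> int:
    """Length of the contiguous run of identical alleles that contains pos.

    Returns 0 when the two haplotypes differ at pos itself.
    """
    if h1[pos] != h2[pos]:
        return 0
    left = pos
    while left > 0 and h1[left - 1] == h2[left - 1]:
        left -= 1
    right = pos
    while right < len(h1) - 1 and h1[right + 1] == h2[right + 1]:
        right += 1
    return right - left + 1


def common_alleles(h1: str, h2: str, pos: int, window: int) -> int:
    """Number of matching alleles within `window` markers on either side of pos."""
    lo = max(0, pos - window)
    hi = min(len(h1), pos + window + 1)
    return sum(1 for i in range(lo, hi) if h1[i] == h2[i])


def similarity(h1: str, h2: str, pos: int, window: int = 3, weight: float = 0.5) -> float:
    """Hypothetical score: weighted sum of the two components (an assumption)."""
    return (weight * shared_segment_length(h1, h2, pos)
            + (1 - weight) * common_alleles(h1, h2, pos, window))


# Two haplotypes that differ at a single marker (index 4),
# as might happen after a recent mutation or a genotyping error.
a = "0110100"
b = "0110000"
print(similarity(a, b, 2))  # shared run covers indices 0..3
print(similarity(a, b, 4))  # no shared run at the mismatch itself
```

In this sketch the common-allele count still credits the matching markers around a single mismatch, which is one way to read the abstract's claim of robustness to mutations and genotype errors. Scores of this kind could then be passed to a density-based clustering routine.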
Pages: 4384-4393
Page count: 10