A 2-phased approach for detecting multiple loci associations with traits

Cited by: 1
Authors
Lee, Sunwon [1]
Kang, Jaewoo [1]
Oh, Junho [1]
Affiliations
[1] Korea Univ, Coll Informat & Commun, Seoul 136713, South Korea
Funding
National Research Foundation, Singapore;
Keywords
TF-IDF; term frequency - inverse document frequency; class association rule mining; GWAS; SNP; bioinformatics; apriori algorithm; data mining; GENOME-WIDE ASSOCIATION; INFERENCE; SNPS;
DOI
10.1504/IJDMB.2012.049318
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Recent advances in SNP genotyping have substantially reduced the cost of large-scale genotyping and, in turn, dramatically increased the size of SNP genotype data sets. This growth in data volume poses a major obstacle to conventional analysis techniques, which are typically vulnerable to the high-dimensionality problem. To address the issue, we propose a method that exploits two well-tested models: the document-term model and the transaction analysis model. The proposed method consists of two phases. In the first phase, we reduce the dimensionality of the SNP genotype data by transforming the data according to the document-term model and extracting the significant SNPs. In the second phase, we apply transactional analysis to the reduced-dimension genotype data to discover association rules that capture the relations between SNPs and traits. We validated the discovered rules through a literature survey. Experiments on the HGDP panel data provided by the Foundation Jean Dausset-CEPH demonstrate the validity of the method for achieving appropriate dimensionality reduction and for identifying associations between multiple SNPs and traits. This paper is an extended version of our workshop paper presented at the 2010 International Workshop on Data Mining for High Throughput Data from Genome-Wide Association Studies.
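The two phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how documents and terms are defined, which TF-IDF variant is used, or how rules are scored. The sketch assumes that each trait class acts as a "document", that each (SNP, genotype) pair acts as a "term", and that rules of the form "genotype set -> trait" are filtered by support and confidence in a level-wise, Apriori-style search. The toy data, the function names `tfidf_select` and `mine_class_rules`, and all thresholds are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import log

# Toy data (hypothetical): each individual is (trait label, {snp: genotype}).
SAMPLES = [
    ("A", {"rs1": "AA", "rs2": "CT", "rs3": "GG"}),
    ("A", {"rs1": "AA", "rs2": "CT", "rs3": "GT"}),
    ("A", {"rs1": "AA", "rs2": "CC", "rs3": "GG"}),
    ("B", {"rs1": "AG", "rs2": "CC", "rs3": "GG"}),
    ("B", {"rs1": "GG", "rs2": "CC", "rs3": "GT"}),
    ("B", {"rs1": "AG", "rs2": "CC", "rs3": "GG"}),
]


def tfidf_select(samples, top_k):
    """Phase 1 (assumed form): rank SNPs by their best TF-IDF score.

    Document = trait class, term = (snp, genotype) pair.
    """
    class_sizes = Counter(label for label, _ in samples)
    counts = defaultdict(Counter)  # label -> Counter of (snp, genotype)
    for label, genotypes in samples:
        counts[label].update(genotypes.items())
    n_docs = len(counts)
    # Number of classes in which each (snp, genotype) term occurs.
    doc_freq = Counter(term for c in counts.values() for term in c)
    snp_score = defaultdict(float)
    for label, c in counts.items():
        for (snp, genotype), n in c.items():
            tf = n / class_sizes[label]
            idf = log(n_docs / doc_freq[(snp, genotype)])
            snp_score[snp] = max(snp_score[snp], tf * idf)
    # Highest score first; ties broken by SNP name for determinism.
    ranked = sorted(snp_score, key=lambda s: (-snp_score[s], s))
    return ranked[:top_k]


def mine_class_rules(samples, snps, min_support, min_confidence, max_len=2):
    """Phase 2 (assumed form): level-wise search for rules itemset -> trait."""
    n = len(samples)
    transactions = [
        (label, frozenset((s, g[s]) for s in snps if s in g))
        for label, g in samples
    ]
    items = sorted({item for _, t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    rules = []
    for size in range(1, max_len + 1):
        survivors = []
        for cand in candidates:
            labels = Counter(lab for lab, t in transactions if cand <= t)
            covered = sum(labels.values())
            if covered / n < min_support:
                continue  # Apriori pruning: supersets cannot be frequent
            survivors.append(cand)
            for label, hits in labels.items():
                support, confidence = hits / n, hits / covered
                if support >= min_support and confidence >= min_confidence:
                    rules.append(
                        (tuple(sorted(cand)), label, support, confidence)
                    )
        # Join surviving itemsets into candidates one item larger.
        candidates = {
            a | b for a, b in combinations(survivors, 2)
            if len(a | b) == size + 1
        }
    return sorted(rules)


selected = tfidf_select(SAMPLES, top_k=2)
for antecedent, trait, support, confidence in mine_class_rules(
    SAMPLES, selected, min_support=0.3, min_confidence=0.9
):
    print(antecedent, "->", trait, round(support, 2), round(confidence, 2))
```

In this toy setting, a SNP whose genotypes are spread evenly across classes receives an IDF of zero and is dropped before rule mining, which is the sense in which the first phase reduces the search space for the second.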
Pages: 535-556
Number of pages: 22
Related papers
50 in total
  • [41] Multiple sclerosis and personality traits: associations with depression and anxiety
    Saeed Vaheb
    Yousef Mokary
    Mohammad Yazdan Panah
    Aysa Shaygannejad
    Alireza Afshari-Safavi
    Majid Ghasemi
    Vahid Shaygannejad
    Elham Moases Ghaffary
    Omid Mirmosayyeb
    European Journal of Medical Research, 29
  • [43] THE ELICITATION AND SIGNAL-TRANSDUCTION PATHWAYS INVOLVED IN THE 2-PHASED ACTIVE OXYGEN RESPONSE DURING PLANT BACTERIA INTERACTIONS
    ORLANDI, EW
    MOCK, NM
    BAKER, CJ
    JOURNAL OF CELLULAR BIOCHEMISTRY, 1995, : 489 - 489
  • [44] A combinatorial searching method for detecting a set of interacting loci associated with complex traits.
    Zhang, S
    Sha, Q
    Cooper, R
    Zhu, X
    AMERICAN JOURNAL OF HUMAN GENETICS, 2003, 73 (05) : 610 - 610
  • [45] A hierarchical Bayesian approach for detecting global microbiome associations
    Hatami, Farhad
    Beamish, Emma
    Davies, Albert
    Rigby, Rachael
    Dondelinger, Frank
    STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2021, 20 (03) : 85 - 100
  • [46] Quantitative trait loci for water-soluble carbohydrates and associations with agronomic traits in wheat
    Rebetzke, G. J.
    van Herwaarden, A. F.
    Jenkins, C.
    Weiss, M.
    Lewis, D.
    Ruuska, S.
    Tabe, L.
    Fettell, N. A.
    Richards, R. A.
    AUSTRALIAN JOURNAL OF AGRICULTURAL RESEARCH, 2008, 59 (10) : 891 - 905
  • [47] Quantitative Trait Loci and Antagonistic Associations for Two Developmentally Related Traits in the Drosophila Head
    Norry, Fabian M.
    Gomez, Federico H.
    JOURNAL OF INSECT SCIENCE, 2017, 17
  • [48] Toward Multiple SNP Motif Analyses of Loci Associated With Phenotypic Traits
    Gallo, Juan E.
    Misas, Elizabeth
    McEwen, Juan G.
    Clay, Oliver K.
    JOURNAL OF THE AMERICAN COLLEGE OF CARDIOLOGY, 2017, 70 (12) : 1539 - 1540
  • [49] 2-PHASED CONTROL PROGRAM DESIGNED FOR MAXIMUM SUPPRESSION OF BOLL WEEVIL COLEOPTERA-CURCULIONIDAE IN HIGH AND ROLLING PLAINS OF TEXAS
    RUMMEL, DR
    ADKISSON, PL
    JOURNAL OF ECONOMIC ENTOMOLOGY, 1971, 64 (04) : 919 - &
  • [50] A novel computational approach for detecting epistasis in a simulation model of multiple loci using discordant siblings: A proposal for the study of schizophrenia
    Estrada, JK
    Nicolini, H
    Vallejo, E
    Fresan, A
    De la Fuente, C
    Meyemberg, N
    AMERICAN JOURNAL OF MEDICAL GENETICS PART B-NEUROPSYCHIATRIC GENETICS, 2004, 130B (01) : 154 - 154