An Efficient Nonlinear Regression Approach for Genome-wide Detection of Marginal and Interacting Genetic Variations

Cited by: 2
Authors
Lee, Seunghak [1]
Lozano, Aurelie [2]
Kambadur, Prabhanjan [3]
Xing, Eric P. [1]
Affiliations
[1] Carnegie Mellon Univ, Sch Comp Sci, 5000 Forbes Ave, Pittsburgh, PA 15217 USA
[2] IBM Corp, TJ Watson Res Ctr, Yorktown Hts, NY USA
[3] Bloomberg LP, New York, NY USA
Keywords
genome-wide association mapping; SNP-SNP interaction; piecewise linear model screening; stability selection; group lasso; ALZHEIMERS-DISEASE; LATE-ONSET; ASSOCIATION; LASSO; DOPAMINE
DOI
10.1089/cmb.2015.0202
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Genome-wide association studies have revealed individual genetic variants associated with phenotypic traits such as disease risk and gene expression. However, detecting pairwise interaction effects of genetic variants on traits remains a challenge because of the large number of variant combinations (approximately 10^11 SNP pairs in the human genome) and relatively small sample sizes (typically <10^4). Despite recent breakthroughs in detecting interaction effects, several problems remain open, including: (1) how to quickly process a large number of SNP pairs, (2) how to distinguish between true signals and SNPs/SNP pairs merely correlated with true signals, (3) how to detect nonlinear associations between SNP pairs and traits given small sample sizes, and (4) how to control false positives. In this article, we present a unified framework, called SPHINX, which addresses these challenges. We first propose a piecewise linear model for interaction detection, because it is simple enough for its parameters to be estimated from small samples yet complex enough to capture nonlinear interaction effects. Then, based on the piecewise linear model, we introduce randomized group lasso under stability selection, together with a screening algorithm, to address the statistical and computational challenges mentioned above. In our experiments, we first demonstrate that SPHINX achieves better power than existing methods for interaction detection under false positive control. We further applied SPHINX to a late-onset Alzheimer's disease dataset, and report 16 SNPs and 17 SNP pairs associated with gene traits. We also present a highly scalable implementation of our screening algorithm, which can screen approximately 118 billion candidate associations on a 60-node cluster in <5.5 hours.
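To give a sense of the scale quoted in the abstract, the following is a minimal arithmetic sketch, not part of the SPHINX implementation. The SNP count of 500,000 is an illustrative assumption (the abstract does not state the number of SNPs), and the function names `n_pairs` and `screening_rate` are hypothetical. Only the 118 billion candidates, 60 nodes, and 5.5 hours come from the abstract.

```python
from math import comb


def n_pairs(n_snps: int) -> int:
    """Number of unordered SNP pairs among n_snps SNPs."""
    return comb(n_snps, 2)


def screening_rate(n_candidates: float, hours: float, nodes: int):
    """Return (candidates per second overall, candidates per second per node)."""
    per_second = n_candidates / (hours * 3600)
    return per_second, per_second / nodes


# Assumption: 500,000 SNPs, chosen only to illustrate the order of magnitude.
n_snps = 500_000
pairs = n_pairs(n_snps)

# Figures from the abstract: ~118 billion candidates, 60 nodes, under 5.5 hours.
# Because the run took less than 5.5 hours, these rates are lower bounds.
total_rate, per_node_rate = screening_rate(118e9, 5.5, 60)

print(f"SNP pairs for {n_snps:,} SNPs: {pairs:,}")
print(f"overall rate:  at least {total_rate:,.0f} candidates/s")
print(f"per-node rate: at least {per_node_rate:,.0f} candidates/s")
```

Under the assumed SNP count, the pair count is on the order of 10^11, consistent with the abstract's estimate, and the reported run corresponds to roughly six million candidates screened per second across the cluster.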
Pages: 372-389
Page count: 18