Regularized multi-trait multi-locus linear mixed models for genome-wide association studies and genomic selection in crops

Cited by: 2
Authors
Lozano, Aurelie C. [1]
Ding, Hantian [2]
Abe, Naoki [1]
Lipka, Alexander E. [3]
Affiliations
[1] IBM TJ Watson Res Ctr, IBM Res, Yorktown Hts, NY USA
[2] Univ Penn, Philadelphia, PA USA
[3] Univ Illinois, Dept Crop Sci, Champaign, IL 61820 USA
Keywords
Multi-trait multi-locus linear mixed model; GWAS and genomic selection in plants; Regularization; GENETIC-HETEROGENEITY; VARIABLE SELECTION; POPULATION; PREDICTION; REGRESSION;
DOI
10.1186/s12859-023-05519-2
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: We consider two key problems in genomics involving multiple traits: multi-trait genome-wide association studies (GWAS), where the goal is to detect genetic variants associated with the traits; and multi-trait genomic selection (GS), where the emphasis is on accurately predicting trait values. Multi-trait linear mixed models build on the linear mixed model to jointly model multiple traits. Existing estimation methods, however, are limited to the joint analysis of a small number of genotypes; in fact, most approaches consider one SNP at a time. Estimating multi-dimensional genetic and environment effects also results in considerable computational burden. Efficient approaches that incorporate regularization into multi-trait linear models (no random effects) have been recently proposed to identify genomic loci associated with multiple traits (Yu et al. in Multitask learning using task clustering with applications to predictive modeling and GWAS of plant varieties. arXiv:1710.01788, 2017; Yu et al. in Front Big Data 2:27, 2019), but these ignore population structure and familial relatedness (Yu et al. in Nat Genet 38:203-208, 2006).

Results: This work addresses this gap by proposing a novel class of regularized multi-trait linear mixed models along with scalable approaches for estimation in the presence of high-dimensional genotypes and a large number of traits. We evaluate the effectiveness of the proposed methods using datasets in maize and sorghum diversity panels, and demonstrate benefits both in achieving high prediction accuracy in GS and in identifying relevant marker-trait associations.

Conclusions: The proposed regularized multivariate linear mixed models are relevant for both GWAS and GS. We hope that they will facilitate agronomy-related research in plant biology and crop breeding endeavors.
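To make the idea in the abstract concrete, the following is a minimal sketch of one way to combine a kinship correction with a regularized multi-trait regression. It is not the authors' estimator. It assumes a single residual-to-genetic variance ratio `delta` shared by all traits and treated as known, ignores cross-trait covariance, and uses a row-sparse (group lasso) penalty solved by proximal gradient descent. The function names `whiten` and `row_sparse_fit`, the simulated data, and all sizes are hypothetical.

```python
import numpy as np


def whiten(K, delta):
    """Return W such that W @ (K + delta * I) @ W.T is the identity."""
    evals, evecs = np.linalg.eigh(K)
    # Scale each eigenvector (column) by 1/sqrt(eigenvalue + delta).
    return (evecs / np.sqrt(evals + delta)).T


def row_sparse_fit(X, Y, lam, n_iter=500):
    """Proximal gradient for 0.5*||Y - X B||_F^2 + lam * sum_j ||B[j, :]||_2."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    B = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        Z = B - step * (X.T @ (X @ B - Y))  # gradient step on the squared loss
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        # Row-wise soft-thresholding: a marker is kept or dropped for all traits.
        B = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12)) * Z
    return B


# Simulated toy data (hypothetical sizes, not taken from the paper).
rng = np.random.default_rng(0)
n, p, t = 60, 40, 3                       # individuals, markers, traits
G = rng.integers(0, 3, size=(n, p)).astype(float)
G -= G.mean(axis=0)                       # center genotype codes
K = G @ G.T / p                           # simple marker-based kinship matrix
B_true = np.zeros((p, t))
B_true[:2] = 1.0                          # two markers affect every trait
u = G @ rng.normal(size=(p, t)) / np.sqrt(p)   # polygenic effect with covariance K
Y = G @ B_true + u + rng.normal(size=(n, t))

delta = 1.0                               # assumed known variance ratio
W = whiten(K, delta)
Xw, Yw = W @ G, W @ Y                     # decorrelate individuals before fitting
lam_max = np.linalg.norm(Xw.T @ Yw, axis=1).max()
B_hat = row_sparse_fit(Xw, Yw, lam=0.5 * lam_max)
selected = np.flatnonzero(np.linalg.norm(B_hat, axis=1) > 0)
print("markers with non-zero effects:", selected)
```

Whitening by the kinship-based covariance is what lets relatedness be accounted for before the penalized fit; the row-wise penalty is one possible choice for selecting markers jointly across traits, and other penalties could be substituted.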
Pages: 15
Related papers
50 in total
  • [1] Regularized multi-trait multi-locus linear mixed models for genome-wide association studies and genomic selection in crops
    Aurélie C. Lozano
    Hantian Ding
    Naoki Abe
    Alexander E. Lipka
    BMC Bioinformatics, 24
  • [2] Methodological implementation of mixed linear models in multi-locus genome-wide association studies
    Wen, Yang-Jun
    Zhang, Hanwen
    Ni, Yuan-Li
    Huang, Bo
    Zhang, Jin
    Feng, Jian-Ying
    Wang, Shi-Bo
    Dunwell, Jim M.
    Zhang, Yuan-Ming
    Wu, Rongling
    BRIEFINGS IN BIOINFORMATICS, 2018, 19 (04) : 700 - 712
  • [3] Methodological implementation of mixed linear models in multi-locus genome-wide association studies (bbw145, 2016)
    Wen, Yang-Jun
    Zhang, Hanwen
    Ni, Yuan-Li
    Huang, Bo
    Zhang, Jin
    Feng, Jian-Ying
    Wang, Shi-Bo
    Dunwell, Jim M.
    Zhang, Yuan-Ming
    Wu, Rongling
    BRIEFINGS IN BIOINFORMATICS, 2017, 18 (05) : 906 - 906
  • [4] Improving power and accuracy of genome-wide association studies via a multi-locus mixed linear model methodology
    Shi-Bo Wang
    Jian-Ying Feng
    Wen-Long Ren
    Bo Huang
    Ling Zhou
    Yang-Jun Wen
    Jin Zhang
    Jim M. Dunwell
    Shizhong Xu
    Yuan-Ming Zhang
    Scientific Reports, 6
  • [5] Improving power and accuracy of genome-wide association studies via a multi-locus mixed linear model methodology
    Wang, Shi-Bo
    Feng, Jian-Ying
    Ren, Wen-Long
    Huang, Bo
    Zhou, Ling
    Wen, Yang-Jun
    Zhang, Jin
    Dunwell, Jim M.
    Xu, Shizhong
    Zhang, Yuan-Ming
    SCIENTIFIC REPORTS, 2016, 6
  • [6] Evaluation of multi-locus models for genome-wide association studies: a case study in sugar beet
    Wuerschum, T.
    Kraft, T.
    HEREDITY, 2015, 114 (03) : 281 - 290
  • [7] Evaluation of multi-locus models for genome-wide association studies: a case study in sugar beet
    T Würschum
    T Kraft
    Heredity, 2015, 114 : 281 - 290
  • [8] Utilizing trait networks and structural equation models as tools to interpret multi-trait genome-wide association studies
    Momen, Mehdi
    Campbell, Malachy T.
    Walia, Harkamal
    Morota, Gota
    PLANT METHODS, 2019, 15 (01)
  • [9] Utilizing trait networks and structural equation models as tools to interpret multi-trait genome-wide association studies
    Mehdi Momen
    Malachy T. Campbell
    Harkamal Walia
    Gota Morota
    Plant Methods, 15
  • [10] Multi-locus genome-wide association study and genomic prediction for flowering time in chrysanthemum
    Su, Jiangshuo
    Lu, Zhaowen
    Zeng, Junwei
    Zhang, Xuefeng
    Yang, Xiuwei
    Wang, Siyue
    Zhang, Fei
    Jiang, Jiafu
    Chen, Fadi
    PLANTA, 2024, 259 (01)