Detecting inherited and novel structural variants in low-coverage parent-child sequencing data

Cited by: 3
Authors
Spence, Melissa [1]
Banuelos, Mario [2]
Marcia, Roummel F. [1]
Sindi, Suzanne [1]
Affiliations
[1] Univ Calif Merced, Dept Appl Math, Merced, CA 95343 USA
[2] Calif State Univ Fresno, Dept Math, Fresno, CA 93740 USA
Keywords
Sparse signal recovery; Convex optimization; Next-generation sequencing data; Structural variants; Computational genomics; HUMAN GENOME; PAIRED-END; CANCER; IMPACT; BREAST
DOI
10.1016/j.ymeth.2019.06.025
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Structural variants (SVs) are a class of genomic variation shared by members of the same species. Though relatively rare, they represent an increasingly important class of variation, as SVs have been associated with diseases and susceptibility to some types of cancer. Common approaches to SV detection require the sequencing and mapping of fragments from a test genome to a high-quality reference genome. Candidate SVs correspond to fragments with discordant mapped configurations. However, because errors in sequencing and mapping also create discordant arrangements, many of these predictions will be spurious. When sequencing coverage is low, distinguishing true SVs from errors is even more challenging. In recent work, we have developed SV detection methods that exploit genome information from closely related individuals: parents and children. Our previous approaches were based on the assumption that any SV present in a child's genome must have come from one of their parents. However, this strict restriction may have prevented the prediction of rare but novel variants present only in the child. In this work, we generalize our previous approaches to allow the child to carry novel variants. We consider a constrained optimization approach in which variants in the child are of two types: either inherited, and therefore necessarily present in a parent, or novel. For simplicity, we consider only a single parent and a single child, each of which has a haploid genome. However, even in this restricted case, our approach has the power to improve variant prediction. We present results on simulated candidate variant regions, parent-child trios from the 1000 Genomes Project, and a subset of the 17 Platinum Genomes.
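The inherited-versus-novel restriction described in the abstract can be illustrated with a small sketch. This is not the authors' formulation: the function names, the use of relaxed indicator values in [0, 1], and the exact form of the inequalities are assumptions made here for illustration, based only on the abstract's statement that an inherited variant must be present in the parent and that each child variant is either inherited or novel.

```python
import numpy as np

def is_feasible(z_inherited, z_novel, z_parent, tol=1e-9):
    """Check hypothetical inheritance constraints on variant indicators.

    Each argument holds one value in [0, 1] per candidate SV location
    for a single haploid parent and a single haploid child (assumed setup).
    """
    z_i = np.asarray(z_inherited, dtype=float)
    z_n = np.asarray(z_novel, dtype=float)
    z_p = np.asarray(z_parent, dtype=float)
    return bool(
        np.all(z_i >= -tol)
        and np.all(z_n >= -tol)
        and np.all(z_p <= 1 + tol)
        and np.all(z_i <= z_p + tol)      # inherited implies present in the parent
        and np.all(z_i + z_n <= 1 + tol)  # a child variant is inherited or novel, not both
    )

def child_signal(z_inherited, z_novel):
    """Total child indicator: the sum of the inherited and novel parts."""
    return np.asarray(z_inherited, dtype=float) + np.asarray(z_novel, dtype=float)

# Three candidate locations: the child inherits the first variant,
# carries a novel variant at the second, and lacks the parent's third.
parent = [1, 0, 1]
inherited = [1, 0, 0]
novel = [0, 1, 0]
print(is_feasible(inherited, novel, parent))
print(child_signal(inherited, novel))
```

Under these assumptions, dropping the novel component (forcing it to zero everywhere) recovers the stricter earlier assumption that every child variant comes from a parent.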
Pages: 61-68
Number of pages: 8