Detecting inherited and novel structural variants in low-coverage parent-child sequencing data

Cited by: 3
Authors
Spence, Melissa [1]
Banuelos, Mario [2]
Marcia, Roummel F. [1]
Sindi, Suzanne [1]
Affiliations
[1] Univ Calif Merced, Dept Appl Math, Merced, CA 95343 USA
[2] Calif State Univ Fresno, Dept Math, Fresno, CA 93740 USA
Keywords
Sparse signal recovery; Convex optimization; Next-generation sequencing data; Structural variants; Computational genomics; HUMAN GENOME; PAIRED-END; CANCER; IMPACT; BREAST
DOI
10.1016/j.ymeth.2019.06.025
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Structural variants (SVs) are a class of genomic variation shared by members of the same species. Though relatively rare, they represent an increasingly important class of variation, as SVs have been associated with diseases and susceptibility to some types of cancer. Common approaches to SV detection require the sequencing and mapping of fragments from a test genome to a high-quality reference genome. Candidate SVs correspond to fragments with discordant mapped configurations. However, because errors in sequencing and mapping also create discordant arrangements, many of these predictions will be spurious. When sequencing coverage is low, distinguishing true SVs from errors is even more challenging. In recent work, we developed SV detection methods that exploit genome information from closely related individuals, namely parents and children. Our previous approaches assumed that any SV present in a child's genome must have come from one of their parents. However, this strict restriction may have caused rare but novel variants present only in the child to go unpredicted. In this work, we generalize our previous approaches to allow the child to carry novel variants. We consider a constrained optimization approach in which variants in the child are of two types: either inherited, and therefore present in a parent, or novel. For simplicity, we consider only a single parent and a single child, each with a haploid genome. However, even in this restricted case, our approach has the power to improve variant prediction. We present results on simulated candidate variant regions, parent-child trios from the 1000 Genomes Project, and a subset of the 17 Platinum Genomes.
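The inherited-versus-novel constraint described in the abstract can be illustrated with a small discrete toy. The sketch below is not the authors' formulation, which is a constrained optimization not detailed in this record. It assumes a hypothetical Poisson model for the number of discordant fragments at a single candidate location, and picks the lowest-cost feasible state for one haploid parent and one haploid child. All parameter values, state names, and the `classify` function are illustrative assumptions.

```python
import math

# Hypothetical toy parameters (assumptions, not values from the paper).
COVERAGE = 4.0       # expected supporting fragments when a variant is present
ERROR_RATE = 0.5     # expected spurious discordant fragments when it is absent
SPARSITY = 1.0       # penalty per variant call, since SVs are rare
NOVEL_PENALTY = 2.0  # extra penalty per novel call, since novel SVs are rarer

# Feasible (parent, inherited, novel) states at one candidate location.
# An inherited variant requires the parent to carry it; in this toy, a novel
# variant is one the child carries and the parent does not.
STATES = {
    "absent": (0, 0, 0),
    "parent_only": (1, 0, 0),
    "inherited": (1, 1, 0),
    "novel": (0, 0, 1),
}


def poisson_nll(count, mean):
    """Negative log-likelihood of observing `count` under Poisson(mean)."""
    return mean - count * math.log(mean) + math.lgamma(count + 1)


def classify(parent_count, child_count):
    """Return the feasible state with the lowest penalized cost."""
    best_label, best_cost = None, math.inf
    for label, (parent, inherited, novel) in STATES.items():
        child = inherited + novel  # the child carries the SV in either case
        cost = (
            poisson_nll(parent_count, COVERAGE * parent + ERROR_RATE)
            + poisson_nll(child_count, COVERAGE * child + ERROR_RATE)
            + SPARSITY * (parent + child)
            + NOVEL_PENALTY * novel
        )
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label
```

With these toy settings, `classify(5, 4)` returns `"inherited"` and `classify(0, 5)` returns `"novel"`, while weak child-only evidence such as `classify(0, 2)` is treated as error and returns `"absent"`. Restricting the search to the feasible states plays the role that the constraints play in a continuous formulation: a child variant is never called "inherited" unless the parent is also called as a carrier.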
Pages: 61-68
Number of pages: 8
Related papers
50 in total
  • [31] Next-Generation Sequencing Data Analysis on Pool-Seq and Low-Coverage Retinoblastoma Data
    Ozdemir Ozdogan, Gulistan
    Kaya, Hilal
    INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES, 2020, 12 (03) : 302 - 310
  • [32] Next-Generation Sequencing Data Analysis on Pool-Seq and Low-Coverage Retinoblastoma Data
    Gülistan Özdemir Özdoğan
    Hilal Kaya
    Interdisciplinary Sciences: Computational Life Sciences, 2020, 12 : 302 - 310
  • [33] Low-coverage sequencing cost-effectively detects known and novel variation in underrepresented populations
    Martin, Alicia R.
    Atkinson, Elizabeth G.
    Chapman, Sinead B.
    Stevenson, Anne
    Stroud, Rocky E.
    Abebe, Tamrat
    Akena, Dickens
    Alemayehu, Melkam
    Ashaba, Fred K.
    Atwoli, Lukoye
    Bowers, Tera
    Chibnik, Lori B.
    Daly, Mark J.
    DeSmet, Timothy
    Dodge, Sheila
    Fekadu, Abebaw
    Ferriera, Steven
    Gelaye, Bizu
    Gichuru, Stella
    Injera, Wilfred E.
    James, Roxanne
    Kariuki, Symon M.
    Kigen, Gabriel
    Koenen, Karestan C.
    Kwobah, Edith
    Kyebuzibwa, Joseph
    Majara, Lerato
    Musinguzi, Henry
    Mwema, Rehema M.
    Neale, Benjamin M.
    Newman, Carter P.
    Newton, Charles R. J. C.
    Pickrell, Joseph K.
    Ramesar, Raj
    Shiferaw, Welelta
    Stein, Dan J.
    Teferra, Solomon
    van der Merwe, Celia
    Zingela, Zukiswa
    AMERICAN JOURNAL OF HUMAN GENETICS, 2021, 108 (04) : 656 - 668
  • [34] AKSmooth: Enhancing low-coverage bisulfite sequencing data via kernel-based smoothing
    Chen, Junfang
    Lutsik, Pavlo
    Akulenko, Ruslan
    Walter, Joern
    Helms, Volkhard
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2014, 12 (06)
  • [35] PMAT: an efficient plant mitogenome assembly toolkit using low-coverage HiFi sequencing data
    Bi, Changwei
    Shen, Fei
    Han, Fuchuan
    Qu, Yanshu
    Hou, Jing
    Xu, Kewang
    Xu, Li-an
    He, Wenchuang
    Wu, Zhiqiang
    Yin, Tongming
    HORTICULTURE RESEARCH, 2024, 11 (03)
  • [36] Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes
    Rubinacci, Simone
    Hofmeister, Robin J.
    da Mota, Barbara Sousa
    Delaneau, Olivier
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2024, 32 : 50 - 50
  • [37] Comparison of Genotype Imputation for SNP Array and Low-Coverage Whole-Genome Sequencing Data
    Deng, Tianyu
    Zhang, Pengfei
    Garrick, Dorian
    Gao, Huijiang
    Wang, Lixian
    Zhao, Fuping
    FRONTIERS IN GENETICS, 2022, 12
  • [38] Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes
    Simone Rubinacci
    Robin J. Hofmeister
    Bárbara Sousa da Mota
    Olivier Delaneau
    Nature Genetics, 2023, 55 : 1088 - 1090
  • [39] Detecting multiple variants associated with disease based on sequencing data of case–parent trios
    Chan Wang
    Leiming Sun
    Haitao Zheng
    Yue-Qing Hu
    Journal of Human Genetics, 2016, 61 : 851 - 860
  • [40] Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes
    Rubinacci, Simone
    Hofmeister, Robin J.
    da Mota, Barbara Sousa
    Delaneau, Olivier
    NATURE GENETICS, 2023, 55 (07) : 1088 - +