A highly sensitive and specific workflow for detecting rare copy-number variants from exome sequencing data

Cited by: 38
Authors
Rajagopalan, Ramakrishnan [1,2]
Murrell, Jill R. [1,3]
Luo, Minjie [1,3]
Conlin, Laura K. [1,3]
Affiliations
[1] Childrens Hosp Philadelphia, Dept Pathol & Lab Med, Div Genom Diagnost, Philadelphia, PA 19104 USA
[2] Drexel Univ, Sch Biomed Engn Sci & Hlth Syst, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Pathol & Lab Med, Perelman Sch Med, Philadelphia, PA 19104 USA
Funding
US National Institutes of Health;
Keywords
Clinical exome sequencing; Copy-number variation; DETECTION TOOLS; DISCOVERY; RESOURCE; GENES; MODEL; SNP;
DOI
10.1186/s13073-020-0712-0
Chinese Library Classification number
Q3 [Genetics];
Subject classification number
071007 ; 090102 ;
Abstract
Background: Exome sequencing (ES) is a first-tier diagnostic test for many suspected Mendelian disorders. While it is routine to detect small sequence variants, it is not a standard practice in clinical settings to detect germline copy-number variants (CNVs) from ES data due to several reasons relating to performance. In this work, we comprehensively characterized one of the most sensitive ES-based CNV tools, ExomeDepth, against SNP array, a standard of care test in clinical settings to detect genome-wide CNVs.
Methods: We propose a modified ExomeDepth workflow by excluding exons with low mappability prior to variant calling to drastically reduce the false positives originating from the repetitive regions of the genome, and an iterative variant calling framework to assess the reproducibility. We used a cohort of 307 individuals with clinical ES data and clinical SNP array to estimate the sensitivity and false discovery rate of the CNV detection using exome sequencing. Further, we performed targeted testing of the STRC gene in 1972 individuals. To reduce the number of variants for downstream analysis, we performed a large-scale iterative variant calling process with random control cohorts to assess the reproducibility of the CNVs.
Results: The modified workflow presented in this paper reduced the number of total variants identified by one third while retaining a higher sensitivity of 97% and resulted in an improved false discovery rate of 11.4% compared to the default ExomeDepth pipeline. The exclusion of exons with low mappability removes 4.5% of the exons, including a subset of exons (0.6%) in disease-associated genes which are intractable by short-read next-generation sequencing (NGS). Results from the reproducibility analysis showed that the clinically reported variants were reproducible 100% of the time and that the modified workflow can be used to rank variants from high to low confidence. Targeted testing of 30 CNVs identified in STRC, a challenging gene to ascertain by NGS, showed a 100% validation rate.
Conclusions: In summary, we introduced a modification to the default ExomeDepth workflow to reduce the false positives originating from the repetitive regions of the genome, created a large-scale iterative variant calling framework for reproducibility, and provided recommendations for implementation in clinical settings.
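The three quantities the abstract relies on (exon filtering by mappability, sensitivity and false discovery rate against a reference test, and reproducibility across repeated calling runs) can be illustrated with a minimal Python sketch. This is not the authors' code and does not use ExomeDepth itself: the `Exon` record, the `min_mappability` cutoff of 0.7, and all function names are hypothetical and chosen only for illustration, since the abstract does not state the threshold or data model used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    name: str
    mappability: float  # assumed mean mappability score in [0, 1] (hypothetical field)


def filter_exons(exons, min_mappability=0.7):
    """Drop exons below an assumed mappability cutoff before CNV calling.

    The 0.7 default is illustrative only; the abstract gives no cutoff.
    """
    return [e for e in exons if e.mappability >= min_mappability]


def sensitivity(true_positives, false_negatives):
    """Fraction of reference (e.g. SNP array) CNVs that were also called."""
    return true_positives / (true_positives + false_negatives)


def false_discovery_rate(true_positives, false_positives):
    """Fraction of called CNVs that the reference test did not confirm."""
    return false_positives / (true_positives + false_positives)


def reproducibility(calls_per_iteration, variant):
    """Fraction of calling iterations (each with a different random
    control cohort) in which the given variant was called."""
    hits = sum(1 for calls in calls_per_iteration if variant in calls)
    return hits / len(calls_per_iteration)
```

Under these assumptions, a variant called in every iteration scores 1.0 and can be ranked as high confidence, while variants with lower scores fall further down the list for downstream review.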
Pages: 11
Related papers
50 in total
  • [21] Allele-specific copy-number discovery from whole-genome and whole-exome sequencing
    Wang, WeiBo
    Wang, Wei
    Sun, Wei
    Crowley, James J.
    Szatkiewicz, Jin P.
    NUCLEIC ACIDS RESEARCH, 2015, 43 (14)
  • [22] ECOLE: Learning to call copy number variants on whole exome sequencing data
    Mandiracioglu, Berk
    Ozden, Furkan
    Kaynar, Gun
    Yilmaz, Mehmet Alper
    Alkan, Can
    Cicek, A. Ercument
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [23] ECOLE: Learning to call copy number variants on whole exome sequencing data
    Berk Mandiracioglu
    Furkan Ozden
    Gun Kaynar
    Mehmet Alper Yilmaz
    Can Alkan
    A. Ercument Cicek
    Nature Communications, 15
  • [24] Identification of copy number variants from exome sequence data
    Samarakoon, Pubudu Saneth
    Sorte, Hanne Sormo
    Kristiansen, Bjorn Evert
    Skodje, Tove
    Sheng, Ying
    Tjonnfjord, Geir E.
    Stadheim, Barbro
    Stray-Pedersen, Asbjorg
    Rodningen, Olaug Kristin
    Lyle, Robert
    BMC GENOMICS, 2014, 15
  • [25] Identification of copy number variants from exome sequence data
    Pubudu Saneth Samarakoon
    Hanne Sørmo Sorte
    Bjørn Evert Kristiansen
    Tove Skodje
    Ying Sheng
    Geir E Tjønnfjord
    Barbro Stadheim
    Asbjørg Stray-Pedersen
    Olaug Kristin Rødningen
    Robert Lyle
    BMC Genomics, 15
  • [26] Estimation of Copy Number Alterations from Exome Sequencing Data
    Valdes-Mas, Rafael
    Bea, Silvia
    Puente, Diana A.
    Lopez-Otin, Carlos
    Puente, Xose S.
    PLOS ONE, 2012, 7 (12)
  • [27] Rare copy-number variants as modulators of common disease susceptibility
    Chiara Auwerx
    Maarja Jõeloo
    Marie C. Sadler
    Nicolò Tesio
    Sven Ojavee
    Charlie J. Clark
    Reedik Mägi
    Alexandre Reymond
    Zoltán Kutalik
    Genome Medicine, 16
  • [28] Rare copy-number variants as modulators of common disease susceptibility
    Auwerx, Chiara
    Joeloo, Maarja
    Sadler, Marie C.
    Tesio, Nicolo
    Ojavee, Sven
    Clark, Charlie J.
    Magi, Reedik
    Reymond, Alexandre
    Kutalik, Zoltan
    GENOME MEDICINE, 2024, 16 (01)
  • [29] Phenotypic Heterogeneity of Genomic Disorders and Rare Copy-Number Variants
    Girirajan, Santhosh
    Rosenfeld, Jill A.
    Coe, Bradley P.
    Parikh, Sumit
    Friedman, Neil
    Goldstein, Amy
    Filipink, Robyn A.
    McConnell, Juliann S.
    Angle, Brad
    Meschino, Wendy S.
    Nezarati, Marjan M.
    Asamoah, Alexander
    Jackson, Kelly E.
    Gowans, Gordon C.
    Martin, Judith A.
    Carmany, Erin P.
    Stockton, David W.
    Schnur, Rhonda E.
    Penney, Lynette S.
    Martin, Donna M.
    Raskin, Salmo
    Leppig, Kathleen
    Thiese, Heidi
    Smith, Rosemarie
    Aberg, Erika
    Niyazov, Dmitriy M.
    Escobar, Luis F.
    El-Khechen, Dima
    Johnson, Kisha D.
    Lebel, Robert R.
    Siefkas, Kiana
    Ball, Susie
    Shur, Natasha
    McGuire, Marianne
    Brasington, Campbell K.
    Spence, J. Edward
    Martin, Laura S.
    Clericuzio, Carol
    Ballif, Blake C.
    Shaffer, Lisa G.
    Eichler, Evan E.
    NEW ENGLAND JOURNAL OF MEDICINE, 2012, 367 (14): 1321 - 1331
  • [30] Rare germline copy number variants in colorectal cancer predisposition characterized by exome sequencing analysis
    Sebastià Franch-Expósito
    Clara Esteban-Jurado
    Pilar Garre
    Isabel Quintanilla
    Saray Duran-Sanchon
    Marcos Díaz-Gay
    Laia Bonjoch
    Miriam Cuatrecasas
    Esther Samper
    Jenifer Muoz
    Teresa Ocaa
    Sabela Carballal
    María López-Cerón
    Antoni Castells
    Maria Vila-Casadesús
    Sophia Derdak
    Steven Laurie
    Sergi Beltran
    Jaime Carvajal
    Luis Bujanda
    Clara Ruiz-Ponte
    Jordi Camps
    Meritxell Gironella
    Juan José Lozano
    Francesc Balaguer
    Joaquín Cubiella
    Trinidad Caldés
    Sergi Castellví-Bel
    Journal of Genetics and Genomics, 2018, 45 (01): 41 - 45