A highly sensitive and specific workflow for detecting rare copy-number variants from exome sequencing data

Cited by: 38
Authors
Rajagopalan, Ramakrishnan [1, 2]
Murrell, Jill R. [1, 3]
Luo, Minjie [1, 3]
Conlin, Laura K. [1, 3]
Affiliations
[1] Childrens Hosp Philadelphia, Dept Pathol & Lab Med, Div Genom Diagnost, Philadelphia, PA 19104 USA
[2] Drexel Univ, Sch Biomed Engn Sci & Hlth Syst, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Pathol & Lab Med, Perelman Sch Med, Philadelphia, PA 19104 USA
Funding
US National Institutes of Health
Keywords
Clinical exome sequencing; Copy-number variation; Detection tools; Discovery; Resource; Genes; Model; SNP
DOI
10.1186/s13073-020-0712-0
Chinese Library Classification number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Background: Exome sequencing (ES) is a first-tier diagnostic test for many suspected Mendelian disorders. While detecting small sequence variants from ES data is routine, detecting germline copy-number variants (CNVs) from ES data is not standard practice in clinical settings, for several performance-related reasons. In this work, we comprehensively characterized one of the most sensitive ES-based CNV tools, ExomeDepth, against SNP array, a standard-of-care test in clinical settings for detecting genome-wide CNVs.

Methods: We propose a modified ExomeDepth workflow that excludes exons with low mappability prior to variant calling, to drastically reduce the false positives originating from the repetitive regions of the genome, together with an iterative variant calling framework to assess reproducibility. We used a cohort of 307 individuals with both clinical ES data and clinical SNP array data to estimate the sensitivity and false discovery rate of CNV detection using exome sequencing. Further, we performed targeted testing of the STRC gene in 1972 individuals. To reduce the number of variants for downstream analysis, we performed a large-scale iterative variant calling process with random control cohorts to assess the reproducibility of the CNVs.

Results: The modified workflow presented in this paper reduced the total number of variants identified by one third while retaining a sensitivity of 97%, and improved the false discovery rate to 11.4% compared with the default ExomeDepth pipeline. The exclusion of exons with low mappability removes 4.5% of the exons, including a subset of exons (0.6%) in disease-associated genes that are intractable by short-read next-generation sequencing (NGS). Results from the reproducibility analysis showed that the clinically reported variants were reproducible 100% of the time and that the modified workflow can be used to rank variants from high to low confidence. Targeted testing of 30 CNVs identified in STRC, a challenging gene to ascertain by NGS, showed a 100% validation rate.

Conclusions: In summary, we introduced a modification to the default ExomeDepth workflow to reduce the false positives originating from the repetitive regions of the genome, created a large-scale iterative variant calling framework for reproducibility, and provided recommendations for implementation in clinical settings.
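
The three ideas named in the abstract (excluding low-mappability exons before calling, repeating calls against random control cohorts to score reproducibility, and estimating sensitivity and false discovery rate against SNP array) can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not call ExomeDepth, which is an R package. The function names, the `mappability` field, the 0.7 cutoff, the iteration and cohort sizes, and the `call_cnvs` callback are all hypothetical placeholders, and CNVs are matched by exact identifier, whereas a real comparison would likely need an overlap criterion.

```python
import random
from collections import Counter


def filter_exons(exons, min_mappability=0.7):
    """Keep exons whose mappability score meets a cutoff.

    The 0.7 default is an illustrative assumption, not a value from the paper.
    """
    return [e for e in exons if e["mappability"] >= min_mappability]


def reproducibility(call_cnvs, test_sample, control_pool,
                    n_iter=100, n_controls=10, seed=0):
    """Fraction of iterations in which each CNV is called.

    `call_cnvs(test_sample, controls)` is a hypothetical callback standing in
    for one run of a CNV caller against a randomly drawn control cohort.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_iter):
        controls = rng.sample(control_pool, n_controls)
        # Count each CNV at most once per iteration.
        counts.update(set(call_cnvs(test_sample, controls)))
    return {cnv: n / n_iter for cnv, n in counts.items()}


def sensitivity_and_fdr(calls, truth):
    """Sensitivity = TP / |truth|; FDR = FP / |calls| (exact-match simplification)."""
    calls, truth = set(calls), set(truth)
    tp = len(calls & truth)
    sensitivity = tp / len(truth) if truth else float("nan")
    fdr = (len(calls) - tp) / len(calls) if calls else float("nan")
    return sensitivity, fdr
```

Under these assumptions, sorting the dictionary returned by `reproducibility` by value in descending order gives one way to rank calls from high to low confidence, in the spirit of the ranking described in the Results.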
Pages: 11