A highly sensitive and specific workflow for detecting rare copy-number variants from exome sequencing data

Cited by: 38
Authors
Rajagopalan, Ramakrishnan [1, 2]
Murrell, Jill R. [1, 3]
Luo, Minjie [1, 3]
Conlin, Laura K. [1, 3]
Affiliations
[1] Childrens Hosp Philadelphia, Dept Pathol & Lab Med, Div Genom Diagnost, Philadelphia, PA 19104 USA
[2] Drexel Univ, Sch Biomed Engn Sci & Hlth Syst, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Pathol & Lab Med, Perelman Sch Med, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (USA);
Keywords
Clinical exome sequencing; Copy-number variation; DETECTION TOOLS; DISCOVERY; RESOURCE; GENES; MODEL; SNP;
DOI
10.1186/s13073-020-0712-0
Chinese Library Classification number
Q3 [Genetics];
Subject classification codes
071007 ; 090102 ;
Abstract
Background: Exome sequencing (ES) is a first-tier diagnostic test for many suspected Mendelian disorders. While it is routine to detect small sequence variants, it is not a standard practice in clinical settings to detect germline copy-number variants (CNVs) from ES data due to several reasons relating to performance. In this work, we comprehensively characterized one of the most sensitive ES-based CNV tools, ExomeDepth, against SNP array, a standard of care test in clinical settings to detect genome-wide CNVs.
Methods: We propose a modified ExomeDepth workflow by excluding exons with low mappability prior to variant calling to drastically reduce the false positives originating from the repetitive regions of the genome, and an iterative variant calling framework to assess the reproducibility. We used a cohort of 307 individuals with clinical ES data and clinical SNP array to estimate the sensitivity and false discovery rate of the CNV detection using exome sequencing. Further, we performed targeted testing of the STRC gene in 1972 individuals. To reduce the number of variants for downstream analysis, we performed a large-scale iterative variant calling process with random control cohorts to assess the reproducibility of the CNVs.
Results: The modified workflow presented in this paper reduced the number of total variants identified by one third while retaining a higher sensitivity of 97% and resulted in an improved false discovery rate of 11.4% compared to the default ExomeDepth pipeline. The exclusion of exons with low mappability removes 4.5% of the exons, including a subset of exons (0.6%) in disease-associated genes which are intractable by short-read next-generation sequencing (NGS). Results from the reproducibility analysis showed that the clinically reported variants were reproducible 100% of the time and that the modified workflow can be used to rank variants from high to low confidence. Targeted testing of 30 CNVs identified in STRC, a challenging gene to ascertain by NGS, showed a 100% validation rate.
Conclusions: In summary, we introduced a modification to the default ExomeDepth workflow to reduce the false positives originating from the repetitive regions of the genome, created a large-scale iterative variant calling framework for reproducibility, and provided recommendations for implementation in clinical settings.
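The three ideas in the abstract (dropping low-mappability exons before calling, scoring calls against an orthogonal truth set, and repeating the calling with random control cohorts) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and ExomeDepth itself is an R package that is not called here. The function names, the `mappability` field, the 0.9 threshold, and the exact-identifier matching of CNVs are all hypothetical assumptions made for illustration; the abstract does not state a threshold or a matching rule.

```python
import random
from collections import Counter


def filter_exons(exons, min_mappability=0.9):
    """Keep exons whose mappability score meets the threshold.

    The threshold and the 'mappability' key are illustrative assumptions.
    """
    return [e for e in exons if e["mappability"] >= min_mappability]


def sensitivity_and_fdr(calls, truth):
    """Compare called CNV identifiers with a truth set (e.g. SNP array calls).

    Matching by exact identifier is a simplification; real comparisons
    usually rely on genomic overlap between calls.
    """
    calls, truth = set(calls), set(truth)
    true_pos = len(calls & truth)
    sensitivity = true_pos / len(truth) if truth else float("nan")
    fdr = (len(calls) - true_pos) / len(calls) if calls else float("nan")
    return sensitivity, fdr


def reproducibility(call_fn, sample, control_pool, n_controls, n_iter, seed=0):
    """Fraction of iterations in which each CNV is called.

    call_fn(sample, controls) is a hypothetical caller returning CNV ids;
    each iteration draws a fresh random control cohort from control_pool.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_iter):
        controls = rng.sample(control_pool, n_controls)
        counts.update(set(call_fn(sample, controls)))
    return {cnv: n / n_iter for cnv, n in counts.items()}
```

Under these assumptions, sorting the dictionary returned by `reproducibility` by its values gives a ranking of calls from high to low confidence, in the spirit of the ranking described in the Results.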
Pages: 11