EXONSAMPLER: a computer program for genome-wide and candidate gene exon sampling for targeted next-generation sequencing

Cited by: 2
Authors
Cosart, Ted [1]
Beja-Pereira, Albano [2]
Luikart, Gordon [3]
Affiliations
[1] Univ Montana, Div Biol Sci, Missoula, MT 59812 USA
[2] Univ Porto, Ctr Invest Biodiversidade & Recursos Genet CIBIO, P-4485661 Vairao, Portugal
[3] Univ Montana, Div Biol Sci, Flathead Lake Biol Stn, Polson, MT 59860 USA
Funding
US National Science Foundation
Keywords
bioinformatics; exon capture; exon sequences; next-generation sequencing; CAPTURE; BLAST
DOI
10.1111/1755-0998.12267
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for exon capture, an efficient next-generation gene sequencing method. Exon sequences can be sampled from a list of gene name abbreviations (e.g. IFNG, TLR1) or from genes spaced evenly across chromosomes. The program outputs a list of genomic coordinates (a BED file) and a set of sequences in FASTA format. User-adjustable parameters for collecting exon sequences include the minimum and maximum acceptable exon length, the maximum number of exonic base pairs (bp) to sample per gene, and the maximum total bp for the entire collection. Very large exons can be sampled partially. The program can preferentially sample upstream (5′) exons, downstream (3′) exons, both external exons, or all internal exons. It is written in the Python programming language using freely available libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes and ~16,000 exons evenly spaced genome-wide. We prioritized the collection of 5′ exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection.
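The sampling parameters described in the abstract can be illustrated with a minimal Python sketch. This is not the EXONSAMPLER source code or its interface: the `Exon` class, the `sample_exons` and `to_bed` functions, the parameter names, and the default values are all hypothetical, chosen only to show how length limits, a per-gene bp cap, a total bp cap, partial sampling of large exons, and 5′-first prioritization might interact. Strand is ignored for simplicity, so a partially sampled exon keeps its leftmost bases.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Exon:
    """Hypothetical exon record using BED-style coordinates."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    gene: str
    rank: int   # 1 = most upstream (5') exon of the gene

    @property
    def length(self) -> int:
        return self.end - self.start


def sample_exons(
    exons: Iterable[Exon],
    min_len: int = 100,           # assumed default, not from the paper
    max_len: int = 600,           # assumed default, not from the paper
    max_bp_per_gene: int = 1000,  # assumed default, not from the paper
    max_total_bp: int = 5000,     # assumed default, not from the paper
) -> List[Exon]:
    """Pick exons 5'-first, respecting length limits and bp budgets."""
    picked: List[Exon] = []
    bp_per_gene: Dict[str, int] = {}
    total_bp = 0
    # Visit each gene's exons from the most upstream one onward.
    for exon in sorted(exons, key=lambda e: (e.gene, e.rank)):
        if exon.length < min_len:
            continue  # exon is shorter than the minimum acceptable length
        used = bp_per_gene.get(exon.gene, 0)
        budget = min(max_len, max_bp_per_gene - used, max_total_bp - total_bp)
        take = min(exon.length, budget)
        if take < min_len:
            continue  # remaining budget cannot hold a usable fragment
        # Partial sampling: keep only the leftmost `take` bp (strand ignored).
        picked.append(
            Exon(exon.chrom, exon.start, exon.start + take, exon.gene, exon.rank)
        )
        bp_per_gene[exon.gene] = used + take
        total_bp += take
    return picked


def to_bed(exons: Iterable[Exon]) -> str:
    """Format sampled exons as tab-separated BED lines (chrom, start, end, name)."""
    return "\n".join(
        f"{e.chrom}\t{e.start}\t{e.end}\t{e.gene}_exon{e.rank}" for e in exons
    )
```

Calling `to_bed(sample_exons(exons))` on a list of `Exon` records would yield the kind of coordinate list the abstract refers to as a BED file. Extracting the matching FASTA sequences would additionally require the reference genome and is left out of this sketch.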
Pages: 1296-1301
Page count: 6