EXONSAMPLER: a computer program for genome-wide and candidate gene exon sampling for targeted next-generation sequencing

Cited by: 2
Authors
Cosart, Ted [1]
Beja-Pereira, Albano [2]
Luikart, Gordon [3]
Affiliations
[1] Univ Montana, Div Biol Sci, Missoula, MT 59812 USA
[2] Univ Porto, Ctr Invest Biodiversidade & Recursos Genet CIBIO, P-4485661 Vairao, Portugal
[3] Univ Montana, Div Biol Sci, Flathead Lake Biol Stn, Polson, MT 59860 USA
Funding
National Science Foundation (USA);
Keywords
bioinformatics; exon capture; exon sequences; next-generation sequencing; CAPTURE; BLAST;
DOI
10.1111/1755-0998.12267
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010; 081704;
Abstract
The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next-generation gene sequencing method called exon capture. Exon sequences can be sampled from a list of gene name abbreviations (e.g. IFNG, TLR1), or from genes spaced evenly across chromosomes. The program outputs a list of genomic coordinates (a BED file) and a set of sequences in FASTA format. User-adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, a maximum number of exonic base pairs (bp) to sample per gene, and a maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5′) exons, downstream (3′) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes and ~16,000 exons evenly spaced genome-wide. We prioritized the collection of 5′ exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection.
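The sampling rules listed in the abstract (length limits, per-gene and total bp budgets, partial sampling of large exons, 5′ preference, BED output) can be illustrated with a minimal Python sketch. This is not the EXONSAMPLER source code or its interface: the `Exon` class, the function names, the trimming rule (keep the 5′-most bases of an over-long exon) and the toy coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Exon:
    """Hypothetical exon record using BED-style 0-based, half-open coordinates."""
    chrom: str
    start: int
    end: int
    gene: str
    strand: str  # "+" or "-"

    @property
    def length(self) -> int:
        return self.end - self.start


def order_five_prime_first(exons: List[Exon]) -> List[Exon]:
    """Order one gene's exons so the most upstream (5') exon comes first."""
    if exons and exons[0].strand == "-":
        return sorted(exons, key=lambda e: e.end, reverse=True)
    return sorted(exons, key=lambda e: e.start)


def trim_exon(exon: Exon, max_len: int) -> Exon:
    """Assumed partial-sampling rule: keep the 5'-most max_len bp of a long exon."""
    if exon.length <= max_len:
        return exon
    if exon.strand == "-":
        return Exon(exon.chrom, exon.end - max_len, exon.end, exon.gene, exon.strand)
    return Exon(exon.chrom, exon.start, exon.start + max_len, exon.gene, exon.strand)


def sample_exons(genes: Dict[str, List[Exon]], min_len: int, max_len: int,
                 max_bp_per_gene: int, max_total_bp: int) -> List[Exon]:
    """Collect 5'-first exons while respecting per-gene and total bp budgets."""
    sampled: List[Exon] = []
    total_bp = 0
    for exons in genes.values():
        gene_bp = 0
        for exon in order_five_prime_first(exons):
            if exon.length < min_len:
                continue  # too short to be useful
            exon = trim_exon(exon, max_len)
            if gene_bp + exon.length > max_bp_per_gene:
                break  # per-gene budget exhausted, move to the next gene
            if total_bp + exon.length > max_total_bp:
                return sampled  # whole-collection budget exhausted
            sampled.append(exon)
            gene_bp += exon.length
            total_bp += exon.length
    return sampled


def to_bed(exons: List[Exon]) -> str:
    """Format sampled exons as six-column BED lines."""
    return "\n".join(
        f"{e.chrom}\t{e.start}\t{e.end}\t{e.gene}\t0\t{e.strand}" for e in exons
    )


# Toy input: gene names come from the abstract, coordinates are invented.
genes = {
    "IFNG": [Exon("chr5", 100, 250, "IFNG", "+"),
             Exon("chr5", 400, 420, "IFNG", "+"),
             Exon("chr5", 600, 1600, "IFNG", "+")],
    "TLR1": [Exon("chr6", 5000, 5200, "TLR1", "-"),
             Exon("chr6", 3000, 3300, "TLR1", "-")],
}
sampled = sample_exons(genes, min_len=50, max_len=500,
                       max_bp_per_gene=700, max_total_bp=1000)
print(to_bed(sampled))
```

In this sketch the total-bp budget stops the collection as soon as the next exon would exceed it; a real tool might instead skip that exon and keep looking for a smaller one, and would also extract the matching sequences to write the FASTA output.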
Pages: 1296-1301
Page count: 6