Automated sequence preprocessing in a large-scale sequencing environment

Cited by: 29
Authors
Wendl, MC [1]
Dear, S
Hodgson, D
Hillier, L
Affiliations
[1] Washington Univ, Genome Sequencing Ctr, St Louis, MO 63108 USA
[2] Sanger Ctr, Cambridge CB10 1SA, England
Source
GENOME RESEARCH | 1998 / Vol. 8 / No. 09
Funding
Wellcome Trust (UK);
DOI
10.1101/gr.8.9.975
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010; 081704;
Abstract
A software system for transforming fragments from four-color fluorescence-based gel electrophoresis experiments into assembled sequence is described. It has been developed for large-scale processing of all trace data, including shotgun and finishing reads, regardless of clone origin. Design considerations are discussed in detail, as are programming implementation and graphic tools. The importance of input validation, record tracking, and use of base quality values is emphasized. Several quality analysis metrics are proposed and applied to sample results from recently sequenced clones. Such quantities prove to be a valuable aid in evaluating modifications of sequencing protocol. The system is in full production use at both the Genome Sequencing Center and the Sanger Centre, for which combined production is approximately 100,000 sequencing reads per week.
Pages: 975-984
Page count: 10