SAM: String-based sequence search algorithm for mitochondrial DNA database queries

Cited by: 20
Authors
Roeck, Alexander [2]
Irwin, Jodi [3]
Duer, Arne [2]
Parsons, Thomas [4]
Parson, Walther [1]
Affiliations
[1] Innsbruck Med Univ, Inst Legal Med, A-6020 Innsbruck, Austria
[2] Univ Innsbruck, Inst Math, A-6020 Innsbruck, Austria
[3] Armed Forces DNA Identificat Lab, Rockville, MD 20850 USA
[4] Int Commiss Missing Persons, Sarajevo 71000, Bosnia & Herceg
Funding
Austrian Science Fund;
Keywords
mtDNA databases; Phylogenetic; Alignment; Sequences; EMPOP; CONTROL REGION SEQUENCES; CONSISTENT TREATMENT; LENGTH VARIANTS;
DOI
10.1016/j.fsigen.2010.10.006
Chinese Library Classification number
Q3 [Genetics];
Subject classification codes
071007; 090102;
Abstract
The analysis of the haploid mitochondrial (mt) genome has numerous applications in forensic and population genetics, as well as in disease studies. Although mtDNA haplotypes are usually determined by sequencing, they are rarely reported as a nucleotide string. Traditionally they are presented in a difference-coded position-based format relative to the corrected version of the first sequenced mtDNA. This convention requires recommendations for standardized sequence alignment that is known to vary between scientific disciplines, even between laboratories. As a consequence, database searches that are vital for the interpretation of mtDNA data can suffer from biased results when query and database haplotypes are annotated differently. In the forensic context that would usually lead to underestimation of the absolute and relative frequencies. To address this issue we introduce SAM, a string-based search algorithm that converts query and database sequences to position-free nucleotide strings and thus eliminates the possibility that identical sequences will be missed in a database query. The mere application of a BLAST algorithm would not be a sufficient remedy as it uses a heuristic approach and does not address properties specific to mtDNA, such as phylogenetically stable but also rapidly evolving insertion and deletion events. The software presented here provides additional flexibility to incorporate phylogenetic data, site-specific mutation rates, and other biologically relevant information that would refine the interpretation of mitochondrial DNA data. The manuscript is accompanied by freeware and example data sets that can be used to evaluate the new software (http://stringvalidation.org). (C) 2010 Elsevier Ireland Ltd. All rights reserved.
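The core idea in the abstract, converting difference-coded haplotypes to position-free nucleotide strings before comparing them, can be illustrated with a minimal sketch. This is not the SAM software itself: the toy reference `TOY_REFERENCE`, the simplified variant notation (`3A`, `4del`, `8.1C`), and the helper names `to_string` and `count_matches` are all assumptions made for illustration. Real haplotypes are coded relative to the corrected reference sequence mentioned above, and the published algorithm handles considerably more than exact matching.

```python
import re

# Hypothetical 10-base toy reference (positions 1..10); NOT a real mtDNA reference.
TOY_REFERENCE = "ACGTCCCCTA"

# Simplified, assumed notation: "3A" substitution, "4del" deletion,
# "8.1C" insertion of C directly after position 8.
_VARIANT = re.compile(r"^(\d+)(?:\.(\d+))?([ACGT]|del)$")


def to_string(variants, reference=TOY_REFERENCE):
    """Apply difference-coded variants to the reference and return the
    resulting position-free nucleotide string."""
    bases = {i + 1: b for i, b in enumerate(reference)}  # position -> base
    insertions = {}  # position -> {insertion index: base}
    for variant in variants:
        match = _VARIANT.match(variant)
        if match is None:
            raise ValueError(f"unrecognised variant: {variant!r}")
        pos = int(match.group(1))
        ins_index, allele = match.group(2), match.group(3)
        if pos not in bases:
            raise ValueError(f"position out of range: {variant!r}")
        if ins_index is not None:
            if allele == "del":
                raise ValueError(f"an insertion needs a base: {variant!r}")
            insertions.setdefault(pos, {})[int(ins_index)] = allele
        elif allele == "del":
            bases[pos] = ""  # deleted base contributes nothing to the string
        else:
            bases[pos] = allele
    parts = []
    for pos in range(1, len(reference) + 1):
        parts.append(bases[pos])
        inserted = insertions.get(pos, {})
        parts.extend(inserted[k] for k in sorted(inserted))
    return "".join(parts)


def count_matches(query, database):
    """Count database haplotypes whose nucleotide string equals the query's,
    regardless of how each one was annotated."""
    target = to_string(query)
    return sum(1 for entry in database if to_string(entry) == target)


# Three different annotations of the same sequence (one extra C in the C-run),
# plus one genuinely different haplotype.
database = [["8.1C"], ["5.1C"], ["9C", "9.1T"], ["3A"]]
print(count_matches(["8.1C"], database))
```

Comparing the annotation lists directly would find only the first database entry for the query `["8.1C"]`, whereas comparing the converted strings finds all three equivalent entries. That difference is the underestimation of frequencies the abstract describes.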
Pages: 126-132
Number of pages: 7