The effects of sampling on delimiting species from multi-locus sequence data

Cited by: 40
Authors
Rittmeyer, Eric N. [1]
Austin, Christopher C. [1]
Affiliations
[1] Louisiana State Univ, Museum Nat Sci, Dept Biol Sci, Baton Rouge, LA 70803 USA
Funding
U.S. National Science Foundation
Keywords
Species delimitation; Sampling strategy; Structurama; Nonparametric delimitation; Gaussian clustering; POPULATION-STRUCTURE; MAXIMUM-LIKELIHOOD; BAYESIAN-INFERENCE; TREE ESTIMATION; GENE TREES; DELIMITATION; CONSEQUENCES; TAXONOMY; DIVERGENCE; SIMULATION;
DOI
10.1016/j.ympev.2012.06.031
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
As a fundamental unit in biology, species are used in a wide variety of studies, and their delimitation impacts every subfield of the life sciences. Thus, it is of utmost importance that species are delimited in an accurate and biologically meaningful way. However, due to morphologically similar, cryptic species, and processes such as incomplete lineage sorting, this is far from a trivial task. Here, we examine the accuracy and sensitivity to sampling strategy of three recently developed methods that aim to delimit species from multi-locus DNA sequence data without a priori assignments of samples to putative species. Specifically, we simulate data at two species tree depths and a variety of sampling strategies ranging from five alleles per species and five loci to 20 alleles per species and 100 loci to test (1) Structurama, (2) Gaussian clustering, and (3) nonparametric delimitation. We find that Structurama accurately delimits even relatively recently diverged (greater than 1.5 N generations) species when sampling 10 or more loci. We also find that Gaussian clustering delimits more deeply divergent species (greater than 2.5 N generations) relatively well, but is not sufficiently sensitive to delimit more recently diverged species. Finally, we find that nonparametric delimitation performs well with 25 or more loci if gene trees are known without error, but performs poorly with estimated gene genealogies, frequently over-splitting species and mis-assigning samples. We thus suggest that Structurama represents a powerful tool for use in species delimitation. It should be noted, however, that intraspecific population structure may be delimited using this or any of the methods tested herein. We argue that other methods, such as other species delimitation methods requiring a priori putative species assignments (e.g. SpeDeSTEM, Bayesian species delimitation), and other types of data (e.g. morphological, ecological, behavioral) be incorporated in conjunction with these methods in studies attempting to delimit species. © 2012 Elsevier Inc. All rights reserved.
Pages: 451-463
Page count: 13