What Do We Gain When Tolerating Loss? The Information Bottleneck Wrings Out Recombination

Authors
Narechania, Apurva [1, 2]
Bobo, Dean [1, 3]
DeSalle, Rob [1]
Mathema, Barun [4]
Kreiswirth, Barry [5]
Planet, Paul J. [1, 6, 7]
Affiliations
[1] Amer Museum Nat Hist, Inst Comparat Genom, New York, NY 10024 USA
[2] Univ Copenhagen, Globe Inst, Sect Hologen, Copenhagen, Denmark
[3] Columbia Univ, Dept Ecol Evolut & Environm Biol, New York, NY USA
[4] Columbia Univ, Mailman Sch Publ Hlth, Dept Epidemiol, New York, NY USA
[5] Hackensack Meridian Hlth, Ctr Discovery & Innovat, Nutley, NJ USA
[6] Childrens Hosp Philadelphia, Div Infect Dis, Philadelphia, PA 19104 USA
[7] Univ Penn, Perelman Sch Med, Dept Pediat, Philadelphia, PA 19104 USA
Keywords
microbial evolution; recombination; information theory; STAPHYLOCOCCUS-AUREUS; GENOME; ALIGNMENT; DIVERGENCE; SEQUENCE; USA300; TOOL;
DOI
10.1093/molbev/msaf029
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Most microbes have the capacity to acquire genetic material from their environment. Recombination of foreign DNA yields genomes that are, at least in part, incongruent with the vertical history of their species. Dominant approaches for detecting these transfers are phylogenetic, requiring a painstaking series of analyses including alignment and tree reconstruction. But these methods do not scale. Here, we propose an unsupervised, alignment-free, and tree-free technique based on the sequential information bottleneck, an optimization procedure designed to extract some portion of relevant information from one random variable conditioned on another. In our case, this joint probability distribution tabulates occurrence counts of k-mers against their genomes of origin with the expectation that recombination will create a strong signal that unifies certain sets of co-occurring k-mers. We conceptualize the technique as a rate-distortion problem, measuring distortion in the relevance information as k-mers are compressed into clusters based on their co-occurrence in the source genomes. The result is fast, model-free, lossy compression of k-mers into learned groups of shared genome sequence, differentiating recombined elements from the vertically inherited core. We show that the technique yields a new recombination measure based purely on information, divorced from any biases and limitations inherent to alignment and phylogeny.
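The quantities named in the abstract can be illustrated with a small sketch. It is not the authors' implementation and does not run the sequential information bottleneck itself. It only builds a joint distribution of k-mer counts against genomes and measures how much relevance information, I(cluster; genome), survives when k-mers are merged into clusters. The toy sequences, the function names, the choice k = 3, and the grouping rule (k-mers that occur in the same set of genomes share a cluster) are all assumptions made for illustration.

```python
from collections import Counter
from math import log2

def kmer_counts(seq, k):
    """Count every overlapping k-mer in one sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def joint_distribution(genomes, k):
    """Return p(kmer, genome) as a dict keyed by (kmer, genome_name)."""
    counts = {}
    for name, seq in genomes.items():
        for kmer, n in kmer_counts(seq, k).items():
            counts[(kmer, name)] = n
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}

def mutual_information(joint):
    """I(X;Y) in bits for a joint distribution keyed by (x, y)."""
    px, py = Counter(), Counter()
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def compress(joint, assignment):
    """Merge k-mers into clusters: p(t, y) is the sum of p(x, y) over x in t."""
    merged = Counter()
    for (x, y), p in joint.items():
        merged[(assignment[x], y)] += p
    return dict(merged)

# Hypothetical toy genomes, chosen only to make the example small.
genomes = {
    "g1": "ACGTACGTAA",
    "g2": "ACGTACGTTT",
    "g3": "ACGTGGGGTT",
}
joint = joint_distribution(genomes, k=3)

# Assumed stand-in for a learned clustering: group k-mers by the set of
# genomes in which they occur.
presence = {}
for kmer, name in joint:
    presence.setdefault(kmer, set()).add(name)
by_pattern = {kmer: frozenset(names) for kmer, names in presence.items()}

# Extreme compression: every k-mer in a single cluster.
one_cluster = {kmer: "all" for kmer in presence}

full_info = mutual_information(joint)
pattern_info = mutual_information(compress(joint, by_pattern))
single_info = mutual_information(compress(joint, one_cluster))

print(f"I(kmer; genome)            = {full_info:.4f} bits")
print(f"I(pattern cluster; genome) = {pattern_info:.4f} bits")
print(f"I(single cluster; genome)  = {single_info:.4f} bits")
```

Merging k-mers can only keep or lose relevance information, so the clustered value never exceeds the uncompressed one, and a single cluster retains none. The gap between the first two values is the distortion paid for the compression in this toy setting.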
Pages: 13