Models and information-theoretic bounds for nanopore sequencing

Citations: 0
Authors
Mao, Wei
Diggavi, Suhas
Kannan, Sreeram
Affiliations
Funding
US National Science Foundation;
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Nanopore sequencing is an emerging technology for sequencing DNA that can read long fragments of DNA (~50,000 bases), unlike most current sequencers, which can only read hundreds of bases. While nanopore sequencers can acquire long reads, the high error rates (approximately 30%) pose a technical challenge. In a nanopore sequencer, a DNA molecule is migrated through a nanopore and current variations are measured. The DNA sequence is inferred from this observed current pattern using an algorithm called a base-caller. In this paper, we propose a mathematical model for the "channel" from the input DNA sequence to the observed current, and calculate bounds on the information extraction capacity of the nanopore sequencer. This model incorporates impairments such as inter-symbol interference, deletions, and random response. The practical application of such information bounds is two-fold: (1) benchmarking present base-calling algorithms, and (2) offering an optimization objective for designing better nanopore sequencers.
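To make the channel abstraction concrete, the sketch below simulates a toy nanopore read channel with the three impairments named in the abstract: inter-symbol interference (each current sample depends on a window of K bases), deletions (samples dropped at random), and a random response (a randomly drawn current level per K-mer plus additive noise). This is a minimal illustrative model, not the authors' model; the parameters K, p_del, noise_std, and the kmer_level map are assumptions chosen only for demonstration.

```python
import itertools
import numpy as np

K = 3                 # assumed ISI memory: current depends on a K-base window
BASES = "ACGT"
rng = np.random.default_rng(0)

# Hypothetical mean current level per K-mer ("random response" assumption).
kmer_level = {kmer: rng.normal(0.0, 1.0)
              for kmer in itertools.product(BASES, repeat=K)}

def nanopore_channel(seq, p_del=0.1, noise_std=0.3):
    """Toy channel: slide a K-base window over seq (inter-symbol interference),
    emit one noisy current sample per window, and drop each sample
    independently with probability p_del (deletions)."""
    samples = []
    for i in range(len(seq) - K + 1):
        if rng.random() < p_del:                  # deletion event: no output
            continue
        level = kmer_level[tuple(seq[i:i + K])]   # ISI: depends on K bases
        samples.append(level + rng.normal(0.0, noise_std))  # noisy response
    return np.array(samples)

dna = "".join(rng.choice(list(BASES), size=100))
trace = nanopore_channel(dna)
print(f"{len(dna)} input bases -> {len(trace)} current samples")
```

Under such a model, the information extraction capacity studied in the paper can be thought of as the maximum mutual information rate between the input base sequence and the output current trace; a base-caller's accuracy can then be benchmarked against that limit.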
Pages: 2458-2462
Page count: 5