Non-parametric and semi-parametric support estimation using SEquential RESampling random walks on biomolecular sequences

Cited by: 1
Authors
Wang, Wei [1]
Smith, Jack [1]
Hejase, Hussein A. [2]
Liu, Kevin J. [1]
Affiliations
[1] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48824 USA
[2] Cold Spring Harbor Lab, Simons Ctr Quantitat Biol, POB 100, Cold Spring Harbor, NY 11724 USA
Funding
U.S. National Science Foundation
Keywords
Statistical support; Non-parametric; Semi-parametric; Resampling; Bootstrap; Multiple sequence alignment; Random walk; MULTIPLE; RELIABILITY; ALIGNMENTS;
DOI
10.1186/s13015-020-00167-0
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Non-parametric and semi-parametric resampling procedures are widely used to perform support estimation in computational biology and bioinformatics. Among the most widely used methods in this class is the standard bootstrap method, which consists of random sampling with replacement. While not requiring assumptions about any particular parametric model for resampling purposes, the bootstrap and related techniques assume that sites are independent and identically distributed (i.i.d.). The i.i.d. assumption can be an over-simplification for many problems in computational biology and bioinformatics. In particular, sequential dependence within biomolecular sequences is often an essential biological feature due to biochemical function, evolutionary processes such as recombination, and other factors. To relax the simplifying i.i.d. assumption, we propose a new non-parametric/semi-parametric sequential resampling technique that generalizes "Heads-or-Tails" mirrored inputs, a simple but clever technique due to Landan and Graur. The generalized procedure takes the form of random walks along either aligned or unaligned biomolecular sequences. We refer to our new method as the SERES (or "SEquential RESampling") method. To demonstrate the performance of the new technique, we apply SERES to estimate support for the multiple sequence alignment problem. Using simulated and empirical data, we show that SERES-based support estimation yields comparable or typically better performance compared to state-of-the-art methods.
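The contrast the abstract draws, between i.i.d. resampling of sites and resampling that preserves sequential order, can be illustrated with a minimal Python sketch. This is not the authors' SERES implementation: the function names, the `reverse_prob` parameter, the uniformly random starting site, and the reflect-at-the-ends rule are all assumptions made for illustration, since the abstract does not specify how the walk starts, turns, or terminates.

```python
import random


def bootstrap_columns(n_sites, rng):
    """Standard bootstrap: draw n_sites column indices i.i.d. with replacement."""
    return [rng.randrange(n_sites) for _ in range(n_sites)]


def random_walk_columns(n_sites, length, reverse_prob, rng):
    """Hypothetical sequential resampling: walk along neighboring columns.

    Assumptions (not taken from the source): the walk starts at a uniformly
    random site with a random direction, reverses direction with probability
    reverse_prob before each step, and reflects at both ends of the sequence.
    """
    if n_sites < 2:
        raise ValueError("need at least two sites to walk")
    pos = rng.randrange(n_sites)
    step = rng.choice((-1, 1))
    walk = [pos]
    while len(walk) < length:
        if rng.random() < reverse_prob:
            step = -step  # random reversal
        if not 0 <= pos + step < n_sites:
            step = -step  # reflect at a sequence end
        pos += step
        walk.append(pos)
    return walk


def resample(alignment, indices):
    """Build a replicate by reading the chosen columns from every row."""
    return ["".join(row[i] for i in indices) for row in alignment]


if __name__ == "__main__":
    rng = random.Random(0)
    alignment = ["ACGTACGTAC", "ACGAACGTTC", "ACG-ACGTAC"]
    n = len(alignment[0])
    print(resample(alignment, bootstrap_columns(n, rng)))
    print(resample(alignment, random_walk_columns(n, 2 * n, 0.1, rng)))
```

In the bootstrap replicate, adjacent columns are unrelated draws, so any dependence between neighboring sites is lost. In the walk replicate, consecutive columns are always neighbors in the original alignment, read either forward or backward, which is the property that mirrored ("Heads-or-Tails") inputs also preserve.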
Pages: 15