Optimal Bayesian Transfer Learning for Count Data

Cited by: 2
Authors
Karbalayghareh, Alireza [1]
Qian, Xiaoning [1]
Dougherty, Edward R. [1]
Affiliations
[1] Texas A&M Univ, Dept Elect & Comp Engn, College Stn, TX 77843 USA
Funding
National Science Foundation (USA)
Keywords
Bayes methods; Cancer; Bioinformatics; Shape; Genomics; Data models; Optimal Bayesian transfer learning; optimal Bayesian classification; transfer learning; DIFFERENTIAL EXPRESSION ANALYSIS; MINIMUM EXPECTED ERROR; OPTIMAL CLASSIFIERS; CLASSIFICATION; FRAMEWORK; DISCRETE;
DOI
10.1109/TCBB.2019.2920981
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
There is often a limited amount of omics data available to design predictive models in biomedicine. Because these omics data come from underlying processes that may share common pathways and disease mechanisms, leveraging available data from relevant source domains may help in designing a more accurate and reliable predictor in a target domain of interest where labeled data are scarce. Here, we focus on developing Bayesian transfer learning methods for analyzing next-generation sequencing (NGS) data to help improve predictions in the target domain. We formulate transfer learning in a fully Bayesian framework and define the relatedness by a joint prior distribution of the model parameters of the source and target domains. The joint prior acts as a bridge across domains, through which the related knowledge of the source data is transferred to the target domain. We focus on RNA-seq discrete count data, which are often overdispersed. To model them appropriately, we consider the Negative Binomial model and propose an Optimal Bayesian Transfer Learning (OBTL) classifier that minimizes the expected classification error in the target domain. We evaluate the performance of the OBTL classifier on both synthetic data and cancer data from The Cancer Genome Atlas (TCGA).
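As a small illustration of why the abstract calls for a Negative Binomial model for overdispersed counts, the sketch below computes the mean and variance of a Negative Binomial distribution numerically and shows that the variance exceeds the mean. The shape/mean parameterization (`r`, `mu`), the function names, and the numeric values are assumptions chosen for illustration; they are not taken from the paper and this is not the OBTL classifier itself.

```python
import math


def nb_log_pmf(k, r, mu):
    """Log-probability of count k under a Negative Binomial with
    shape r and mean mu (hypothetical parameterization for this sketch)."""
    p = r / (r + mu)  # success probability in the (r, p) form
    return (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log1p(-p)
    )


def nb_moments(r, mu, k_max=2000):
    """Numerically sum the pmf over 0..k_max-1 to get total mass, mean, variance."""
    probs = [math.exp(nb_log_pmf(k, r, mu)) for k in range(k_max)]
    total = sum(probs)
    mean = sum(k * pk for k, pk in enumerate(probs))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))
    return total, mean, var


# Illustrative values only: shape r = 2, mean mu = 10.
total, mean, var = nb_moments(r=2.0, mu=10.0)

# Closed form for this parameterization: Var = mu + mu^2 / r.
closed_form_var = 10.0 + 10.0 ** 2 / 2.0

print(f"total mass = {total:.6f}, mean = {mean:.4f}, variance = {var:.4f}")
```

A Poisson model would force the variance to equal the mean; under the parameterization assumed here the variance is `mu + mu**2 / r`, so a smaller `r` gives stronger overdispersion.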
Pages: 644-655
Number of pages: 12