Improving Bayesian credibility intervals for classifier error rates using maximum entropy empirical priors

Cited by: 4
Authors
Gustafsson, Mats G. [1]
Waman, Mikael [1, 2, 3]
Bolin, Ulrika Wickenberg [1]
Goransson, Hanna [1]
Fryknas, M. [1]
Andersson, Claes R. [1]
Isaksson, Anders [1]
Affiliations
[1] Uppsala Univ, Dept Med Sci, Acad Hosp, S-75185 Uppsala, Sweden
[2] Fraunhofer Chalmers Res Ctr Ind Math, SE-41288 Gothenburg, Sweden
[3] Univ Oxford, Comp Lab, Computat Biol Grp, Oxford OX1 3QD, England
Funding
Swedish Research Council
Keywords
Classifier design; Performance evaluation; Small sample learning; Decision support system; Diagnosis; Prognosis; LOGISTIC-REGRESSION; INFORMATION-THEORY; DECISION-SUPPORT; MICROARRAY DATA; PREDICTION; SELECTION; MODEL; VALIDATION; DIAGNOSIS; VARIANCE;
DOI
10.1016/j.artmed.2010.02.004
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Objective: Successful use of classifiers that learn to make decisions from a set of patient examples requires robust methods for performance estimation. Many promising approaches for determining an upper bound on the error rate of a single classifier have recently been reported, but the Bayesian credibility interval (CI) obtained from a conventional holdout test still delivers one of the tightest bounds. The conventional Bayesian CI becomes unacceptably large in real-world applications where the test set contains fewer than a few hundred examples. The source of this problem is the fact that the CI is determined exclusively by the result on the test examples. In other words, the uniform prior density employed provides no information at all, since it reflects a complete lack of prior knowledge about the unknown error rate. Therefore, the aim of the study reported here was to investigate a maximum entropy (ME) based approach to improved prior knowledge and Bayesian CIs, demonstrating its relevance for biomedical research and clinical practice. Method and material: It is demonstrated how a refined non-uniform prior density can be obtained by means of the ME principle, using empirical results from a few designs and tests on non-overlapping sets of examples. Results: Experimental results show that ME-based priors improve the CIs when applied to four quite different simulated data sets and two real-world data sets. Conclusions: An empirically derived ME prior seems promising for improving the Bayesian CI for the unknown error rate of a designed classifier. (C) 2010 Elsevier B.V. All rights reserved.
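The conventional holdout interval described in the abstract can be illustrated with a minimal sketch. It assumes the standard Beta-binomial setting: with k errors on n test examples and a uniform prior, the posterior for the error rate is Beta(k + 1, n - k + 1). The function names, the equal-tailed interval, and the integer pseudo-count prior (prior_a, prior_b) are assumptions made for illustration only; in particular, the pseudo-count prior is a simple stand-in for an informative prior and is not the maximum entropy prior developed in the paper.

```python
from math import comb


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) for positive integers a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    Intended for small test sets (a + b up to roughly 1000), where
    the binomial coefficients still fit in a float.
    """
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1.0 - x) ** (m - j) for j in range(a, m + 1))


def beta_quantile_int(p, a, b, tol=1e-10):
    """Quantile of Beta(a, b) found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf_int(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def error_rate_interval(errors, n_test, level=0.95, prior_a=1, prior_b=1):
    """Equal-tailed Bayesian credibility interval for the error rate.

    prior_a = prior_b = 1 is the uniform prior. Other integer values are
    hypothetical pseudo-counts, used here only as a stand-in for an
    informative prior.
    """
    a = prior_a + errors
    b = prior_b + n_test - errors
    tail = (1.0 - level) / 2.0
    return beta_quantile_int(tail, a, b), beta_quantile_int(1.0 - tail, a, b)


if __name__ == "__main__":
    # Same observed error proportion (10%) at three test set sizes.
    for errors, n_test in [(2, 20), (5, 50), (20, 200)]:
        low, high = error_rate_interval(errors, n_test)
        print(f"n={n_test:4d} uniform prior: [{low:.3f}, {high:.3f}]")

    # Hypothetical informative prior worth 20 pseudo-examples with 2 errors.
    low, high = error_rate_interval(2, 20, prior_a=2, prior_b=18)
    print(f"n=  20 informative prior: [{low:.3f}, {high:.3f}]")
```

Running the script prints the interval for each test set size; the interval narrows as n grows, and for n = 20 the informative stand-in prior yields a narrower interval than the uniform prior, which is the effect the abstract attributes to a better prior.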
Pages: 93-104
Page count: 12
Related papers
50 in total
  • [31] Empirical estimation of sequencing error rates using smoothing splines
    Zhu, Xuan
    Wang, Jian
    Peng, Bo
    Shete, Sanjay
    BMC BIOINFORMATICS, 2016, 17
  • [32] EmpPrior: using outside empirical data to inform branch-length priors for Bayesian phylogenetics
    Andersen, John J.
    Nelson, Bradley J.
    Brown, Jeremy M.
    BMC BIOINFORMATICS, 2016, 17
  • [33] Maximum entropy method for calculating bit error rates of digital optical fibre communication system
    Shen, W.J.
    Sun, J.H.
    Singapore ICCS '90 - Conference Proceedings, 1990,
  • [34] EmpPrior: using outside empirical data to inform branch-length priors for Bayesian phylogenetics
    John J. Andersen
    Bradley J. Nelson
    Jeremy M. Brown
    BMC Bioinformatics, 17
  • [35] Maximum entropy and least square error minimizing procedures for estimating missing conditional probabilities in Bayesian networks
    Pendharkar, Parag C.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2008, 52 (07) : 3583 - 3602
  • [36] The Prediction and Error Correction of Physiological Sign During Exercise Using Bayesian Combined Predictor and Naive Bayesian Classifier
    Zhang, Haibin
    Wen, Bo
    Liu, Jiajia
    Zeng, Yingming
    IEEE SYSTEMS JOURNAL, 2019, 13 (04): : 4410 - 4420
  • [37] Entropy-based air quality monitoring network optimization using NINP and Bayesian maximum entropy
    Ali Haddadi
    Mohammad Reza Nikoo
    Banafsheh Nematollahi
    Ghazi Al-Rawas
    Malik Al-Wardy
    Mehdi Toloo
    Amir H. Gandomi
    Environmental Science and Pollution Research, 2023, 30 : 84110 - 84125
  • [38] Shrinking effect in the Bayesian analysis of the GGE model using maximum entropy priori
    Oliveira, Luciano A.
    Silva, Carlos P.
    Silva, Alessandra Q.
    Mendes, Cristian T. E.
    Nuvunga, Joel J.
    Bueno Filho, Julio S. S.
    SIGMAE, 2023, 12 (01): : 158 - 171
  • [39] Entropy-based air quality monitoring network optimization using NINP and Bayesian maximum entropy
    Haddadi, Ali
    Nikoo, Mohammad Reza
    Nematollahi, Banafsheh
    Al-Rawas, Ghazi
    Al-Wardy, Malik
    Toloo, Mehdi
    Gandomi, Amir H.
    ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH, 2023, 30 (35) : 84110 - 84125
  • [40] Improving the stability of bivariate correlations using informative Bayesian priors: a Monte Carlo simulation study
    Delfin, Carl
    FRONTIERS IN PSYCHOLOGY, 2023, 14