The use of the area under the ROC curve in the evaluation of machine learning algorithms

Cited by: 4663
Author
Bradley, AP [1]
Affiliation
[1] UNIV QUEENSLAND, DEPT ELECT & COMP ENGN, COOPERAT RES CTR SENSOR SIGNAL & INFORMAT PROC, ST LUCIA, QLD 4072, AUSTRALIA
Keywords
the ROC curve; the area under the ROC curve (AUC); accuracy measures; cross-validation; Wilcoxon statistic; standard error
DOI
10.1016/S0031-3203(96)00142-2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multi-layer Perceptron, k-Nearest Neighbours, and a Quadratic Discriminant Function) on six "real world" medical diagnostics data sets. We compare AUC with the more conventional overall accuracy and find that AUC exhibits a number of desirable properties when compared to overall accuracy: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreased as both AUC and the number of test samples increased; independence of the decision threshold; and invariance to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall accuracy for "single number" evaluation of machine learning algorithms. © 1997 Pattern Recognition Society.
Pages: 1145-1159
Page count: 15
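
The abstract and keywords link the AUC to the Wilcoxon statistic and to a standard error that shrinks as the AUC and the test-set size grow. The sketch below illustrates both ideas under stated assumptions: the AUC is computed as the normalised Wilcoxon-Mann-Whitney statistic, and the standard error uses the Hanley-McNeil (1982) approximation, which is a common choice but is assumed here because this record does not say which formula the paper uses. The function names and the scores are hypothetical, made up for illustration only.

```python
import math


def auc_wilcoxon(pos_scores, neg_scores):
    """AUC as the normalised Wilcoxon-Mann-Whitney statistic.

    Estimates P(score of a positive > score of a negative),
    counting ties as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def auc_standard_error(auc, n_pos, n_neg):
    """Hanley-McNeil approximation to the standard error of the AUC.

    Assumption: this formula is chosen for illustration; the record
    above does not state the exact estimator used in the paper.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


# Hypothetical classifier scores (not data from the paper).
positives = [0.9, 0.8, 0.7, 0.4]
negatives = [0.6, 0.5, 0.3, 0.2]

auc = auc_wilcoxon(positives, negatives)
print(f"AUC = {auc:.3f}")
print(f"SE (4 vs 4 samples)   = {auc_standard_error(auc, 4, 4):.4f}")
print(f"SE (40 vs 40 samples) = {auc_standard_error(auc, 40, 40):.4f}")
```

Because the statistic only compares the ranks of positive and negative scores, it needs no decision threshold, and replicating one class (for example `negatives * 3`) leaves the value unchanged, which is one way to see the invariance to class priors mentioned in the abstract.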