Sequential Ensemble Learning for Outlier Detection: A Bias-Variance Perspective

Cited by: 0
Authors
Rayana, Shebuti [1]
Zhong, Wen [1]
Akoglu, Leman [2]
Affiliations
[1] SUNY Stony Brook, Stony Brook, NY 11794 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Funding
U.S. National Science Foundation
Keywords
DOI
10.1109/ICDM.2016.117
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Ensemble methods for classification have been used effectively for decades, whereas ensembles for outlier detection have only been studied recently. In this work, we design a new ensemble approach for outlier detection in multi-dimensional point data that improves accuracy by reducing error through both bias and variance, treating outlier detection as a binary classification task with unobserved labels. Specifically, we propose a sequential ensemble approach called CARE that employs a two-phase aggregation of intermediate results in each iteration to reach the final outcome. Unlike existing outlier ensembles, CARE incorporates both sequential and parallel building blocks to reduce bias as well as variance by (i) successively eliminating outliers from the original dataset to build a better data model on which outlierness is estimated (sequentially), and (ii) combining the results from individual base detectors and across iterations (in parallel). Through extensive experiments on 16 real-world datasets, mainly from the UCI machine learning repository [1], we show that CARE performs significantly better than, or at least comparably to, the individual baselines as well as existing state-of-the-art outlier ensembles.
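To make the two-phase idea concrete, the sketch below is a hypothetical, simplified rendering of a CARE-style sequential ensemble, not the authors' implementation. It assumes scikit-learn's LocalOutlierFactor as the base detector, averages scores across detectors within an iteration (the parallel step), prunes the current top-scoring points from the model-building set before the next iteration (the sequential step), and aggregates scores across iterations. All function and parameter names (sequential_ensemble_scores, prune_frac, ks) are illustrative.

```python
# Hypothetical sketch of a CARE-style sequential ensemble (not the authors' code).
# Parallel step: average scores from several base detectors.
# Sequential step: drop the most outlying points before refitting the data model.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

def sequential_ensemble_scores(X, n_iters=5, ks=(5, 10, 20), prune_frac=0.05):
    """Return one outlier score per row of X (higher = more outlying)."""
    n = X.shape[0]
    keep = np.ones(n, dtype=bool)           # points still used to build the data model
    score_sum = np.zeros(n)
    for _ in range(n_iters):
        iter_scores = np.zeros(n)
        for k in ks:                         # parallel base detectors (LOF with varied k)
            lof = LocalOutlierFactor(n_neighbors=k, novelty=True)
            lof.fit(X[keep])                 # model built on the pruned ("cleaner") data
            iter_scores += -lof.score_samples(X)   # score ALL points; higher = more outlying
        iter_scores /= len(ks)               # average across base detectors
        score_sum += iter_scores
        # sequentially remove the current top outliers from the model-building set
        n_prune = max(1, int(prune_frac * n))
        worst = np.argsort(-iter_scores)[:n_prune]
        keep[worst] = False
    return score_sum / n_iters               # aggregate across iterations

# Example usage on synthetic data with a few injected outliers
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 0.5, (5, 2))])
scores = sequential_ensemble_scores(X)
print(np.argsort(-scores)[:5])               # indices of the 5 most outlying points
```

The actual CARE algorithm uses a more refined two-phase aggregation than the plain averaging and fixed-fraction pruning shown here; this sketch only conveys the sequential/parallel structure described in the abstract.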
Pages: 1167-1172
Number of pages: 6