Fast One-class Classification using Class Boundary-preserving Random Projections

Cited: 4
Authors
Bhattacharya, Arindam [1 ]
Varambally, Sumanth [2 ]
Bagchi, Amitabha [1 ]
Bedathur, Srikanta [1 ]
Affiliations
[1] IIT Delhi, Dept Comp Sci, Delhi, India
[2] IIT Delhi, Dept Math, Delhi, India
Source
KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2021
Keywords
one class classification; ensemble classifier; random projection; kernel based method;
DOI
10.1145/3447548.3467440
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Several applications, like malicious URL detection and web spam detection, require classification on very high-dimensional data. In such cases, anomalous data is hard to find but normal data is easily available, so it is increasingly common to use a one-class classifier (OCC). Unfortunately, most OCC algorithms cannot scale to datasets with extremely high dimensions. In this paper, we present Fast Random projection-based One-Class Classification (FROCC), an extremely efficient, scalable and easily parallelizable method for one-class classification with provable theoretical guarantees. Our method is based on the simple idea of transforming the training data by projecting it onto a set of random unit vectors that are chosen uniformly and independently from the unit sphere, and bounding the regions based on the separation of the data. FROCC can be naturally extended with kernels. We provide a new theoretical framework to prove that FROCC generalizes well, in the sense that it is stable and has low bias for some parameter settings. We then develop a fast, scalable approximation of FROCC that uses vectorization and exploits data sparsity and parallelism, yielding a new implementation called ParDFROCC. ParDFROCC achieves up to 2 percentage points better ROC than the next best baseline, with up to 12x speedup in training and test times over a range of state-of-the-art benchmarks for the OCC task.
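To make the projection-and-bounding idea concrete, below is a minimal NumPy sketch of the approach the abstract describes. It is an illustrative assumption rather than the authors' implementation: the class name (SimpleProjectionOCC), the n_projections parameter, and the use of a single [min, max] interval per projection with a strict all-projections inlier rule are simplifications; the actual FROCC algorithm splits each projection into epsilon-separated intervals, can be kernelized, and ParDFROCC further adds vectorized, sparsity-aware, parallel execution.

import numpy as np

class SimpleProjectionOCC:
    """Illustrative sketch of random-projection one-class classification.

    Simplification: one [min, max] interval per projection; FROCC itself
    uses epsilon-separated intervals and optional kernels.
    """

    def __init__(self, n_projections=100, random_state=0):
        self.n_projections = n_projections
        self.rng = np.random.default_rng(random_state)

    def fit(self, X):
        n_features = X.shape[1]
        # Random unit vectors chosen uniformly and independently from the
        # unit sphere (normalized Gaussian vectors).
        W = self.rng.standard_normal((self.n_projections, n_features))
        self.W_ = W / np.linalg.norm(W, axis=1, keepdims=True)
        # Project the training data and record the occupied range along
        # each random direction.
        P = X @ self.W_.T                      # (n_samples, n_projections)
        self.lo_, self.hi_ = P.min(axis=0), P.max(axis=0)
        return self

    def predict(self, X):
        # +1 (inlier) only if the point falls inside the training range
        # along every projection; otherwise -1 (outlier). More projections
        # give a tighter boundary but may reject borderline inliers.
        P = X @ self.W_.T
        inside = (P >= self.lo_) & (P <= self.hi_)
        return np.where(inside.all(axis=1), 1, -1)

# Toy usage: train on "normal" Gaussian data, then test on a mix of fresh
# normal points and points shifted far from the training distribution.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_train = rng.normal(size=(500, 50))
    X_test = np.vstack([rng.normal(size=(5, 50)),
                        rng.normal(loc=8.0, size=(5, 50))])
    clf = SimpleProjectionOCC(n_projections=100).fit(X_train)
    # The last five (shifted) points should be flagged -1; the strict rule
    # may also reject a few borderline inliers among the first five.
    print(clf.predict(X_test))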
Pages: 66-74
Number of pages: 9
Related Papers
50 records in total
  • [31] Feature extraction for one-class classification
    Tax, DMJ
    Müller, KR
ARTIFICIAL NEURAL NETWORKS AND NEURAL INFORMATION PROCESSING - ICANN/ICONIP 2003, 2003, 2714 : 342 - 349
  • [32] One-class classification with Gaussian processes
    Kemmler, Michael
    Rodner, Erik
    Wacker, Esther-Sabrina
    Denzler, Joachim
    PATTERN RECOGNITION, 2013, 46 (12) : 3507 - 3518
  • [33] Kernel whitening for one-class classification
    Tax, DMJ
    Juszczak, P
    PATTERN RECOGNITION WITH SUPPORT VECTOR MACHINES, PROCEEDINGS, 2002, 2388 : 40 - +
  • [34] Instance reduction for one-class classification
    Krawczyk, Bartosz
    Triguero, Isaac
    Garcia, Salvador
    Wozniak, Michal
    Herrera, Francisco
    KNOWLEDGE AND INFORMATION SYSTEMS, 2019, 59 (03) : 601 - 628
  • [35] Kernel whitening for one-class classification
    Tax, DMJ
    Juszczak, P
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2003, 17 (03) : 333 - 347
  • [36] One-Class Classification with Gaussian Processes
    Kemmler, Michael
    Rodner, Erik
    Denzler, Joachim
    COMPUTER VISION - ACCV 2010, PT II, 2011, 6493 : 489 - 500
  • [37] One-class SVMs for document classification
    Manevitz, LM
    Yousef, M
    JOURNAL OF MACHINE LEARNING RESEARCH, 2002, 2 (02) : 139 - 154
  • [39] Dynamic hypersphere SVDD without describing boundary for one-class classification
    Wang, Jianlin
    Liu, Weimin
    Qiu, Kepeng
    Xiong, Huan
    Zhao, Liqiang
    NEURAL COMPUTING & APPLICATIONS, 2019, 31 (08): : 3295 - 3305
  • [40] Dissimilarity-Preserving Representation Learning for One-Class Time Series Classification
    Mauceri, Stefano
    Sweeney, James
    Nicolau, Miguel
    McDermott, James
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 13951 - 13962