Fast One-class Classification using Class Boundary-preserving Random Projections

Cited by: 4
Authors
Bhattacharya, Arindam [1]
Varambally, Sumanth [2]
Bagchi, Amitabha [1]
Bedathur, Srikanta [1]
Affiliations
[1] IIT Delhi, Dept Comp Sci, Delhi, India
[2] IIT Delhi, Dept Math, Delhi, India
Source
KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2021
Keywords
one class classification; ensemble classifier; random projection; kernel based method
DOI
10.1145/3447548.3467440
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Several applications, such as malicious URL detection and web spam detection, require classification on very high-dimensional data. In such cases, anomalous data is hard to find but normal data is easily available. As such, it is increasingly common to use a one-class classifier (OCC). Unfortunately, most OCC algorithms cannot scale to datasets with extremely high dimensions. In this paper, we present Fast Random projection-based One-Class Classification (FROCC), an extremely efficient, scalable and easily parallelizable method for one-class classification with provable theoretical guarantees. Our method is based on the simple idea of transforming the training data by projecting it onto a set of random unit vectors that are chosen uniformly and independently from the unit sphere, and bounding the regions based on separation of the data. FROCC can be naturally extended with kernels. We provide a new theoretical framework to prove that FROCC generalizes well in the sense that it is stable and has low bias for some parameter settings. We then develop a fast scalable approximation of FROCC using vectorization, exploiting data sparsity and parallelism to develop a new implementation called ParDFROCC. ParDFROCC achieves up to 2 percentage points better ROC than the next best baseline, with up to 12x speedup in training and test times over a range of state-of-the-art benchmarks for the OCC task.
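The projection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the gap threshold `eps`, and the rule of merging sorted projections whose gap is at most `eps` into intervals are assumptions made here for illustration, and the sketch omits the kernel extension and the sparse, parallel ParDFROCC optimizations.

```python
import numpy as np

def sample_unit_vectors(m, d, rng):
    """Draw m directions uniformly from the unit sphere in R^d
    by normalizing standard Gaussian vectors."""
    w = rng.standard_normal((m, d))
    return w / np.linalg.norm(w, axis=1, keepdims=True)

def fit_intervals(X, W, eps):
    """For each direction, sort the projected training points and merge
    neighbours whose gap is <= eps into closed intervals (assumed rule)."""
    intervals = []
    for proj in (X @ W.T).T:
        p = np.sort(proj)
        breaks = np.flatnonzero(np.diff(p) > eps)  # gaps that split intervals
        starts = np.concatenate(([p[0]], p[breaks + 1]))
        ends = np.concatenate((p[breaks], [p[-1]]))
        intervals.append((starts, ends))
    return intervals

def predict_inlier(X, W, intervals):
    """Accept a point only if its projection lies inside some training
    interval along every direction."""
    P = X @ W.T
    inlier = np.ones(len(X), dtype=bool)
    for j, (starts, ends) in enumerate(intervals):
        # Index of the last interval whose start is <= the projection.
        idx = np.searchsorted(starts, P[:, j], side="right") - 1
        inside = (idx >= 0) & (P[:, j] <= ends[np.clip(idx, 0, None)])
        inlier &= inside
    return inlier
```

Under these assumptions, a model is fitted by calling `sample_unit_vectors` once and then `fit_intervals` on the normal-class training matrix; each direction is processed independently, which is what makes this kind of scheme straightforward to vectorize or parallelize.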
Pages: 66 - 74
Page count: 9
Related papers
50 in total
  • [41] One-Class Classification by Combining Density and Class Probability Estimation
    Hempstalk, Kathryn
    Frank, Eibe
    Witten, Ian H.
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PART I, PROCEEDINGS, 2008, 5211 : 505 - 519
  • [42] One-Class Classification of Mammograms Using Trace Transform Functionals
    Ganesan, Karthikeyan
    Acharya, U. Rajendra
    Chua, Chua Kuang
    Lim, Choo Min
    Abraham, K. Thomas
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2014, 63 (02) : 304 - 311
  • [43] Malware Detection for Internet of Things Using One-Class Classification
    Shi, Tongxin
    McCann, Roy A.
    Huang, Ying
    Wang, Wei
    Kong, Jun
    SENSORS, 2024, 24 (13)
  • [44] CONTINUAL LEARNING THROUGH ONE-CLASS CLASSIFICATION USING VAE
    Wiewel, Felix
    Brendle, Andreas
    Yang, Bin
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3307 - 3311
  • [45] Fuzzy One-Class Classification Model Using Contamination Neighborhoods
    Utkin, Lev V.
    ADVANCES IN FUZZY SYSTEMS, 2012, 2012
  • [46] INTRUSION DETECTION IN SCADA SYSTEMS USING ONE-CLASS CLASSIFICATION
    Nader, Patric
    Honeine, Paul
    Beauseroy, Pierre
    2013 PROCEEDINGS OF THE 21ST EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2013,
  • [47] Anomaly Detection using Clustered Deep One-Class Classification
    Kim, Younghwan
    Kim, Huy Kang
    2020 15TH ASIA JOINT CONFERENCE ON INFORMATION SECURITY (ASIAJCIS 2020), 2020, : 151 - 157
  • [48] Steganography anomaly detection using simple one-class classification
    Rodriguez, Benjamin M.
    Peterson, Gilbert L.
    Agaian, Sos S.
    MOBILE MULTIMEDIA/IMAGE PROCESSING FOR MILITARY AND SECURITY APPLICATIONS 2007, 2007, 6579
  • [49] Locality Preserving One-Class Support Vector Machine
    Wang, Xiaoming
    Tian, Yong
    Yang, Xiaohuan
    Du, Yajun
    INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING: BIG DATA AND MACHINE LEARNING TECHNIQUES, ISCIDE 2015, PT II, 2015, 9243 : 76 - 85
  • [50] One-class classification for oil spill detection
    Gambardella, Attilio
    Giacinto, Giorgio
    Migliaccio, Maurizio
    Montali, Andrea
    PATTERN ANALYSIS AND APPLICATIONS, 2010, 13 (03) : 349 - 366