Leveraging an Isolation Forest to Anomaly Detection and Data Clustering

Cited by: 4
Authors
Yepmo, Veronne [1]
Smits, Gregory [2]
Lesot, Marie-Jeanne [3]
Pivert, Olivier [1]
Affiliations
[1] Univ Rennes, IRISA, Lannion, France
[2] Lab STICC, IMT Atlantique, Brest, France
[3] Sorbonne Univ, LIP6, Paris, France
Keywords
Anomaly/outlier detection; Isolation forest; Clustering; FUZZY; ALGORITHM; NOISE;
DOI
10.1016/j.datak.2024.102302
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Understanding why some points in a data set are considered anomalies requires taking into account the structure of the regular points. Whereas many machine learning methods are dedicated either to identifying anomalies or to identifying the inner structure of the data, a solution is introduced that addresses both tasks using a single data model, a variant of an isolation forest. The initial algorithm for constructing an isolation forest is revisited to preserve the inner structure of the data without affecting the efficiency of the outlier detection. Experiments conducted on both synthetic and real-world data sets show that, in addition to improving the detection of abnormal data points, the proposed variant of isolation forest allows for a reconstruction of the subspaces of high density. Therefore, the proposed variant can serve as a basis for a unified approach to detecting global and local anomalies, which is a necessary condition for then providing users with informative descriptions of the data.
Pages: 16