Classifying real-world data with the DDα-procedure

Cited by: 0
Authors
Mozharovskyi, Pavlo [1]
Mosler, Karl [1]
Lange, Tatjana [2]
Affiliations
[1] Univ Cologne, Albertus Magnus Pl, D-50923 Cologne, Germany
[2] Hsch Merseburg, D-06217 Merseburg, Germany
Keywords
Classification; Supervised learning; Alpha-procedure; Data depth; Spatial depth; Projection depth; Random Tukey depth; Outsiders; Features; Regression
DOI
10.1007/s11634-014-0180-8
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
The DDα-classifier, a fast, nonparametric, and very robust procedure, is described and applied to fifty classification problems covering a broad spectrum of real-world data. The procedure first transforms the data from their original property space into a depth space, which is a low-dimensional unit cube, and then separates them by a projective-invariant procedure called the α-procedure. To each data point the transformation assigns its depth values with respect to the given classes. Several alternative depth notions (spatial depth, Mahalanobis depth, projection depth, and Tukey depth, the latter two being approximated by univariate projections) are used in the procedure and compared with regard to their average error rates. With the Tukey depth, which fits the shape of the distributions best and is most robust, 'outsiders' appear, that is, data points having zero depth in all classes. They need additional treatment for classification. Evidence is also given on the dimension of the extended feature space needed for linear separation. The DDα-procedure is available as an R-package.
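The depth transformation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' R implementation: it uses only the Mahalanobis depth, is restricted to two-dimensional data to avoid external dependencies, and replaces the α-procedure with a plain maximum-depth rule as a stand-in. All function names and the toy data are hypothetical.

```python
def fit_class(points):
    """Estimate the mean and inverse covariance of a list of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # assumed non-zero (non-degenerate class)
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv


def mahalanobis_depth(point, model):
    """Mahalanobis depth 1 / (1 + d^2), a value in (0, 1]."""
    (mx, my), inv = model
    dx, dy = point[0] - mx, point[1] - my
    d2 = inv[0][0] * dx * dx + 2 * inv[0][1] * dx * dy + inv[1][1] * dy * dy
    return 1.0 / (1.0 + d2)


def depth_transform(point, models):
    """Map a point to the depth space: one depth value per class."""
    return tuple(mahalanobis_depth(point, m) for m in models)


def classify_max_depth(point, models):
    """Stand-in separator (NOT the alpha-procedure): pick the deepest class."""
    depths = depth_transform(point, models)
    return max(range(len(depths)), key=depths.__getitem__)


# Hypothetical toy training data for two classes.
class_a = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
class_b = [(5, 5), (6, 5), (5, 6), (6, 6), (5.5, 5.5)]
models = [fit_class(class_a), fit_class(class_b)]

print(depth_transform((0.4, 0.6), models))     # coordinates in the unit square
print(classify_max_depth((0.4, 0.6), models))  # index of the assigned class
```

With two classes the depth space is the unit square, and each point is represented by its pair of depth values. Because the Mahalanobis depth is strictly positive everywhere, this sketch never produces the 'outsiders' that the abstract reports for the Tukey depth.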
Pages: 287-314
Number of pages: 28