Interpretable regression trees using conformal prediction

Cited by: 28
Authors
Johansson, Ulf [1, 2]
Linusson, Henrik [2]
Lofstrom, Tuve [1, 2]
Bostrom, Henrik [3]
Affiliations
[1] Jonkoping Univ, Dept Comp Sci & Informat, Jonkoping, Sweden
[2] Univ Boras, Dept Informat Technol, Boras, Sweden
[3] KTH Royal Inst Technol, Sch Elect Engn & Comp Sci, Stockholm, Sweden
Keywords
Conformal prediction; Interpretability; Predictive regression; Regression trees; Algorithms
DOI
10.1016/j.eswa.2017.12.041
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A key property of conformal predictors is that they are valid, i.e., their error rate on novel data is bounded by a preset level of confidence. For regression, this is achieved by turning the point predictions of the underlying model into prediction intervals. Thus, the most important performance metric for evaluating conformal regressors is not the error rate, but the size of the prediction intervals, where models generating smaller (more informative) intervals are said to be more efficient. State-of-the-art conformal regressors typically utilize two separate predictive models: the underlying model providing the center point of each prediction interval, and a normalization model used to scale each prediction interval according to the estimated level of difficulty for each test instance. When using a regression tree as the underlying model, this approach may cause test instances falling into a specific leaf to receive different prediction intervals. This clearly deteriorates the interpretability of a conformal regression tree compared to a standard regression tree, since the path from the root to a leaf can no longer be translated into a rule explaining all predictions in that leaf. In fact, the model cannot even be interpreted on its own, i.e., without reference to the corresponding normalization model. Current practice effectively presents two options for constructing conformal regression trees: to employ a (global) normalization model, and thereby sacrifice interpretability; or to avoid normalization, and thereby sacrifice both efficiency and individualized predictions. In this paper, two additional approaches are considered, both employing local normalization: the first approach estimates the difficulty by the standard deviation of the target values in each leaf, while the second approach employs Mondrian conformal prediction, which results in regression trees where each rule (path from root node to leaf node) is independently valid. 
An empirical evaluation shows that the first approach is as efficient as current state-of-the-art approaches, thus eliminating the efficiency vs. interpretability trade-off present in existing methods. Moreover, it is shown that if a validity guarantee is required for each single rule, as provided by the Mondrian approach, a penalty with respect to efficiency has to be paid, but it is only substantial at very high confidence levels. © 2017 Elsevier Ltd. All rights reserved.
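The two local-normalization ideas described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard split (inductive) conformal setup in which a fitted regression tree has already assigned every calibration instance to a leaf. The function names, the `(leaf_id, y_true, y_pred)` triple format, the `leaf_sigma` dictionary (standard deviation of the training targets per leaf), and the smoothing term `beta` are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def conformal_quantile(scores, confidence):
    """Return the score at rank ceil((n + 1) * confidence) among n calibration scores.

    If that rank exceeds n, there are too few calibration instances for the
    requested confidence and the interval is unbounded (math.inf).
    """
    n = len(scores)
    rank = math.ceil((n + 1) * confidence)
    return math.inf if rank > n else sorted(scores)[rank - 1]


def leaf_std_half_widths(cal, leaf_sigma, confidence, beta=0.01):
    """Sketch of normalization by leaf standard deviation.

    cal: list of (leaf_id, y_true, y_pred) calibration triples.
    leaf_sigma: {leaf_id: std of training targets in that leaf} (assumed given).
    beta: hypothetical smoothing term that avoids division by zero.
    All leaves share one calibration quantile; each leaf scales it by its own
    difficulty estimate, so every instance in a leaf gets the same interval.
    """
    scores = [abs(y - p) / (leaf_sigma[leaf] + beta) for leaf, y, p in cal]
    q = conformal_quantile(scores, confidence)
    return {leaf: q * (sigma + beta) for leaf, sigma in leaf_sigma.items()}


def mondrian_half_widths(cal, confidence):
    """Sketch of the Mondrian variant: each leaf is calibrated on its own residuals."""
    by_leaf = defaultdict(list)
    for leaf, y, p in cal:
        by_leaf[leaf].append(abs(y - p))
    return {leaf: conformal_quantile(res, confidence) for leaf, res in by_leaf.items()}


def interval(y_pred, half_width):
    """Turn a point prediction into a symmetric prediction interval."""
    return (y_pred - half_width, y_pred + half_width)


if __name__ == "__main__":
    # Toy calibration set with two leaves, "A" and "B" (made-up numbers).
    cal = [("A", 10.0, 9.0), ("A", 12.0, 11.5), ("A", 8.0, 10.0),
           ("B", 50.0, 40.0), ("B", 30.0, 42.0), ("B", 45.0, 41.0)]
    print(leaf_std_half_widths(cal, {"A": 1.0, "B": 5.0}, confidence=0.5))
    print(mondrian_half_widths(cal, confidence=0.5))
```

In this sketch the Mondrian variant calibrates each leaf on only its own instances, so a leaf with few calibration instances can return an unbounded interval at a high confidence level. That is one plausible reading of the efficiency penalty the abstract reports for very high confidence levels.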
Pages: 394-404
Page count: 11
Related papers
50 in total
  • [1] Conformal Prediction Using Decision Trees
    Johansson, Ulf
    Bostrom, Henrik
    Lofstrom, Tuve
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 330 - 339
  • [2] Accurate and Interpretable Regression Trees using Oracle Coaching
    Johansson, Ulf
    Sonstrod, Cecilia
    Konig, Rikard
    2014 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING (CIDM), 2014, : 194 - 201
  • [3] Interpretable Quantile Regression by Optimal Decision Trees
    Lemaire, Valentin
    Aglin, Gael
    Nijssen, Siegfried
    ADVANCES IN INTELLIGENT DATA ANALYSIS XXII, PT II, IDA 2024, 2024, 14642 : 210 - 222
  • [4] Prediction of ordinal classes using regression trees
    Kramer, S
    Pfahringer, B
    Widmer, G
    De Groeve, M
    FUNDAMENTA INFORMATICAE, 2001, 47 (1-2) : 1 - 13
  • [5] Identifying Homogeneous and Interpretable Groups for Conformal Prediction
    Gil, Natalia Martinez
    Patel, Dhaval
    Reddy, Chandra
    Ganapavarapu, Giridhar
    Vaculin, Roman
    Kalagnanam, Jayant
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2024, 244 : 2471 - 2485
  • [6] Synergy Conformal Prediction for Regression
    Gauraha, Niharika
    Spjuth, Ola
    PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM), 2021, : 212 - 221
  • [7] Prediction of urban propagation loss using regression trees
    Gerome, D
    1997 IEEE 47TH VEHICULAR TECHNOLOGY CONFERENCE PROCEEDINGS, VOLS 1-3: TECHNOLOGY IN MOTION, 1997, : 1099 - 1102
  • [8] Interpretable and Reliable Rule Classification Based on Conformal Prediction
    Abdelqader, Husam
    Smirnov, Evgueni
    Pont, Marc
    Geijselaers, Marciano
    MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT I, 2023, 1752 : 385 - 401
  • [9] Regression conformal prediction with random forests
    Johansson, Ulf
    Boström, Henrik
    Löfström, Tuve
    Linusson, Henrik
    MACHINE LEARNING, 2014, 97 : 155 - 176
  • [10] Regression Conformal Prediction with Nearest Neighbours
    Papadopoulos, Harris
    Vovk, Vladimir
    Gammerman, Alex
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2011, 40 : 815 - 840