When are Post-hoc Conceptual Explanations Identifiable?

Authors
Leemann, Tobias [1, 2]
Kirchhof, Michael [1]
Rong, Yao [1, 2]
Kasneci, Enkelejda [2]
Kasneci, Gjergji [2]
Affiliations
[1] Univ Tubingen, Tubingen, Germany
[2] Tech Univ Munich, Munich, Germany
Source
Keywords
INDEPENDENT COMPONENT ANALYSIS; NONLINEAR ICA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Interest in understanding and factorizing learned embedding spaces through conceptual explanations is steadily growing. When no human concept labels are available, concept discovery methods search trained embedding spaces for interpretable concepts like object shape or color that can provide post-hoc explanations for decisions. Unlike previous work, we argue that concept discovery should be identifiable, meaning that a number of known concepts can be provably recovered to guarantee reliability of the explanations. As a starting point, we explicitly make the connection between concept discovery and classical methods like Principal Component Analysis and Independent Component Analysis by showing that they can recover independent concepts under non-Gaussian distributions. For dependent concepts, we propose two novel approaches that exploit functional compositionality properties of image-generating processes. Our provably identifiable concept discovery methods substantially outperform competitors on a battery of experiments including hundreds of trained models and dependent concepts, where they exhibit up to 29% better alignment with the ground truth. Our results highlight the strict conditions under which reliable concept discovery without human labels can be guaranteed and provide a formal foundation for the domain. Our code is available online.
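The abstract's remark about classical methods can be illustrated with a toy sketch. The code below is not the authors' method or code; it is a minimal two-dimensional example under stated assumptions: two independent, uniformly distributed (hence non-Gaussian) "concepts" are mixed linearly into an "embedding", and a kurtosis-based rotation search after PCA whitening stands in for Independent Component Analysis. All names (`whiten`, `ica_2d`, `match_scores`, the mixing matrix) are hypothetical.

```python
import numpy as np

def whiten(x):
    """PCA whitening: center the data and rescale it to identity covariance."""
    x = x - x.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(x, rowvar=False))
    return (x @ eigvec) / np.sqrt(eigval)

def excess_kurtosis(y):
    """Column-wise excess kurtosis (zero for a Gaussian)."""
    y = (y - y.mean(axis=0)) / y.std(axis=0)
    return (y ** 4).mean(axis=0) - 3.0

def rotation(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def ica_2d(x, n_angles=360):
    """Toy ICA: whiten, then pick the rotation that maximizes non-Gaussianity."""
    z = whiten(x)
    # Rotations beyond 90 degrees only permute or flip the components.
    angles = np.linspace(0.0, np.pi / 2, n_angles, endpoint=False)
    scores = [np.abs(excess_kurtosis(z @ rotation(a))).sum() for a in angles]
    return z @ rotation(angles[int(np.argmax(scores))])

def match_scores(estimated, true):
    """For each true concept, the best absolute correlation with any component."""
    cross = np.corrcoef(estimated.T, true.T)[:2, 2:]
    return np.abs(cross).max(axis=0)

# Assumed toy setup: independent uniform concepts and an arbitrary linear mixing.
rng = np.random.default_rng(0)
concepts = rng.uniform(-1.0, 1.0, size=(5000, 2))
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])
embeddings = concepts @ mixing.T

pca_scores = match_scores(whiten(embeddings), concepts)
ica_scores = match_scores(ica_2d(embeddings), concepts)
print("whitening only:", pca_scores)
print("whitening + rotation search:", ica_scores)
```

Whitening alone leaves a rotation undetermined, so its components need not line up with the concepts. The rotation search resolves that ambiguity only because the sources are non-Gaussian; for Gaussian sources the kurtosis score would be flat across angles. Recovery is up to permutation and sign, which is why the scores use absolute correlations.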
Pages: 1207-1218
Number of pages: 12