Gradient-Based Instance-Specific Visual Explanations for Object Specification and Object Discrimination

Cited by: 2

Authors:

Zhao, Chenyang [1]

Hsiao, Janet H. [2]

Chan, Antoni B. [1]

Affiliations:

[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China

[2] Hong Kong Univ Sci & Technol, Div Social Sci, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2024 / Vol. 46 / No. 09

Keywords:

Detectors; Visualization; Heat maps; Task analysis; Object detection; Predictive models; Transformers; Deep learning; explainable AI; explaining object detection; gradient-based explanation; human eye gaze; instance-level explanation; knowledge distillation; non-maximum suppression; object discrimination; object specification; NMS;

DOI:

10.1109/TPAMI.2024.3380604

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose the gradient-weighted Object Detector Activation Maps (ODAM), a visual explanation technique for interpreting the predictions of object detectors. Utilizing the gradients of detector targets flowing into the intermediate feature maps, ODAM produces heat maps that show the influence of regions on the detector's decision for each predicted attribute. Compared to previous works on classification activation maps (CAM), ODAM generates instance-specific explanations rather than class-specific ones. We show that ODAM is applicable to one-stage, two-stage, and transformer-based detectors with different types of detector backbones and heads, and produces higher-quality visual explanations than the state-of-the-art in terms of both effectiveness and efficiency. We discuss two explanation tasks for object detection: 1) object specification: what is the important region for the prediction? 2) object discrimination: which object is detected? Aiming at these two aspects, we present a detailed analysis of the visual explanations of detectors and carry out extensive experiments to validate the effectiveness of the proposed ODAM. Furthermore, we investigate user trust in the explanation maps, how well the visual explanations of object detectors agree with human explanations, as measured through human eye gaze, and whether this agreement is related to user trust. Finally, we also propose two applications, ODAM-KD and ODAM-NMS, based on these two abilities of ODAM. ODAM-KD utilizes the object specification of ODAM to generate top-down attention for key predictions and instruct the knowledge distillation of object detection. ODAM-NMS considers the location of the model's explanation for each prediction to distinguish duplicate detected objects. A training scheme, ODAM-Train, is proposed to improve the quality of object discrimination and to help with ODAM-NMS.
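
Illustrative sketch (not part of the indexed record): the abstract describes heat maps built from the gradients of a detector target flowing into an intermediate feature map. The pure-Python snippet below shows one generic way such a gradient-weighted map can be formed for a single predicted attribute. It is an assumption-laden illustration, not the authors' implementation: the element-wise weighting, the ReLU, the max-normalisation, the function name `gradient_weighted_heat_map`, and the toy numbers are all hypothetical choices, and in practice the gradients would come from back-propagating one prediction's score or box coordinate through a real detector.

```python
from typing import List

Grid = List[List[float]]   # H x W map
Tensor = List[Grid]        # C x H x W feature map or gradient

def gradient_weighted_heat_map(features: Tensor, gradients: Tensor) -> Grid:
    """Sum gradient * activation over channels, apply ReLU, scale to [0, 1].

    `gradients` is assumed to hold d(target)/d(features) for ONE detected
    instance's attribute, which is what makes the map instance-specific.
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    heat = [[0.0] * width for _ in range(height)]
    for c in range(channels):
        for y in range(height):
            for x in range(width):
                # Element-wise weighting (assumed): each location keeps
                # its own gradient instead of a channel-wide average.
                heat[y][x] += gradients[c][y][x] * features[c][y][x]
    # Keep only locations with a positive influence on the target.
    heat = [[max(v, 0.0) for v in row] for row in heat]
    peak = max(max(row) for row in heat)
    if peak > 0.0:
        heat = [[v / peak for v in row] for row in heat]
    return heat

# Toy input: 2 channels on a 2 x 2 grid (made-up values).
features = [[[1.0, 2.0], [0.0, 1.0]], [[0.5, 0.0], [1.0, 2.0]]]
gradients = [[[2.0, -1.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 1.0]]]
heat_map = gradient_weighted_heat_map(features, gradients)
print(heat_map)  # [[1.0, 0.0], [0.5, 1.0]]
```

Repeating the computation with gradients taken for a different detected instance would yield a different map from the same feature tensor, which is the sense in which such explanations are instance-specific rather than class-specific.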

Pages: 5967-5985

Page count: 19

50 items in total

[41] MULTIPLE INSTANCE LEARNING USING VISUAL PHRASES FOR OBJECT CLASSIFICATION
Song, Yan
Tian, Qi
Wang, Mengyue
Liu, Heng
Dai, Lirong
2010 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME 2010), 2010, : 649 - 654
[42] Grad-CAM++: Generalized Gradient-based Visual Explanations for Deep Convolutional Networks
Chattopadhay, Aditya
Sarkar, Anirban
Howlader, Prantik
Balasubramanian, Vineeth N.
2018 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2018), 2018, : 839 - 847
[43] Gradient-based Counterfactual Generation for Sparse and Diverse Counterfactual Explanations
Han, Chan Sik
Lee, Keon Myung
38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023, 2023, : 1240 - 1247
[44] Gradient-based Maximally Interfered Retrieval for Domain Incremental 3D Object Detection
Nisar, Barza
Anand, Hruday Vishal Kanna
Waslander, Steven L.
2023 20TH CONFERENCE ON ROBOTS AND VISION, CRV, 2023, : 304 - 311
[45] Weakly Supervised Object Detection via Object-Specific Pixel Gradient
Shen, Yunhang
Ji, Rongrong
Wang, Changhu
Li, Xi
Li, Xuelong
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (12) : 5960 - 5970
[46] Improving Interpretability for Computer-Aided Diagnosis Tools on Whole Slide Imaging with Multiple Instance Learning and Gradient-Based Explanations
Pirovano, Antoine
Heuberger, Hippolyte
Berlemont, Sylvain
Ladjal, Said
Bloch, Isabelle
INTERPRETABLE AND ANNOTATION-EFFICIENT LEARNING FOR MEDICAL IMAGE COMPUTING, IMIMIC 2020, MIL3ID 2020, LABELS 2020, 2020, 12446 : 43 - 53
[47] Instance-specific multi-objective parameter tuning based on fuzzy logic
Ries, Jana
Beullens, Patrick
Salt, David
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2012, 218 (02) : 305 - 315
[48] Long-term Object Tracking with Instance Specific Proposals
Liu, Hao
Hu, Qingyong
Li, Biao
Guo, Yulan
2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 1628 - 1633
[49] Visual object tracking with online weighted chaotic multiple instance learning
Abdechiri, Marjan
Faez, Karim
Amindavar, Hamidreza
NEUROCOMPUTING, 2017, 247 : 16 - 30
[50] INSTANCE FLOW BASED ONLINE MULTIPLE OBJECT TRACKING
Bullinger, Sebastian
Bodensteiner, Christoph
Arens, Michael
2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 785 - 789
