Gradient-Based Instance-Specific Visual Explanations for Object Specification and Object Discrimination

Cited by: 2
Authors
Zhao, Chenyang [1]
Hsiao, Janet H. [2]
Chan, Antoni B. [1]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[2] Hong Kong Univ Sci & Technol, Div Social Sci, Hong Kong, Peoples R China
Keywords
Detectors; Visualization; Heat maps; Task analysis; Object detection; Predictive models; Transformers; Deep learning; explainable AI; explaining object detection; gradient-based explanation; human eye gaze; instance-level explanation; knowledge distillation; non-maximum suppression; object discrimination; object specification; NMS;
DOI
10.1109/TPAMI.2024.3380604
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose the gradient-weighted Object Detector Activation Maps (ODAM), a visual explanation technique for interpreting the predictions of object detectors. Utilizing the gradients of detector targets flowing into the intermediate feature maps, ODAM produces heat maps that show the influence of regions on the detector's decision for each predicted attribute. Compared to previous works on classification activation maps (CAM), ODAM generates instance-specific explanations rather than class-specific ones. We show that ODAM is applicable to one-stage, two-stage, and transformer-based detectors with different types of detector backbones and heads, and produces higher-quality visual explanations than the state-of-the-art in terms of both effectiveness and efficiency. We discuss two explanation tasks for object detection: 1) object specification: what is the important region for the prediction? 2) object discrimination: which object is detected? Aiming at these two aspects, we present a detailed analysis of the visual explanations of detectors and carry out extensive experiments to validate the effectiveness of the proposed ODAM. Furthermore, we investigate user trust in the explanation maps, how well the visual explanations of object detectors agree with human explanations, as measured through human eye gaze, and whether this agreement is related to user trust. Finally, we also propose two applications, ODAM-KD and ODAM-NMS, based on these two abilities of ODAM. ODAM-KD utilizes the object specification of ODAM to generate top-down attention for key predictions and instruct the knowledge distillation of object detection. ODAM-NMS considers the location of the model's explanation for each prediction to distinguish duplicate detections of the same object. A training scheme, ODAM-Train, is proposed to improve the quality of object discrimination and to help with ODAM-NMS.
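The core idea in the abstract, weighting intermediate feature maps by the gradients of one detector target to obtain a heat map, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formula, so the element-wise product followed by a channel sum and ReLU is an assumption, and `toy_target`, `gradient_weighted_heat_map`, and the toy tensors are hypothetical names and values chosen so that the gradient is known in closed form without an autograd library.

```python
# Illustrative sketch only. The combination rule (gradient * activation,
# summed over channels, then ReLU) and every name below are assumptions,
# not taken from the paper.

def toy_target(features, weights):
    """Hypothetical stand-in for one scalar detector output (e.g. a class score).

    It is linear in the features, so its gradient with respect to the
    features is exactly `weights`.
    """
    return sum(
        w * a
        for fmap, wmap in zip(features, weights)
        for frow, wrow in zip(fmap, wmap)
        for a, w in zip(frow, wrow)
    )


def gradient_weighted_heat_map(features, grads):
    """Sum gradient * activation over channels, then clamp negatives to zero."""
    height, width = len(features[0]), len(features[0][0])
    heat = [[0.0] * width for _ in range(height)]
    for fmap, gmap in zip(features, grads):
        for i in range(height):
            for j in range(width):
                heat[i][j] += gmap[i][j] * fmap[i][j]
    return [[max(0.0, v) for v in row] for row in heat]


# Two 2x2 feature channels (channel, row, column).
features = [
    [[1.0, 2.0], [0.0, 1.0]],
    [[0.5, 0.0], [2.0, 1.0]],
]
# Weights of the toy target; also its exact gradient w.r.t. the features.
weights = [
    [[1.0, 0.0], [0.0, -1.0]],
    [[2.0, 0.0], [-1.0, 0.0]],
]
grads = weights

heat_map = gradient_weighted_heat_map(features, grads)
print(heat_map)
```

In this toy case only the top-left location receives a positive score, so it is the only region highlighted for this particular target. Computing one such map per predicted instance and attribute, rather than per class, is what would make the explanation instance-specific in the sense the abstract describes; with a real detector the gradients would come from automatic differentiation.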
Pages: 5967 - 5985
Number of pages: 19
Related papers
50 in total
  • [1] Object Tracking Algorithm Based on Multi-Time-Space Perception and Instance-Specific Proposals
    Sun, Jinping
    Li, Dan
    Cheng, Honglin
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 37 (01): : 655 - 675
  • [2] Object recognition with gradient-based learning
    LeCun, Y
    Haffner, P
    Bottou, L
    Bengio, Y
    SHAPE, CONTOUR AND GROUPING IN COMPUTER VISION, 1999, 1681 : 319 - 345
  • [3] Instance-specific 6-DoF Object Pose Estimation from Minimal Annotations
    Singh, Rohan P.
    Kumagai, Iori
    Gabas, Antonio
    Benallegue, Mehdi
    Yoshiyasu, Yusuke
    Kanehiro, Fumio
    2020 IEEE/SICE INTERNATIONAL SYMPOSIUM ON SYSTEM INTEGRATION (SII), 2020, : 109 - 114
  • [4] Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations
    Yang, Ziyan
    Kafle, Kushal
    Dernoncourt, Franck
    Ordonez, Vicente
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 19165 - 19174
  • [5] Efficient image gradient-based object localisation and recognition
    Tan, TN
    Sullivan, GD
    Baker, KD
    1996 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, PROCEEDINGS, 1996, : 397 - 402
  • [6] Gradient-Based Quantification of Epistemic Uncertainty for Deep Object Detectors
    Riedlinger, Tobias
    Rottmann, Matthias
    Schubert, Marius
    Gottschalk, Hanno
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 3910 - 3920
  • [7] Anisotropic gradient-based filtering for object segmentation in medical images
    Joao, Ana
    Gambaruto, Alberto
    Sequeira, Adelia
    COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING-IMAGING AND VISUALIZATION, 2020, 8 (06): : 621 - 630
  • [8] Visual object tracking based on policy gradient
    Wang K.-H.
    Yin H.-B.
    Huang X.-F.
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2020, 54 (10): : 1923 - 1928 and 1935
  • [9] An Instance to Extend Object-Z Formal Specification
    Hou, Xiaomao
    Ma, Ling
    Wen, Zhicheng
    ADVANCES IN MECHATRONICS, AUTOMATION AND APPLIED INFORMATION TECHNOLOGIES, PTS 1 AND 2, 2014, 846-847 : 1500 - 1504
  • [10] OBJECT PROPERTIES MEDIATING VISUAL OBJECT DISCRIMINATION IN THE CAT
    DAVES, WF
    BOOSTROM, E
    PERCEPTUAL AND MOTOR SKILLS, 1964, 19 (02) : 343 - 350