Towards explainable deep visual saliency models

Cited: 2
Authors
Malladi, Sai Phani Kumar [1 ]
Mukherjee, Jayanta [2 ]
Larabi, Mohamed-Chaker [3 ]
Chaudhury, Santanu [4 ]
Institutions
[1] Indian Inst Technol, Adv Technol Dev Ctr, Kharagpur, India
[2] IIT Kharagpur, Dept Comp Sci & Engn, Kharagpur, India
[3] Univ Poitiers, XLIM UMR CNRS 7252, Poitiers, France
[4] IIT Jodhpur, Dept Comp Sci & Engn, Jodhpur, India
Keywords
Explainable saliency; Human perception; Log-Gabor filters; Color perception; Attention
DOI
10.1016/j.cviu.2023.103782
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks have shown a profound impact on achieving human-level performance in visual saliency prediction. However, it is still unclear how they learn their task and what that means for understanding the human visual system (HVS). In this work, we propose a framework to derive explainable saliency models from their corresponding deep architectures. Specifically, we explain a deep saliency model by understanding four different aspects of it: (1) intermediate activation maps of deep layers, (2) biologically plausible Log-Gabor (LG) filters for salient region identification, (3) the positionally biased behavior of Log-Gabor filters, and (4) the processing of color information, establishing its relevance to the human visual system. We consider four state-of-the-art (SOTA) deep saliency models, namely CMRNet, UNISAL, DeepGaze IIE, and MSI-Net, and interpret them using our proposed framework. We observe that the explainable models perform considerably better than the classical SOTA models. We also find that, after its input layer, CMRNet transforms the input RGB space into a representation very close to the YUV space of a color image. We then discuss the biological plausibility of our framework and its possible anatomical substratum for visual attention. We find a good correlation between components of the HVS and the base operations of the proposed technique. Hence, this generic explainable framework provides a new perspective on the relationship between classical methods/the human visual system and DNN-based ones.
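The abstract's two most concrete technical claims can be illustrated with a short sketch: the standard Log-Gabor radial transfer function used for salient-region identification, and the BT.601 RGB-to-YUV linear map that CMRNet's learned first-layer representation is reported to approximate. This is illustrative code, not taken from the paper; the parameters `f0` and `sigma_ratio` are assumed for the example only.

```python
import numpy as np

# --- Log-Gabor radial transfer function (standard definition) ---
# G(f) = exp(-(ln(f/f0))^2 / (2 * ln(sigma/f0)^2))
# f0 is the centre frequency; sigma_ratio = sigma/f0 controls bandwidth.
# The values below are illustrative, not those used by the paper.
def log_gabor_radial(freq, f0=0.1, sigma_ratio=0.55):
    freq = np.asarray(freq, dtype=float)
    out = np.zeros_like(freq)
    nz = freq > 0                      # a log-Gabor filter is 0 at DC (f = 0)
    out[nz] = np.exp(-(np.log(freq[nz] / f0) ** 2)
                     / (2 * np.log(sigma_ratio) ** 2))
    return out

# --- BT.601 RGB -> YUV linear map, the colour space that CMRNet's ---
# --- post-input-layer representation is reported to be close to   ---
RGB_TO_YUV = np.array([
    [ 0.299,    0.587,    0.114   ],   # Y: luminance
    [-0.14713, -0.28886,  0.436   ],   # U: blue-difference chroma
    [ 0.615,   -0.51499, -0.10001 ],   # V: red-difference chroma
])

def rgb_to_yuv(img):
    """Convert an (H, W, 3) RGB image with values in [0, 1] to YUV."""
    return img @ RGB_TO_YUV.T
```

The radial response peaks at the centre frequency (G(f0) = 1) and vanishes at DC, the zero-mean, bandpass behavior that makes Log-Gabor filters a common biologically plausible model of cortical simple-cell tuning.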
Pages: 14