Do Humans Look Where Deep Convolutional Neural Networks "Attend"?

Cited by: 5
Authors
Ebrahimpour, Mohammad K. [1 ]
Ben Falandays, J. [2 ]
Spevack, Samuel [2 ]
Noelle, David C. [1 ,2 ]
Affiliations
[1] Univ Calif, EECS, Merced, CA 95343 USA
[2] Univ Calif, Cognit & Informat Sci, Merced, CA USA
Keywords
Visual spatial attention; Computer vision; Convolutional Neural Networks; Densely connected attention maps; Class Activation Maps; Sensitivity analysis;
DOI
10.1007/978-3-030-33723-0_5
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep Convolutional Neural Networks (CNNs) have recently begun to exhibit human-level performance on some visual perception tasks. Performance remains relatively poor, however, on some vision tasks, such as object detection: specifying the location and object class for all objects in a still image. We hypothesized that this gap in performance may be largely due to the fact that humans exhibit selective attention, while most object detection CNNs have no corresponding mechanism. In examining this question, we investigated some well-known attention mechanisms in the deep learning literature, identified their weaknesses, and proposed a novel attention algorithm called the Densely Connected Attention Model. We then measured human spatial attention, in the form of eye tracking data, during the performance of an analogous object detection task. By comparing the learned representations produced by various CNN architectures with those exhibited by human viewers, we identified some relative strengths and weaknesses of the examined computational attention mechanisms. Some CNNs produced attentional patterns somewhat similar to those of humans. Others focused processing on objects in the foreground. Still other CNN attentional mechanisms produced usefully interpretable internal representations. The resulting comparisons provide insights into the relationship between CNN attention algorithms and the human visual system.
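The keywords above list Class Activation Maps (CAMs) among the attention mechanisms examined. As background, a CAM is formed by taking a weighted sum of a CNN's final convolutional feature maps, using the classifier weights for one target class. A minimal sketch of that computation follows; all array names, sizes, and values here are illustrative placeholders, not taken from the paper:

```python
import numpy as np

# Hypothetical feature maps from a CNN's last convolutional layer:
# K channels, each an H x W spatial map.
K, H, W = 4, 7, 7
rng = np.random.default_rng(0)
feature_maps = rng.random((K, H, W))

# Hypothetical classifier weights for one target class (one weight per channel).
class_weights = rng.random(K)

# Class Activation Map: channel-wise weighted sum of the feature maps,
# then min-max normalized to [0, 1] for display as a heatmap.
cam = np.tensordot(class_weights, feature_maps, axes=1)  # shape (H, W)
cam = (cam - cam.min()) / (cam.max() - cam.min())
print(cam.shape)  # (7, 7)
```

In practice the resulting heatmap is upsampled to the input image size, which is what makes CAMs directly comparable to human fixation maps derived from eye tracking.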
Pages: 53-65
Number of pages: 13