DA²Net: a dual attention-aware network for robust crowd counting

Cited by: 0
Authors
Wenzhe Zhai
Qilei Li
Ying Zhou
Xuesong Li
Jinfeng Pan
Guofeng Zou
Mingliang Gao
Affiliations
[1] Shandong University of Technology, School of Electrical and Electronic Engineering
[2] Queen Mary University of London, School of Electronic Engineering and Computer Science
[3] Shandong University, School of Information Science and Engineering
Keywords
Crowd counting; Density estimation; Attention mechanism; Convolutional neural network
DOI
10.1007/s00530-021-00877-4
Abstract
Crowd counting in congested scenes is a crucial yet challenging task in video surveillance and urban security systems. The performance of crowd counting has been greatly boosted by the rapid development of deep learning. However, robust crowd counting in high-density environments with scale variations remains under-explored. To address this problem, we propose a dual attention-aware network (DA²Net) for robust crowd counting in dense crowd scenes with scale variations. Specifically, the DA²Net consists of two modules, namely a Spatial Attention (SA) module and a Channel Attention (CA) module. The SA module focuses on the spatial dependencies across the whole feature map to locate heads accurately. The CA module models the relations between channel maps and highlights the discriminative information in specific channels, which reduces mistaken estimates in background regions. The interactions between the SA module and the CA module provide a synergy that facilitates the learning of discriminative features with a focus on the essential head regions. Experimental results on five benchmark datasets, i.e., ShanghaiTech, UCF_CC_50, UCF-QNRF, WorldExpo’10, and NWPU, demonstrate that the DA²Net achieves state-of-the-art performance in both accuracy and robustness.
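The abstract does not say how the two attention modules are computed, so the following is only a minimal sketch of one common way to realize spatial and channel attention: softmax-normalized self-similarity with a residual connection. The function names, the flattened `C x N` feature layout, and the element-wise sum used to fuse the two branches are all assumptions made for illustration, not the authors' implementation.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def spatial_attention(feat):
    """feat is C x N (channels x flattened spatial positions).

    Each position is re-expressed as a weighted mix of all positions,
    weighted by feature similarity, then added back to the input.
    """
    feat_t = transpose(feat)                             # N x C
    attn = [softmax(r) for r in matmul(feat_t, feat)]    # N x N, rows sum to 1
    mixed = matmul(feat, transpose(attn))                # C x N
    return add(mixed, feat)

def channel_attention(feat):
    """Each channel is re-expressed as a weighted mix of all channels,
    weighted by channel similarity, then added back to the input."""
    attn = [softmax(r) for r in matmul(feat, transpose(feat))]  # C x C
    mixed = matmul(attn, feat)                                  # C x N
    return add(mixed, feat)

def dual_attention(feat):
    """Hypothetical fusion: element-wise sum of the two branches (an assumption)."""
    return add(spatial_attention(feat), channel_attention(feat))

if __name__ == "__main__":
    # Toy feature map: 2 channels, 3 spatial positions.
    feat = [[1.0, 0.5, 0.0],
            [0.0, 0.5, 1.0]]
    for row in dual_attention(feat):
        print([round(v, 4) for v in row])
```

Both branches keep the input shape, so they can be applied to the same feature map and combined. A trained network would typically add learned projections before the similarity computation, which this sketch omits.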
Pages: 3027-3040
Number of pages: 13