Self-attention Guidance Based Crowd Localization and Counting

Cited by: 1
Authors
Ma, Zhouzhou [1, 2]
Gu, Guanghua [1, 2]
Zhao, Wenrui [1, 2]
Affiliations
[1] Yanshan Univ, Sch Informat Sci & Engn, Qinhuangdao 066000, Peoples R China
[2] Hebei Key Lab Informat Transmiss & Signal Proc, Qinhuangdao 066000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Crowd localization; crowd counting; transformer; point supervision; object detection; IMAGE; NETWORK;
DOI
10.1007/s11633-023-1428-6
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Most existing studies on crowd analysis are limited to counting and cannot provide the exact locations of individuals. This paper proposes a self-attention guidance based crowd localization and counting network (SA-CLCN), which locates and counts crowds simultaneously. We cast the task as object detection and train the network using the original point annotations of crowd datasets as supervision. The network predicts the center point coordinates of each head as well as the number of people in the crowd. Specifically, to cope with the spatial and positional variations of the crowd, the proposed method introduces a transformer that, together with a convolutional structure, forms a global-local feature extractor (GLFE). The GLFE establishes near-to-far dependencies between elements so that the global context and local detail features of the crowd image can be extracted simultaneously. This paper then designs a pyramid feature fusion module (PFFM) that fuses the global and local information from high level to low level to obtain a multiscale feature representation. For the downstream tasks, candidate point offsets and confidence scores are predicted by a simple regression head and a simple classification head. In addition, the Hungarian algorithm is used to match the predicted point set with the labelled point set so that the losses can be calculated. The proposed network avoids the errors and higher costs associated with traditional density maps or bounding box annotations. Extensive experiments on several crowd datasets show that the proposed method produces competitive results in both counting and localization.
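The matching step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the matching cost, so the cost used here (Euclidean distance minus a weighted confidence score), the function names `hungarian`, `match_points` and `count_heads`, and the parameters `conf_weight` and `threshold` are all assumptions made for illustration. The sketch also assumes there are at least as many predicted points as labelled points.

```python
import math


def hungarian(cost):
    """Minimum-cost assignment of every row to a distinct column.

    Kuhn-Munkres algorithm with potentials; requires rows <= columns.
    Returns a list of (row, column) pairs sorted by row.
    """
    n, m = len(cost), len(cost[0])
    assert n <= m, "expects at least as many columns as rows"
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row assigned to column j, 0 if free
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the assignments along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)


def match_points(gt_points, pred_points, scores, conf_weight=1.0):
    """Match each labelled point to one predicted point.

    Assumed cost (hypothetical, not taken from the paper):
    Euclidean distance minus conf_weight * confidence score.
    Returns (gt_index, pred_index) pairs.
    """
    if not gt_points:
        return []
    cost = [
        [math.dist(g, p) - conf_weight * s for p, s in zip(pred_points, scores)]
        for g in gt_points
    ]
    return hungarian(cost)


def count_heads(scores, threshold=0.5):
    """Count predictions whose confidence exceeds an assumed threshold."""
    return sum(1 for s in scores if s > threshold)


# Toy data: two labelled heads, three candidate points.
gt = [(10.0, 10.0), (50.0, 50.0)]
preds = [(48.0, 52.0), (11.0, 9.0), (200.0, 200.0)]
scores = [0.9, 0.8, 0.1]
pairs = match_points(gt, preds, scores)   # [(0, 1), (1, 0)]
count = count_heads(scores)               # 2
```

In this toy case each labelled point is paired with its nearby high-confidence prediction, and the distant low-confidence candidate is left unmatched; a loss would then be computed over the matched pairs, with unmatched candidates treated as negatives.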
Pages: 966-982
Page count: 17