An Effective Lightweight Crowd Counting Method Based on an Encoder-Decoder Network for Internet of Video Things

Cited by: 7
Authors
Yi, Jun [1]
Chen, Fan [1]
Shen, Zhilong [2]
Xiang, Yi [1]
Xiao, Shan [3]
Zhou, Wei [1]
Affiliations
[1] Chongqing Univ Sci & Technol, Coll Intelligent Technol & Engn, Chongqing 401331, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing 400065, Peoples R China
[3] Chongqing Coll Elect Engn, Inst Big Data & Optimizat, Chongqing 401331, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Convolution neural network; crowd counting; edge computing; lightweight network;
DOI
10.1109/JIOT.2023.3294727
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
An emerging Internet of Video Things (IoVT) application, crowd counting is a computer vision task where the number of heads in a crowded scene is estimated. In recent years, it has attracted increasing attention from academia and industry because of its great potential value in public safety and urban planning. However, it has become a challenge to cross the gap between the increasingly heavy and complex network architecture widely used for the pursuit of counting with high accuracy and the constrained computing and storage resources in the edge computing environment. To address this issue, an effective lightweight crowd counting method based on an encoder-decoder network, named lightweight crowd counting network (LEDCrowdNet), is proposed to achieve an optimal tradeoff between counting performance and running speed for edge applications of IoVT. In particular, an improved MobileViT module as an encoder is designed to extract global-local crowd features of various scales. The decoder is composed of the adaptive multiscale large kernel attention module (AMLKA) and the lightweight counting atrous spatial pyramid pooling process module (LC-ASPP), which can perform end-to-end training to obtain the final density map. The proposed LEDCrowdNet is suitable for deployment on two edge computing platforms (NVIDIA Jetson Xavier NX and Coral Edge TPU) to reduce the number of floating point operations (FLOPs) without a significant drop in accuracy. Extensive experiments on five mainstream benchmarks (ShanghaiTech Part_A/B, UCF_CC_50, UCF-QNRF, WorldExpo'10, and RSOC data sets) verify the correctness and efficiency of our method.
Pages: 3082 - 3094
Number of pages: 13
Related papers
50 in total
  • [1] Attentive encoder-decoder networks for crowd counting
    Liu, Xuhui
    Hu, Yutao
    Zhang, Baochang
    Zhen, Xiantong
    Luo, Xiaoyan
    Cao, Xianbin
    NEUROCOMPUTING, 2022, 490 : 246 - 257
  • [3] An encoder-decoder network for crowd counting based on multi-scale attention mechanism
    Chuang H.-H.
    Chen Y.-C.
    Lin C.H.
    Multimedia Tools and Applications, 2025, 84 (03) : 1187 - 1210
  • [4] Multi-scale Supervised Attentive Encoder-Decoder Network for Crowd Counting
    Zhang, Anran
    Jiang, Xiaolong
    Zhang, Baochang
    Cao, Xianbin
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2020, 16 (01)
  • [5] Crowd Counting and Density Estimation by Trellis Encoder-Decoder Networks
    Jiang, Xiaolong
    Xiao, Zehao
    Zhang, Baochang
    Zhen, Xiantong
    Cao, Xianbin
    Doermann, David
    Shao, Ling
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 6126 - 6135
  • [6] Counting in congested crowd scenes with hierarchical scale-aware encoder-decoder network
    Han, Run
    Qi, Ran
    Lu, Xuequan
    Huang, Lei
    Lyu, Lei
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 238
  • [7] SENetCount: An Optimized Encoder-Decoder Architecture with Squeeze-and-Excitation for Crowd Counting
    Meng, Xiaolong
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [8] MobileCount: An efficient encoder-decoder framework for real-time crowd counting
    Wang, Peng
    Gao, Chenyu
    Wang, Yang
    Li, Hui
    Gao, Ye
    NEUROCOMPUTING, 2020, 407 : 292 - 299
  • [9] A lightweight encoder-decoder network for automatic pavement crack detection
    Zhu, Guijie
    Liu, Jiacheng
    Fan, Zhun
    Yuan, Duan
    Ma, Peili
    Wang, Meihua
    Sheng, Weihua
    Wang, Kelvin C. P.
    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, 2024, 39 (12) : 1743 - 1765
  • [10] Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting
    Thanasutives, Pongpisit
    Fukui, Ken-ichi
    Numao, Masayuki
    Kijsirikul, Boonserm
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021 : 2382 - 2389