An Effective Lightweight Crowd Counting Method Based on an Encoder-Decoder Network for Internet of Video Things

Cited by: 7
Authors
Yi, Jun [1]
Chen, Fan [1]
Shen, Zhilong [2]
Xiang, Yi [1]
Xiao, Shan [3]
Zhou, Wei [1]
Affiliations
[1] Chongqing Univ Sci & Technol, Coll Intelligent Technol & Engn, Chongqing 401331, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing 400065, Peoples R China
[3] Chongqing Coll Elect Engn, Inst Big Data & Optimizat, Chongqing 401331, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Convolution neural network; crowd counting; edge computing; lightweight network;
DOI
10.1109/JIOT.2023.3294727
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
An emerging Internet of Video Things (IoVT) application, crowd counting is a computer vision task where the number of heads in a crowded scene is estimated. In recent years, it has attracted increasing attention from academia and industry because of its great potential value in public safety and urban planning. However, it has become a challenge to cross the gap between the increasingly heavy and complex network architecture widely used for the pursuit of counting with high accuracy and the constrained computing and storage resources in the edge computing environment. To address this issue, an effective lightweight crowd counting method based on an encoder-decoder network, named lightweight crowd counting network (LEDCrowdNet), is proposed to achieve an optimal tradeoff between counting performance and running speed for edge applications of IoVT. In particular, an improved MobileViT module as an encoder is designed to extract global-local crowd features of various scales. The decoder is composed of the adaptive multiscale large kernel attention module (AMLKA) and the lightweight counting atrous spatial pyramid pooling process module (LC-ASPP), which can perform end-to-end training to obtain the final density map. The proposed LEDCrowdNet is suitable for deployment on two edge computing platforms (NVIDIA Jetson Xavier NX and Coral Edge TPU) to reduce the number of floating point operations (FLOPs) without a significant drop in accuracy. Extensive experiments on five mainstream benchmarks (ShanghaiTech Part_A/B, UCF_CC_50, UCF-QNRF, WorldExpo'10, and RSOC data sets) verify the correctness and efficiency of our method.
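The abstract says the network outputs a density map, from which the head count is read off. As a minimal sketch of that idea only, the snippet below builds a density map from point annotations and recovers the count by summing it. The record does not describe how LEDCrowdNet's ground-truth maps are built; the fixed Gaussian kernel, the function names, and the parameters `size` and `sigma` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a size x size Gaussian kernel normalised to sum to 1 (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def density_map(points, height: int, width: int,
                size: int = 15, sigma: float = 4.0) -> np.ndarray:
    """Place one unit of mass per annotated head at (x, y).

    Hypothetical helper: a fixed-kernel construction, not the paper's procedure.
    Kernels that overlap the image border are clipped and renormalised,
    so each in-bounds head still contributes exactly 1 to the total.
    """
    kernel = gaussian_kernel(size, sigma)
    r = size // 2
    dmap = np.zeros((height, width), dtype=np.float64)
    for x, y in points:
        row, col = int(round(y)), int(round(x))
        if not (0 <= row < height and 0 <= col < width):
            continue  # ignore annotations outside the image
        r0, r1 = max(row - r, 0), min(row + r + 1, height)
        c0, c1 = max(col - r, 0), min(col + r + 1, width)
        # Slice the part of the kernel that falls inside the image.
        patch = kernel[r0 - (row - r): r1 - (row - r),
                       c0 - (col - r): c1 - (col - r)]
        dmap[r0:r1, c0:c1] += patch / patch.sum()
    return dmap


# Three annotated heads, two of them in image corners.
heads = [(10, 10), (0, 0), (39, 29)]
dmap = density_map(heads, height=30, width=40)
estimated_count = float(dmap.sum())
```

Summing the map gives the count because every head contributes a kernel of total mass 1. A counting network trained to regress such maps is evaluated the same way, by summing its predicted map.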
Pages: 3082-3094
Page count: 13