An Effective Lightweight Crowd Counting Method Based on an Encoder-Decoder Network for Internet of Video Things

Cited by: 7
Authors
Yi, Jun [1]
Chen, Fan [1]
Shen, Zhilong [2]
Xiang, Yi [1]
Xiao, Shan [3]
Zhou, Wei [1]
Affiliations
[1] Chongqing Univ Sci & Technol, Coll Intelligent Technol & Engn, Chongqing 401331, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing 400065, Peoples R China
[3] Chongqing Coll Elect Engn, Inst Big Data & Optimizat, Chongqing 401331, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Convolution neural network; crowd counting; edge computing; lightweight network;
DOI
10.1109/JIOT.2023.3294727
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
An emerging Internet of Video Things (IoVT) application, crowd counting is a computer vision task where the number of heads in a crowded scene is estimated. In recent years, it has attracted increasing attention from academia and industry because of its great potential value in public safety and urban planning. However, it has become a challenge to cross the gap between the increasingly heavy and complex network architecture widely used for the pursuit of counting with high accuracy and the constrained computing and storage resources in the edge computing environment. To address this issue, an effective lightweight crowd counting method based on an encoder-decoder network, named lightweight crowd counting network (LEDCrowdNet), is proposed to achieve an optimal tradeoff between counting performance and running speed for edge applications of IoVT. In particular, an improved MobileViT module as an encoder is designed to extract global-local crowd features of various scales. The decoder is composed of the adaptive multiscale large kernel attention module (AMLKA) and the lightweight counting atrous spatial pyramid pooling process module (LC-ASPP), which can perform end-to-end training to obtain the final density map. The proposed LEDCrowdNet is suitable for deployment on two edge computing platforms (NVIDIA Jetson Xavier NX and Coral Edge TPU) to reduce the number of floating point operations (FLOPs) without a significant drop in accuracy. Extensive experiments on five mainstream benchmarks (ShanghaiTech Part_A/B, UCF_CC_50, UCF-QNRF, WorldExpo'10, and RSOC data sets) verify the correctness and efficiency of our method.
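The abstract says the network outputs a density map but does not say how a count is read from it. As a minimal sketch, assuming the common crowd-counting convention that each annotated head contributes a Gaussian kernel of unit mass, the estimated count is the sum over the map. This is not the LEDCrowdNet implementation: the function names, kernel width, and head coordinates below are hypothetical and chosen only for illustration.

```python
import math


def density_map(heads, height, width, sigma=1.5, radius=4):
    """Build a density map with one unit-mass Gaussian per head.

    Assumption: every (row, col) in `heads` lies inside the image.
    """
    dmap = [[0.0] * width for _ in range(height)]
    for row, col in heads:
        # Collect the in-bounds kernel weights around this head.
        weights = {}
        for r in range(max(0, row - radius), min(height, row + radius + 1)):
            for c in range(max(0, col - radius), min(width, col + radius + 1)):
                d2 = (r - row) ** 2 + (c - col) ** 2
                weights[(r, c)] = math.exp(-d2 / (2.0 * sigma ** 2))
        total = sum(weights.values())
        # Renormalize over the clipped window so heads near the
        # border still contribute exactly 1 to the total count.
        for (r, c), w in weights.items():
            dmap[r][c] += w / total
    return dmap


def count_from_density(dmap):
    """Estimated crowd count is the sum of all density values."""
    return sum(sum(row) for row in dmap)


def mean_absolute_error(predicted, actual):
    """MAE over per-image counts, a standard crowd-counting metric."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    heads = [(3, 4), (10, 12), (0, 0)]  # hypothetical annotations
    dmap = density_map(heads, height=16, width=16)
    print("estimated count:", round(count_from_density(dmap), 6))
```

In a trained model the density map would come from the decoder rather than from annotations; the summation step and the MAE comparison against ground-truth counts stay the same.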
Pages: 3082-3094
Page count: 13
Related papers
50 in total
  • [31] Effective Video Summarization Using Channel Attention-Assisted Encoder-Decoder Framework
    Alharbi, Faisal
    Habib, Shabana
    Albattah, Waleed
    Jan, Zahoor
    Alanazi, Meshari D.
    Islam, Muhammad
    SYMMETRY-BASEL, 2024, 16 (06):
  • [32] A classification method based on encoder-decoder structure with paper content
    Yin, Yi
    Ouyang, Lin
    Wu, Zhixiang
    Yin, Shuifang
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (09):
  • [34] SAR marine oil spill detection based on an encoder-decoder network
    Soh, Kexin
    Zhao, Lingli
    Peng, Min
    Lu, Jianzhong
    Sun, Weidong
    Tongngern, Suchada
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2024, 45 (02) : 587 - 608
  • [35] Multi-Scale Attention and Encoder-Decoder Network for Video Saliency Object Detection
    Bi, Hongbo
    Zhu, Huihui
    Yang, Lina
    Wu, Ranwan
    PATTERN RECOGNITION AND IMAGE ANALYSIS, 2022, 32 (02) : 340 - 350
  • [36] Pooling Attention-based Encoder-Decoder Network for semantic segmentation
    Xu, Haixia
    Huang, Yunjia
    Hancock, Edwin R.
    Wang, Shuailong
    Xuan, Qijun
    Zhou, Wei
    COMPUTERS & ELECTRICAL ENGINEERING, 2021, 93
  • [38] Dynamic video summarisation using stacked encoder-decoder architecture with residual learning network
    Dhanushree, M.
    Priya, R.
    Aruna, P.
    Bhavani, R.
    INTERNATIONAL JOURNAL OF INTELLIGENT ENGINEERING INFORMATICS, 2024, 12 (01) : 27 - 59
  • [39] Seismic internal multiple suppression method with encoder-decoder convolutional network based on data augmentation
    Liu, Xiaozhou
    Hu, Tianyue
    Liu, Tao
    Wei, Zhefeng
    Xie, Fei
    An, Shengpei
    Shiyou Diqiu Wuli Kantan/Oil Geophysical Prospecting, 2022, 57 (04) : 757 - 767
  • [40] EDChannel: channel prediction of backscatter communication network based on encoder-decoder
    Li, Dengao
    Wen, Yongxin
    Xu, Shuang
    Wang, Qiang
    Bai, Ruiqin
    Zhao, Jumin
    TELECOMMUNICATION SYSTEMS, 2022, 81 (01) : 99 - 114