An Effective Lightweight Crowd Counting Method Based on an Encoder-Decoder Network for Internet of Video Things

Citations: 7

Authors:

Yi, Jun [1]

Chen, Fan [1]

Shen, Zhilong [2]

Xiang, Yi [1]

Xiao, Shan [3]

Zhou, Wei [1]

Affiliations:

[1] Chongqing Univ Sci & Technol, Coll Intelligent Technol & Engn, Chongqing 401331, Peoples R China

[2] Chongqing Univ Posts & Telecommun, Chongqing 400065, Peoples R China

[3] Chongqing Coll Elect Engn, Inst Big Data & Optimizat, Chongqing 401331, Peoples R China

Source:

IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / No. 02

Funding:

National Natural Science Foundation of China;

Keywords:

Convolution neural network; crowd counting; edge computing; lightweight network;

DOI:

10.1109/JIOT.2023.3294727

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

An emerging Internet of Video Things (IoVT) application, crowd counting is a computer vision task where the number of heads in a crowded scene is estimated. In recent years, it has attracted increasing attention from academia and industry because of its great potential value in public safety and urban planning. However, it has become a challenge to cross the gap between the increasingly heavy and complex network architecture widely used for the pursuit of counting with high accuracy and the constrained computing and storage resources in the edge computing environment. To address this issue, an effective lightweight crowd counting method based on an encoder-decoder network, named lightweight crowd counting network (LEDCrowdNet), is proposed to achieve an optimal tradeoff between counting performance and running speed for edge applications of IoVT. In particular, an improved MobileViT module as an encoder is designed to extract global-local crowd features of various scales. The decoder is composed of the adaptive multiscale large kernel attention module (AMLKA) and the lightweight counting atrous spatial pyramid pooling process module (LC-ASPP), which can perform end-to-end training to obtain the final density map. The proposed LEDCrowdNet is suitable for deployment on two edge computing platforms (NVIDIA Jetson Xavier NX and Coral Edge TPU) to reduce the number of floating point operations (FLOPs) without a significant drop in accuracy. Extensive experiments on five mainstream benchmarks (ShanghaiTech Part_A/B, UCF_CC_50, UCF-QNRF, WorldExpo'10, and RSOC data sets) verify the correctness and efficiency of our method.
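The abstract states that the network outputs a density map but does not spell out how a head count is read from it. In density-map-based crowd counting generally, the count is taken as the sum of the map, and a ground-truth map is commonly built by placing one unit-mass Gaussian at each annotated head. The sketch below illustrates that general convention only; it is not the LEDCrowdNet code. The function names `density_map` and `count_from_density`, the `sigma` value, and the sample head coordinates are all hypothetical.

```python
import math


def density_map(height, width, heads, sigma=2.0):
    """Build a density map with one unit-mass Gaussian per annotated head.

    Assumption: every (row, col) in `heads` lies inside the image.
    """
    dmap = [[0.0] * width for _ in range(height)]
    radius = int(3 * sigma)  # truncate each kernel at about 3 sigma
    for row, col in heads:
        weights = {}
        for r in range(max(0, row - radius), min(height, row + radius + 1)):
            for c in range(max(0, col - radius), min(width, col + radius + 1)):
                d2 = (r - row) ** 2 + (c - col) ** 2
                weights[(r, c)] = math.exp(-d2 / (2 * sigma ** 2))
        # Normalize so each head contributes exactly one unit of mass,
        # even when the kernel is clipped at the image border.
        total = sum(weights.values())
        for (r, c), w in weights.items():
            dmap[r][c] += w / total
    return dmap


def count_from_density(dmap):
    """Estimate the crowd count as the sum over the density map."""
    return sum(sum(row) for row in dmap)


# Hypothetical annotations: three heads in a 32 x 32 image.
heads = [(5, 5), (10, 20), (0, 0)]
estimated = count_from_density(density_map(32, 32, heads))
```

Because each kernel is normalized after clipping, the sum of the map equals the number of annotated heads up to floating-point error, which is why a predicted density map can be compared against a ground-truth count directly.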

Pages: 3082 - 3094

Number of pages: 13
