In recent years, multi-human parsing has become a focal point of research, yet prevailing methods often rely on intermediate stages and lack pixel-level analysis. Moreover, their high computational demands limit real-world efficiency. To address these challenges and enable real-time performance, a low-latency end-to-end network, WNet, is proposed. This approach combines a vision transformer and a convolutional neural network in a dual-encoder structure, featuring a lightweight Transformer-based vision encoder and a Darknet-based convolutional encoder. This combination adeptly captures both long-range dependencies and spatial relationships. A fuse block enables the seamless merging of features from the two encoders, and residual connections in the decoder design amplify information flow. Experimental validation on the Crowd Instance-level Human Parsing (CIHP) and Look Into Person (LIP) datasets showcases WNet's effectiveness, achieving high-speed multi-human parsing at 26.7 frames per second. Ablation studies further underscore WNet's capabilities, emphasizing its efficiency and accuracy in complex multi-human parsing tasks.
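The abstract describes two distinctive components: a fuse block that merges the transformer and convolutional feature streams, and decoder stages with residual connections. Below is a minimal PyTorch sketch of how such components might look. The paper's actual layer configuration is not given here, so the module names (`FuseBlock`, `ResidualDecoderBlock`), channel widths, and the concatenate-then-convolve fusion strategy are illustrative assumptions, not WNet's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FuseBlock(nn.Module):
    """Hypothetical fusion of transformer and CNN feature maps."""

    def __init__(self, vit_ch, cnn_ch, out_ch):
        super().__init__()
        # 1x1 convolutions project both streams to a common width
        self.proj_vit = nn.Conv2d(vit_ch, out_ch, kernel_size=1)
        self.proj_cnn = nn.Conv2d(cnn_ch, out_ch, kernel_size=1)
        self.mix = nn.Sequential(
            nn.Conv2d(2 * out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, f_vit, f_cnn):
        # Resize the transformer features to the CNN feature resolution,
        # then concatenate and mix the two streams
        f_vit = F.interpolate(self.proj_vit(f_vit), size=f_cnn.shape[-2:],
                              mode="bilinear", align_corners=False)
        return self.mix(torch.cat([f_vit, self.proj_cnn(f_cnn)], dim=1))


class ResidualDecoderBlock(nn.Module):
    """Hypothetical upsampling decoder stage with a residual connection."""

    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(ch),
        )

    def forward(self, x):
        x = F.interpolate(x, scale_factor=2, mode="bilinear",
                          align_corners=False)
        # Residual sum keeps a direct path for gradients and features,
        # the "amplified information flow" the abstract refers to
        return torch.relu(x + self.conv(x))


if __name__ == "__main__":
    fuse = FuseBlock(vit_ch=192, cnn_ch=256, out_ch=128)
    fused = fuse(torch.randn(1, 192, 14, 14), torch.randn(1, 256, 28, 28))
    out = ResidualDecoderBlock(128)(fused)
    print(fused.shape, out.shape)  # (1, 128, 28, 28) (1, 128, 56, 56)
```

The key design point this sketch captures is that the transformer branch, which typically runs at a coarser resolution, must be upsampled and projected before fusion, while the residual decoder preserves an identity path so upsampling stages do not attenuate encoder information.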