WNet: A dual-encoded multi-human parsing network

Cited by: 0
Authors
Hosen, Md Imran [1, 2]
Aydin, Tarkan [2]
Islam, Md Baharul [2]
Affiliations
[1] Manarat Int Univ, Dept Comp Sci & Engn, Dhaka, Bangladesh
[2] Bahcesehir Univ, Dept Comp Engn, Istanbul, Turkiye
Keywords
computer vision; image processing; image segmentation; FRAMEWORK; POSE
DOI
10.1049/ipr2.13176
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, multi-human parsing has become a focal point of research, yet prevailing methods often rely on intermediate stages and lack pixel-level analysis. Moreover, their high computational demands limit real-world efficiency. To address these challenges and enable real-time performance, a low-latency end-to-end network, WNet, is proposed. The approach combines a vision transformer and a convolutional neural network in a dual-encoded network, featuring a lightweight Transformer-based vision encoder and a convolution encoder based on Darknet. This combination captures both long-range dependencies and spatial relationships. A fuse block merges the features from the two encoders, and residual connections in the decoder improve information flow. Experimental validation on the Crowd Instance-level Human Parsing (CIHP) and Look into Person (LIP) datasets demonstrates WNet's effectiveness, achieving multi-human parsing at 26.7 frames per second. Ablation studies further underscore WNet's efficiency and accuracy in complex multi-human parsing tasks.
We present WNet, a low-latency end-to-end network for multi-human parsing that integrates a vision transformer and a convolutional neural network in a dual-encoded structure (a vision encoder and a convolution encoder). By capturing long-range dependencies and spatial relationships, WNet achieves real-time parsing at 26.7 frames per second on the Crowd Instance-level Human Parsing (CIHP) and Look into Person (LIP) datasets. A fuse block for feature merging, along with residual connections in the decoder, improves information flow and supports WNet's efficiency and accuracy in complex multi-human parsing tasks.
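To put the reported throughput in perspective, the short sketch below converts frames per second into an average per-frame time budget. It assumes frames are processed one at a time, which the abstract does not state; the function name `frame_latency_ms` is introduced here purely for illustration.

```python
def frame_latency_ms(fps: float) -> float:
    """Average time per frame, in milliseconds, at a given throughput.

    Assumes sequential, one-frame-at-a-time processing (an assumption,
    not something the abstract specifies).
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


REPORTED_FPS = 26.7  # throughput stated in the abstract

latency = frame_latency_ms(REPORTED_FPS)
print(f"{REPORTED_FPS} FPS corresponds to about {latency:.1f} ms per frame")
```

Under that assumption, 26.7 frames per second leaves roughly 37.5 ms for each frame end to end.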
Pages: 3316 - 3328
Number of pages: 13
Related papers
50 in total
  • [31] Optimal task allocation in multi-human multi-robot interaction
    Malvankar-Mehta, Monali S.
    Mehta, Siddhartha S.
    OPTIMIZATION LETTERS, 2015, 9 (08) : 1787 - 1803
  • [32] Dual-Encoded Affinity Microbead Signature Combinatorial Profiling for Acute Myocardial Infarction High-Sensitivity Diagnosis
    He, Luxuan
    Wu, Jiacheng
    Lin, Zhun
    Zhang, Yuanqing
    Liu, Peiqing
    ACS SENSORS, 2024, 9 (04) : 2083 - 2090
  • [33] Dual-Encoded Microbeads through a Host-Guest Structure: Enormous, Flexible, and Accurate Barcodes for Multiplexed Assays
    Zhang, Ding Sheng-zi
    Jiang, Yang
    Yang, Haiou
    Zhu, Youjie
    Zhang, Shunjia
    Zhu, Ying
    Wei, Dan
    Lin, Ye
    Wang, Pingping
    Fu, Qihua
    Xu, Hong
    Gu, Hongchen
    ADVANCED FUNCTIONAL MATERIALS, 2016, 26 (34) : 6146 - 6157
  • [34] Sense and Validate: Fluorophore/Mass Dual-Encoded Nanoprobes for Fluorescence Imaging and MS Quantification of Intracellular Multiple MicroRNAs
    Xu, Hongmei
    Zhang, Zhenzhen
    Wang, Yihan
    Zhang, Xuemeng
    Zhu, Jun-Jie
    Min, Qianhao
    ANALYTICAL CHEMISTRY, 2022, 94 (16) : 6329 - 6337
  • [35] Multi-Human Locating in Real Environment by Thermal Sensor
    Kuki, Masato
    Nakajima, Hiroshi
    Tsuchiya, Naoki
    Hata, Yutaka
    2013 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2013), 2013 : 4623 - 4628
  • [36] Eliciting Compatible Demonstrations for Multi-Human Imitation Learning
    Gandhi, Kanishk
    Karamcheti, Siddharth
    Liao, Madeline
    Sadigh, Dorsa
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 1981 - 1991
  • [37] Multi-class Human Body Parsing with Edge-Enhancement Network
    Huang, Xi
    Wu, Keyu
    Hu, Gang
    Shao, Jie
    NEURAL INFORMATION PROCESSING (ICONIP 2019), PT IV, 2019, 1142 : 466 - 477
  • [38] A Dual-Encoded Bead-Based Immunoassay with Tunable Detection Range for COVID-19 Serum Evaluation
    Lin, Zhun
    Zhang, Jie
    Zou, Zhengyu
    Lu, Gen
    Wu, Minhao
    Niu, Li
    Zhang, Yuanqing
    ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 2022, 61 (37)
  • [39] Benchmarking the Complementary-View Multi-human Association and Tracking
    Han, Ruize
    Feng, Wei
    Wang, Feifan
    Qian, Zekun
    Yan, Haomin
    Wang, Song
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (01) : 118 - 136
  • [40] EgoHumans: An Egocentric 3D Multi-Human Benchmark
    Khirodkar, Rawal
    Bansal, Aayush
    Ma, Lingni
    Newcombe, Richard
    Vo, Minh
    Kitani, Kris
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 19750 - 19762