Pedestrian Attribute Recognition Based on Multimodal Transformer

Authors
Liu, Dan [1]
Song, Wei [1, 2, 3]
Zhao, Xiaobing [1, 3]
Affiliations
[1] Minzu Univ China, Sch Informat Engn, Beijing 100081, Peoples R China
[2] Minzu Univ China, Key Lab Ethn Language Intelligent Anal & Secur Go, MOE, Beijing 100081, Peoples R China
[3] Minzu Univ China, Natl Language Resource Monitoring & Res Ctr Minor, Beijing 100081, Peoples R China
Keywords
Pedestrian Attribute Recognition; Multimodal Learning; Transformer
DOI
10.1007/978-981-99-8429-9_34
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Pedestrian attribute recognition (PAR) is susceptible to variable shooting angles, lighting, and occlusions. Improving recognition accuracy to suit its application in various complex scenarios is one of the most important tasks. In this paper, based on the Image-Text Multimodal Transformer, the intra-modal and inter-modal correlations are learned from pedestrian images and attribute labels. The applicability of six different multimodal fusion frameworks for attribute recognition is explored. The impact of different frameworks' fused feature division methods on recognition accuracy is compared and analyzed. The comparative experiments verify the robustness and efficiency of the Early Concatenate framework, which has achieved multiple best metric scores on the two major public PAR datasets, PA100k and RAP. This paper not only proposes a new Transformer-based high-accuracy multimodal network, but also provides feasible ideas and directions for further research on PAR. The comparative discussion based on various multimodal frameworks also provides a perspective that can be learned for other multimodal tasks.
Pages: 422-433
Page count: 12
Related papers
50 in total
  • [1] ALFormer: Attribute Localization Transformer in Pedestrian Attribute Recognition
    Liu, Yuxin
    Wang, Mingzhe
    Li, Chao
    Liu, Shuoyan
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2024, 21 (04) : 1567 - 1582
  • [2] Disentangled Attribute Features Vision Transformer for Pedestrian Attribute Recognition
    Liu, Caihua
    Guo, Jiaxian
    Chen, Sichu
    Feng, Xia
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VI, 2024, 14430 : 497 - 509
  • [3] Diverse features discovery transformer for pedestrian attribute recognition
    Zheng, Aihua
    Wang, Huimin
    Wang, Jiaxiang
    Huang, Huaibo
    He, Ran
    Hussain, Amir
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 119
  • [4] Vision Transformer With Relation Exploration for Pedestrian Attribute Recognition
    Tan, Hao
    Tan, Zichang
    Weng, Dunfang
    Liu, Ajian
    Wan, Jun
    Lei, Zhen
    Li, Stan Z.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 198 - 208
  • [5] Pedestrian attribute recognition based on attribute correlation
    Zhao, Ruijie
    Lang, Congyan
    Li, Zun
    Liang, Liqian
    Wei, Lili
    Feng, Songhe
    Wang, Tao
    MULTIMEDIA SYSTEMS, 2022, 28 (03) : 1069 - 1081
  • [6] Pedestrian attribute recognition based on attribute correlation
    Ruijie Zhao
    Congyan Lang
    Zun Li
    Liqian Liang
    Lili Wei
    Songhe Feng
    Tao Wang
    Multimedia Systems, 2022, 28 : 1069 - 1081
  • [7] PARFormer: Transformer-Based Multi-Task Network for Pedestrian Attribute Recognition
    Fan, Xinwen
    Zhang, Yukang
    Lu, Yang
    Wang, Hanzi
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (01) : 411 - 423
  • [8] DRFormer: Learning dual relations using Transformer for pedestrian attribute recognition
    Tang, Zengming
    Huang, Jun
    NEUROCOMPUTING, 2022, 497 : 159 - 169
  • [9] Pedestrian Attribute Recognition Based on Deep Learning
    Yuan Peipei
    Zhang Liang
    LASER & OPTOELECTRONICS PROGRESS, 2020, 57 (06)
  • [10] DEEP PEDESTRIAN ATTRIBUTE RECOGNITION BASED ON LSTM
    Ji, Zhong
    Zheng, Weixiong
    Pang, Yanwei
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 151 - 155