Multi-Modal Pedestrian Crossing Intention Prediction with Transformer-Based Model

Authors
Wang, Ting-Wei [1]
Lai, Shang-Hong [1]
Affiliation
[1] Natl Tsing Hua Univ, Hsinchu, Taiwan
Keywords
Pedestrian crossing intention prediction; multi-modal learning; transformer model; human posture
DOI
10.1561/116.20240019
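Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809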
Abstract
Pedestrian crossing intention prediction based on computer vision plays a pivotal role in enhancing the safety of autonomous driving and advanced driver assistance systems. In this paper, we present a novel multi-modal pedestrian crossing intention prediction framework leveraging the transformer model. By integrating diverse sources of information and leveraging the transformer's sequential modeling and parallelization capabilities, our system accurately predicts pedestrian crossing intentions. We introduce a novel representation of traffic environment data and incorporate lifted 3D human pose and head orientation data to enhance the model's understanding of pedestrian behavior. Experimental results demonstrate the state-of-the-art accuracy of our proposed system on benchmark datasets.
Pages: 29