PiSFANet: Pillar Scale-Aware Feature Aggregation Network for Real-Time 3D Pedestrian Detection

Cited by: 0
Authors
Yan, Weiqing [1]
Liu, Shile [1]
Tang, Chang [2]
Zhou, Wujie [3]
Affiliations
[1] Yantai Univ, Sch Comp & Control Engn, Yantai 261400, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
[3] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Zhejiang 310023, Peoples R China
Keywords
Feature extraction; Pedestrians; Three-dimensional displays; Point cloud compression; Encoding; Real-time systems; Object detection; 3D object detection; real-time; scale-aware; pillar-based;
DOI
10.1109/LSP.2024.3426294
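Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809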
Abstract
Detecting pedestrians in 3D from point cloud data in real time while accounting for scale is crucial in many robotic and autonomous driving applications. Currently, the most successful 3D object detection methods rely on voxel-based techniques, but these tend to be too computationally expensive for deployment in aerial scenarios. Conversely, pillar-based approaches use only 2D convolutions and therefore require fewer computational resources, albeit potentially at the cost of detection accuracy relative to voxel-based methods. Previous pillar-based approaches suffered from inadequate pillar feature encoding. In this letter, we introduce a real-time, scale-aware 3D pedestrian detection network that incorporates a robust encoder network designed for effective pillar feature extraction. The proposed TriFocus Attention module (TriFA) integrates external attention and similar attention strategies based on Squeeze-and-Excitation. By comprehensively supervising the point-wise, channel-wise, and pillar-wise dimensions of pillar features, it strengthens pillar encoding, suppresses noise in pillar features, and improves their expressive power. The proposed Bidirectional Scale-Aware Feature Pyramid module (BiSAFP) integrates a scale-aware module into the multi-scale pyramid structure. This addition enhances the network's ability to perceive pedestrians in low-level features and ensures that the significance of feature maps at different feature levels is fully taken into account. BiSAFP is a lightweight multi-scale pyramid network that has minimal impact on inference time while substantially boosting network performance. Our approach achieves real-time detection, processing up to 30 frames per second (FPS).
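The abstract describes TriFA only at a high level, so the following is a minimal sketch of the generic Squeeze-and-Excitation idea it builds on, applied along the channel axis of a single pillar. It is not the authors' module. The function name se_channel_gate, the sizes C, R, N, and the random weights w1 and w2 are all hypothetical and chosen only for illustration. A point-wise or pillar-wise gate could presumably follow the same squeeze, excite, scale pattern along a different axis, but that is an assumption rather than something the abstract states.

```python
import math
import random


def se_channel_gate(pillar, w1, w2):
    """Squeeze-and-Excitation style channel gating for one pillar.

    pillar: list of points, each a list of C channel values.
    w1: R x C weight matrix (bottleneck), w2: C x R weight matrix (expansion).
    Returns the rescaled pillar and the per-channel gates.
    """
    num_points = len(pillar)
    num_channels = len(pillar[0])
    # Squeeze: average each channel over all points in the pillar.
    squeezed = [sum(p[c] for p in pillar) / num_points
                for c in range(num_channels)]
    # Excitation: bottleneck layer -> ReLU -> expansion layer -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every point's channels by the matching channel gate.
    scaled = [[v * g for v, g in zip(p, gates)] for p in pillar]
    return scaled, gates


# Hypothetical sizes: channels, bottleneck width, points in the pillar.
random.seed(0)
C, R, N = 4, 2, 3
pillar = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(N)]
# Random stand-ins for weights that would normally be learned.
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(R)]
w2 = [[random.uniform(-1, 1) for _ in range(R)] for _ in range(C)]

scaled, gates = se_channel_gate(pillar, w1, w2)
print(gates)  # one value in (0, 1) per channel
```

Each gate lies strictly between 0 and 1, so a channel can be attenuated but never amplified or sign-flipped. This is one way such a gate could suppress noisy feature channels, which is the role the abstract attributes to TriFA.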
Pages: 2000-2004
Page count: 5