Flow-Based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection

Cited by: 0
Authors
Yu, Haibao [1 ,2 ]
Tang, Yingjuan [2 ,3 ]
Xie, Enze [1 ]
Mao, Jilei [2 ]
Luo, Ping [1 ,4 ]
Nie, Zaiqing [2 ,5 ]
Affiliations
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Tsinghua Univ, Inst AI Ind Res AIR, Beijing, Peoples R China
[3] Beijing Inst Technol, Beijing, Peoples R China
[4] Shanghai AI Lab, Shanghai, Peoples R China
[5] AIR, Beijing, Peoples R China
Funding
National Key Research and Development Program of China
DOI
Not available
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Cooperatively utilizing both ego-vehicle and infrastructure sensor data can significantly enhance autonomous driving perception abilities. However, uncertain temporal asynchrony and limited communication capacity can lead to fusion misalignment and constrain the exploitation of infrastructure data. To address these issues in vehicle-infrastructure cooperative 3D (VIC3D) object detection, we propose the Feature Flow Net (FFNet), a novel cooperative detection framework. FFNet is a flow-based feature fusion framework that uses a feature flow prediction module to predict future features and compensate for asynchrony. Instead of transmitting feature maps extracted from still images, FFNet transmits feature flow, leveraging the temporal coherence of sequential infrastructure frames. Furthermore, we introduce a self-supervised training approach that enables FFNet to generate feature flow with prediction ability from raw infrastructure sequences. Experimental results on the DAIR-V2X dataset demonstrate that the proposed method outperforms existing cooperative detection methods while requiring only about 1/100 of the transmission cost of raw data, with a single model covering all latencies. The code is available at https://github.com/haibao-yu/FFNet-VIC3D.
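The compensation step described in the abstract can be illustrated with a short sketch: the infrastructure transmits a feature map together with its feature flow (a temporal derivative of the features), and the receiving vehicle extrapolates the features over the measured latency before fusion. The PyTorch sketch below uses hypothetical names (compensate_latency, infra_feat, infra_flow) and a simple first-order extrapolation; it is a minimal illustration of the idea under these assumptions, not the released FFNet implementation.

import torch

def compensate_latency(feature: torch.Tensor,
                       feature_flow: torch.Tensor,
                       latency: float) -> torch.Tensor:
    """First-order extrapolation of an infrastructure feature map.

    Sketch only: assumes the transmitted package carries the feature
    map F(t) and its temporal derivative dF/dt (the "feature flow"),
    so the receiver can approximate F(t + latency) as
    F(t) + latency * dF/dt.
    """
    return feature + latency * feature_flow

# Hypothetical usage on the vehicle side, before feature fusion:
infra_feat = torch.randn(1, 64, 128, 128)  # infrastructure BEV feature map F(t)
infra_flow = torch.randn(1, 64, 128, 128)  # feature flow dF/dt, same shape
latency = 0.2                              # measured asynchrony in seconds

aligned = compensate_latency(infra_feat, infra_flow, latency)
# `aligned` would then be transformed into the ego-vehicle frame and
# fused with the vehicle-side features for 3D detection.

Because the receiver applies the latency at fusion time, the same transmitted package can compensate any asynchrony, which is consistent with the abstract's claim that a single model covers all latencies.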
Pages: 11