Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection

Cited by: 40
Authors
Hong, Yu [1]
Dai, Hang [2]
Ding, Yong [1]
Affiliations
[1] Zhejiang Univ, Hangzhou, Peoples R China
[2] MBZUAI, Abu Dhabi, U Arab Emirates
Source
Keywords
POINT;
DOI
10.1007/978-3-031-20080-9_6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Leveraging LiDAR-based detectors or real LiDAR point data to guide monocular 3D detection has brought significant improvement, e.g., Pseudo-LiDAR methods. However, the existing methods usually apply non-end-to-end training strategies and insufficiently leverage the LiDAR information, where the rich potential of the LiDAR data has not been well exploited. In this paper, we propose the Cross-Modality Knowledge Distillation (CMKD) network for monocular 3D detection to efficiently and directly transfer the knowledge from LiDAR modality to image modality on both features and responses. Moreover, we further extend CMKD as a semi-supervised training framework by distilling knowledge from large-scale unlabeled data and significantly boost the performance. Until submission, CMKD ranks 1st among the monocular 3D detectors with publications on both KITTI test set and Waymo val set with significant performance gains compared to previous state-of-the-art methods.
Pages: 87-104
Page count: 18
Related papers
50 in total
  • [1] Selective Transfer Learning of Cross-Modality Distillation for Monocular 3D Object Detection
    Ding, Rui
    Yang, Meng
    Zheng, Nanning
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9925 - 9938
  • [2] Cross-Modality 3D Object Detection
    Zhu, Ming
    Ma, Chao
    Ji, Pan
    Yang, Xiaokang
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 3771 - 3780
  • [3] Cascaded Cross-Modality Fusion Network for 3D Object Detection
    Chen, Zhiyu
    Lin, Qiong
    Sun, Jing
    Feng, Yujian
    Liu, Shangdong
    Liu, Qiang
    Ji, Yimu
    Xu, He
    SENSORS, 2020, 20 (24) : 1 - 14
  • [4] UniDistill: A Universal Cross-Modality Knowledge Distillation Framework for 3D Object Detection in Bird's-Eye View
    Zhou, Shengchao
    Liu, Weizhou
    Hu, Chen
    Zhou, Shuchang
    Ma, Chao
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 5116 - 5125
  • [5] A Two-Phase Cross-Modality Fusion Network for Robust 3D Object Detection
    Jiao, Yujun
    Yin, Zhishuai
    SENSORS, 2020, 20 (21) : 1 - 14
  • [6] CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation
    Zhao, Lingjun
    Song, Jingyu
    Skinner, Katherine A.
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 15470 - 15480
  • [7] Monocular 3D Object Detection With Motion Feature Distillation
    Hu, Henan
    Li, Muyu
    Zhu, Ming
    Gao, Wen
    Liu, Peiyu
    Chan, Kwok-Leung
    IEEE ACCESS, 2023, 11 : 82933 - 82945
  • [8] Dynamic Knowledge Distillation with Cross-Modality Knowledge Transfer
    Wang, Guangzhi
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2974 - 2978
  • [9] Unconstrained Monocular 3D Human Pose Estimation by Action Detection and Cross-modality Regression Forest
    Yu, Tsz-Ho
    Kim, Tae-Kyun
    Cipolla, Roberto
    2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, : 3642 - 3649
  • [10] DCMNet: Discriminant and cross-modality network for RGB-D salient object detection
    Wang, Fasheng
    Wang, Ruimin
    Sun, Fuming
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 214