Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection

Cited by: 40
Authors
Hong, Yu [1]
Dai, Hang [2]
Ding, Yong [1]
Affiliations
[1] Zhejiang Univ, Hangzhou, Peoples R China
[2] MBZUAI, Abu Dhabi, U Arab Emirates
Source
Keywords
POINT
DOI
10.1007/978-3-031-20080-9_6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Using LiDAR-based detectors or real LiDAR point data to guide monocular 3D detection has brought significant improvements, e.g., Pseudo-LiDAR methods. However, existing methods usually rely on non-end-to-end training strategies and leave much of the LiDAR information unexploited. In this paper, we propose the Cross-Modality Knowledge Distillation (CMKD) network for monocular 3D detection, which efficiently and directly transfers knowledge from the LiDAR modality to the image modality on both features and responses. Moreover, we extend CMKD into a semi-supervised training framework by distilling knowledge from large-scale unlabeled data, which significantly boosts performance. At the time of submission, CMKD ranked 1st among published monocular 3D detectors on both the KITTI test set and the Waymo val set, with significant performance gains over previous state-of-the-art methods.
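
The abstract says knowledge is transferred "on both features and responses" but does not give the loss functions. As a rough illustration only, the sketch below combines a feature term and a response term in plain Python. The choice of mean squared error for features, binary cross-entropy against teacher soft labels for responses, the function names, and the weights are all assumptions made for this example, not the formulation used in the paper.

```python
import math


def feature_distillation_loss(student_feats, teacher_feats):
    """Mean squared error between flattened student and teacher features.

    Assumption: MSE is used here only as a common, simple choice.
    """
    if len(student_feats) != len(teacher_feats):
        raise ValueError("feature vectors must have the same length")
    squared = ((s - t) ** 2 for s, t in zip(student_feats, teacher_feats))
    return sum(squared) / len(student_feats)


def response_distillation_loss(student_probs, teacher_probs, eps=1e-7):
    """Binary cross-entropy of student scores against teacher soft labels.

    Assumption: the teacher's scores in [0, 1] act as soft targets.
    """
    if len(student_probs) != len(teacher_probs):
        raise ValueError("score vectors must have the same length")
    total = 0.0
    for s, t in zip(student_probs, teacher_probs):
        s = min(max(s, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= t * math.log(s) + (1.0 - t) * math.log(1.0 - s)
    return total / len(student_probs)


def distillation_loss(student_feats, teacher_feats,
                      student_probs, teacher_probs,
                      feature_weight=1.0, response_weight=1.0):
    """Weighted sum of the two terms; the weights are hypothetical."""
    feat = feature_distillation_loss(student_feats, teacher_feats)
    resp = response_distillation_loss(student_probs, teacher_probs)
    return feature_weight * feat + response_weight * resp


if __name__ == "__main__":
    # Toy values standing in for LiDAR-teacher and image-student outputs.
    loss = distillation_loss(
        student_feats=[0.2, 0.4, 0.1],
        teacher_feats=[0.3, 0.5, 0.0],
        student_probs=[0.6, 0.2],
        teacher_probs=[0.9, 0.1],
    )
    print(f"combined distillation loss: {loss:.4f}")
```

Both terms in this sketch take their targets from the teacher rather than from ground-truth boxes, so they can be evaluated on unlabeled samples. That is consistent with the semi-supervised extension the abstract describes, though the actual training setup may differ.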
Pages: 87-104
Page count: 18