A multilevel fusion network for 3D object detection

Cited by: 5
Authors
Xia, Chunlong [1]
Wei, Ping [1]
Wei, Wenwen [1]
Zheng, Nanning [1]
Affiliations
[1] Xi'an Jiaotong University, Institute of Artificial Intelligence and Robotics, Xi'an, People's Republic of China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
3D object detection; Multilevel fusion; Neural networks; Segmentation
DOI
10.1016/j.neucom.2021.01.025
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
3D object detection is an important yet challenging problem in a myriad of vision, robotics, and human-machine interaction applications. Given an RGB-D image, the task is to infer the class labels and the 3D bounding boxes of the objects in the image. While previous studies have made remarkable progress over the past decade, how to effectively exploit feature fusion with neural networks to boost 3D object detection performance remains an open problem. This paper proposes a multilevel fusion network (MFN) model to detect 3D objects in RGB-D images. The MFN model contains two streams of neural networks, which respectively extract the RGB and depth features with cascaded convolutional modules. To effectively exploit the information of 3D objects, a multilevel fusion mechanism is adopted to fuse the convolutional RGB and depth features at multiple levels. To train the network, we propose a new weighted loss function that encodes the difference of geometric attributes in 3D bounding box regression. Since the original depth data is full of noisy holes, we also develop an adaptive filtering algorithm to restore and correct the depth images. We test the proposed model on challenging RGB-D datasets. The experimental results on the datasets demonstrate the strength and advantage of the proposed model. © 2021 Elsevier B.V. All rights reserved.
Pages: 107-117
Page count: 11
Source: Neurocomputing, 2021, 437: 107-117