A multilevel fusion network for 3D object detection

Cited by: 5
Authors
Xia, Chunlong [1]
Wei, Ping [1]
Wei, Wenwen [1]
Zheng, Nanning [1]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
3D object detection; Multilevel fusion; Neural networks; SEGMENTATION;
DOI
10.1016/j.neucom.2021.01.025
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D object detection is an important yet challenging problem in a myriad of vision, robotics, and human-machine interaction applications. Given an RGB-D image, the task is to infer the class labels and the 3D bounding boxes of the objects in the image. While previous studies have made remarkable progress over the past decade, how to effectively exploit feature fusion with neural networks to boost 3D object detection performance remains an open problem. This paper proposes a multilevel fusion network (MFN) model to detect 3D objects in RGB-D images. The MFN model contains two streams of neural networks, which respectively extract the RGB and depth features with cascaded convolutional modules. To effectively exploit the information of 3D objects, a multilevel fusion mechanism is adopted to fuse the convolutional RGB and depth features at multiple levels. To train the network, we propose a new weighted loss function by encoding the difference of geometric attributes on 3D bounding box regression. Since the original depth data is full of noisy holes, we also develop an adaptive filtering algorithm to restore and correct the depth images. We test the proposed model on challenging RGB-D datasets. The experimental results on the datasets demonstrate the strength and advantage of the proposed model. © 2021 Elsevier B.V. All rights reserved.
Pages: 107-117
Page count: 11
Related papers
50 in total
  • [31] Adaptive and azimuth-aware fusion network of multimodal local features for 3D object detection
    Tian, Yonglin
    Wang, Kunfeng
    Wang, Yuang
    Tian, Yulin
    Wang, Zilei
    Wang, Fei-Yue
    NEUROCOMPUTING, 2020, 411 : 32 - 44
  • [32] A Two-Phase Cross-Modality Fusion Network for Robust 3D Object Detection
    Jiao, Yujun
    Yin, Zhishuai
    SENSORS, 2020, 20 (21) : 1 - 14
  • [33] A Geometric Convolutional Neural Network for 3D Object Detection
    Lu, Yawen
    Guo, Qianyu
    Lu, Guoyu
    2019 7TH IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (IEEE GLOBALSIP), 2019,
  • [34] A New Monocular 3D Object Detection with Neural Network
    Hong, Weijie
    Liu, Yiguang
    Zheng, Yunan
    Wang, Ying
    Shi, Xuelei
    PATTERN RECOGNITION AND COMPUTER VISION (PRCV 2018), PT IV, 2018, 11259 : 174 - 185
  • [35] VENet: Voting Enhancement Network for 3D Object Detection
    Xie, Qian
    Lai, Yu-Kun
    Wu, Jing
    Wang, Zhoutao
    Lu, Dening
    Wei, Mingqiang
    Wang, Jun
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 3692 - 3701
  • [36] Multi-feature Fusion VoteNet for 3D Object Detection
    Wang, Zhoutao
    Xie, Qian
    Wei, Mingqiang
    Long, Kun
    Wang, Jun
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (01)
  • [37] Sensor Fusion for Joint 3D Object Detection and Semantic Segmentation
    Meyer, Gregory P.
    Charland, Jake
    Hegde, Darshan
    Laddha, Ankit
    Vallespi-Gonzalez, Carlos
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 1230 - 1237
  • [38] A LiDAR-Camera Fusion 3D Object Detection Algorithm
    Liu, Leyuan
    He, Jian
    Ren, Keyan
    Xiao, Zhonghua
    Hou, Yibin
    INFORMATION, 2022, 13 (04)
  • [39] FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection
    Xu, Shaoqing
    Zhou, Dingfu
    Fang, Jin
    Yin, Junbo
    Bin, Zhou
    Zhang, Liangjun
    2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 3047 - 3054
  • [40] Frame Fusion with Vehicle Motion Prediction for 3D Object Detection
    Li, Xirui
    Wang, Feng
    Wang, Naiyan
    Ma, Chao
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 4252 - 4258