FCNet: Stereo 3D Object Detection with Feature Correlation Networks

Cited by: 3
Authors
Wu, Yingyu [1]
Liu, Ziyan [1, 2, 3]
Chen, Yunlei [1]
Zheng, Xuhui [1]
Zhang, Qian [1]
Yang, Mo [1]
Tang, Guangming [3]
Affiliations
[1] Guizhou Univ, Coll Big Data & Informat Engn, Guiyang 550025, Peoples R China
[2] Guizhou Univ, State Key Lab Publ Big Data, Guiyang 550025, Peoples R China
[3] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
Keywords
3D object detection; deep learning; stereo matching; multi-scale cost-volume; channel similarity; parallel convolutional attention
DOI
10.3390/e24081121
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract
Deep-learning techniques have significantly improved object detection performance, especially for 3D detection from binocular images. In stereo 3D object detection, supervising depth by reconstructing dense 3D depth from LiDAR point clouds raises the computational cost and lowers the inference speed. After exploring the intrinsic relationship between the implicit depth information and the semantic texture features of binocular images, we propose FCNet, an efficient and accurate 3D object detection algorithm for stereo images. First, we generate multi-scale feature maps from the input stereo images and use the normalized dot product to construct a multi-scale cost-volume that contains implicit depth information. Second, a variant attention model enhances the global and local description of the cost-volume, and depth regression is supervised by a depth loss over sparse regions. Third, to balance channel-information preservation in the re-fused left-right feature maps against the computational burden, a reweighting strategy enhances the feature correlation when merging the last-layer features of the binocular images. Extensive experiments on the challenging KITTI benchmark demonstrate that the proposed algorithm achieves better performance, including lower computational cost and higher inference speed, in 3D object detection.
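The cost-volume step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `(C, H, W)` feature layout, the disparity range, and the use of NumPy are all assumptions made for illustration. Each feature vector is L2-normalized along the channel axis, so the dot product between a left pixel and a right pixel shifted by a candidate disparity becomes a cosine similarity.

```python
import numpy as np

def correlation_cost_volume(left, right, max_disp, eps=1e-8):
    """Normalized dot-product cost volume (illustrative, hypothetical helper).

    left, right: feature maps of shape (C, H, W).
    Returns an array of shape (max_disp, H, W) of cosine similarities.
    """
    # L2-normalize every pixel's feature vector along the channel axis.
    left_n = left / (np.linalg.norm(left, axis=0, keepdims=True) + eps)
    right_n = right / (np.linalg.norm(right, axis=0, keepdims=True) + eps)

    _, h, w = left.shape
    volume = np.zeros((max_disp, h, w), dtype=float)
    for d in range(max_disp):
        if d == 0:
            volume[d] = (left_n * right_n).sum(axis=0)
        else:
            # Left pixel at column x is compared with right pixel at x - d;
            # columns without a valid match keep a similarity of 0.
            volume[d, :, d:] = (left_n[:, :, d:] * right_n[:, :, :-d]).sum(axis=0)
    return volume

def multi_scale_cost_volumes(left_feats, right_feats, max_disps):
    """Apply the same correlation at each scale (one volume per feature level)."""
    return [
        correlation_cost_volume(l, r, d)
        for l, r, d in zip(left_feats, right_feats, max_disps)
    ]
```

With this kind of correlation, each disparity candidate contributes a single similarity channel instead of a concatenated feature pair, which is one common way to keep a cost volume small. Whether FCNet uses exactly this layout is not stated in the abstract.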
Pages: 17