CDAF3D: Cross-Dimensional Attention Fusion for Indoor 3D Object Detection

Cited by: 0
Authors
Wang, Shilin [1 ]
Huang, Hai [1 ]
Zhu, Yueyan [1 ]
Tang, Zhenqi [1 ]
Affiliations
[1] School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
Funding
National Key R&D Program of China
Keywords
Indoor 3D Object Detection; Fusion Features; Point Cloud
DOI
10.1007/978-981-97-8493-6_12
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
3D object detection is a crucial task in computer vision and autonomous systems, with wide application in robotics, autonomous driving, and augmented reality. With the advancement of input devices, researchers have proposed using multimodal information to improve detection accuracy. However, effectively integrating 2D and 3D features to exploit their complementary nature for detection remains a challenge. In this paper, we note that the complementarity of geometric and visual texture information can effectively strengthen feature fusion, which plays a key role in detection. To this end, we propose the Cross-Dimensional Attention Fusion-based indoor 3D object detection method (CDAF3D). The method dynamically fuses geometric information with the corresponding 2D image texture details through a cross-dimensional attention mechanism, enabling the model to capture and integrate spatial and textural information effectively. In addition, because intersecting objects with different labels are physically unrealistic in 3D scenes, we further propose the Preventive 3D Intersect Loss (P3DIL), which improves detection accuracy by penalizing intersections between objects of different labels. We evaluate the proposed CDAF3D on the SUN RGB-D and ScanNet v2 datasets. It achieves 78.2 mAP@0.25 and 66.5 mAP@0.50 on ScanNet v2 and 70.3 mAP@0.25 and 54.1 mAP@0.50 on SUN RGB-D, outperforming all multi-sensor-based methods at 3D IoU thresholds of 0.25 and 0.5.
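The abstract describes two components: a cross-dimensional attention block in which 3D geometric features are fused with 2D image texture features, and a Preventive 3D Intersect Loss (P3DIL) that penalizes overlap between detections with different labels. The paper's implementation details are not given here, so the following is only a minimal PyTorch sketch of how such components could look; the module name CrossDimAttentionFusion, the helper intersect_penalty, all dimensions, and the exact penalty form are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn


class CrossDimAttentionFusion(nn.Module):
    """Fuse per-point 3D geometric features (queries) with 2D image
    texture features (keys/values) via cross-attention."""

    def __init__(self, dim_3d: int, dim_2d: int, dim: int = 256, heads: int = 4):
        super().__init__()
        self.proj_3d = nn.Linear(dim_3d, dim)  # project geometric features
        self.proj_2d = nn.Linear(dim_2d, dim)  # project texture features
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats_3d, feats_2d):
        # feats_3d: (B, N_points, dim_3d); feats_2d: (B, N_pixels, dim_2d)
        q = self.proj_3d(feats_3d)
        kv = self.proj_2d(feats_2d)
        fused, _ = self.attn(q, kv, kv)  # 3D queries attend to 2D texture
        return self.norm(q + fused)      # residual fusion of both dimensions


def intersect_penalty(boxes, labels):
    """Toy stand-in for the P3DIL idea: penalize the overlapping volume
    between axis-aligned boxes (M, 6) = (xmin, ymin, zmin, xmax, ymax, zmax)
    whose predicted labels differ."""
    mins, maxs = boxes[:, :3], boxes[:, 3:]
    lo = torch.maximum(mins[:, None, :], mins[None, :, :])  # (M, M, 3)
    hi = torch.minimum(maxs[:, None, :], maxs[None, :, :])
    overlap = (hi - lo).clamp(min=0).prod(dim=-1)            # pairwise volumes
    cross_label = labels[:, None] != labels[None, :]         # different labels only
    off_diag = ~torch.eye(len(boxes), dtype=torch.bool, device=boxes.device)
    return (overlap * (cross_label & off_diag)).sum() / max(len(boxes), 1)


# Usage sketch with random tensors standing in for backbone outputs.
if __name__ == "__main__":
    point_feats = torch.randn(2, 1024, 128)  # hypothetical 3D backbone features
    image_feats = torch.randn(2, 400, 256)   # hypothetical 2D backbone features
    fused = CrossDimAttentionFusion(128, 256)(point_feats, image_feats)
    print(fused.shape)  # torch.Size([2, 1024, 256])
```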
Pages: 165-177
Number of pages: 13
Related Papers
50 records in total
  • [1] DyFusion: Cross-Attention 3D Object Detection with Dynamic Fusion
    Bi, Jiangfeng
    Wei, Haiyue
    Zhang, Guoxin
    Yang, Kuihe
    Song, Ziying
    IEEE LATIN AMERICA TRANSACTIONS, 2024, 22 (02) : 106 - 112
  • [2] Image attention transformer network for indoor 3D object detection
    Ren, Keyan
    Yan, Tong
    Hu, Zhaoxin
    Han, Honggui
    Zhang, Yunlu
    SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2024, 67 (07) : 2176 - 2190
  • [3] BMFN3D: Bidirectional multilayer fusion network for indoor 3D object detection
    Cheng, Jun
    Zhang, Sheng
    ELECTRONICS LETTERS, 2022, 58 (18) : 696 - 698
  • [4] FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection
    Xu, Shaoqing
    Zhou, Dingfu
    Fang, Jin
    Yin, Junbo
    Bin, Zhou
    Zhang, Liangjun
    2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 3047 - 3054
  • [5] ARM3D: Attention-based relation module for indoor 3D object detection
    Lan, Yuqing
    Duan, Yao
    Liu, Chenyi
    Zhu, Chenyang
    Xiong, Yueshan
    Huang, Hui
    Xu, Kai
    COMPUTATIONAL VISUAL MEDIA, 2022, 8 (03) : 395 - 414
  • [6] AEPF: Attention-Enabled Point Fusion for 3D Object Detection
    Sharma, Sachin
    Meyer, Richard T.
    Asher, Zachary D.
    SENSORS, 2024, 24 (17)