MCAFNet: Multiscale cross-modality adaptive fusion network for multispectral object detection

Cited by: 0
Authors
Zheng, Shangpo [1]
Liu, Junfeng [1]
Jun, Zeng [2]
Affiliations
[1] South China Univ Technol Sci & Engn, Sch Automat Sci & Engn, Guangzhou 510641, Peoples R China
[2] South China Univ Technol, Sch Elect Power Engn, Guangzhou 510641, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Attention mechanism; cross-modality; multimodal adaptive feature fusion; multispectral object detection; transformer; pedestrian detection
DOI
10.1016/j.dsp.2025.104996
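Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract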
Multispectral object detection integrates data from multiple spectral modalities, for example by combining thermal images with RGB visible-light images, to improve the precision and robustness of object detection under diverse environmental conditions. Although this approach has improved detection capability, significant challenges remain in fully exploiting the detail specific to each single modality and in accurately capturing the features shared across modalities. To address these challenges, we propose a Multiscale Cross-modality Adaptive Fusion Network (MCAFNet). The network incorporates a Cross-modality Interactive Transformer (CMIT) module, a Multimodal Adaptive Weighted Fusion (MAWF) module, and a 3D-Integrated Attention Feature Enhancement (3D-IAFE) module. These components work together to extract both the complementary features between modalities and the specific detailed features within each modality, thereby enhancing the accuracy and robustness of multimodal object detection. Extensive experiments and in-depth ablation studies confirm the effectiveness of the proposed method, which achieves state-of-the-art detection performance on multiple public datasets.
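The abstract names an adaptive weighted fusion step but does not describe its internals, so the following is only a generic illustration of the idea, not the authors' MAWF module. It assumes each modality yields a feature map of the same shape (channels x height x width, as nested lists), and it derives per-channel weights from a softmax over the globally averaged responses. The function names `channel_mean` and `adaptive_weighted_fusion` are hypothetical.

```python
import math


def channel_mean(channel):
    """Global average pooling over one 2D channel (list of rows)."""
    values = [v for row in channel for v in row]
    return sum(values) / len(values)


def adaptive_weighted_fusion(rgb_feat, thermal_feat):
    """Fuse two feature maps channel by channel.

    Assumption (not from the paper): the weight of each modality is a
    softmax over the two channel-wise mean activations, so the modality
    with the stronger average response contributes more.
    """
    fused, weights = [], []
    for rgb_ch, th_ch in zip(rgb_feat, thermal_feat):
        a, b = channel_mean(rgb_ch), channel_mean(th_ch)
        m = max(a, b)  # subtract the max for numerical stability
        ea, eb = math.exp(a - m), math.exp(b - m)
        w_rgb = ea / (ea + eb)
        w_th = 1.0 - w_rgb
        weights.append((w_rgb, w_th))
        # Weighted sum of the two modalities at every spatial position.
        fused.append([
            [w_rgb * r + w_th * t for r, t in zip(rgb_row, th_row)]
            for rgb_row, th_row in zip(rgb_ch, th_ch)
        ])
    return fused, weights


if __name__ == "__main__":
    rgb = [[[2.0, 2.0], [2.0, 2.0]]]      # one channel, strong response
    thermal = [[[0.0, 0.0], [0.0, 0.0]]]  # one channel, weak response
    fused, weights = adaptive_weighted_fusion(rgb, thermal)
    print(weights[0], fused[0])
```

In a trained network the weights would typically come from learned layers rather than a fixed softmax over mean activations; the fixed rule here only keeps the sketch self-contained.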
Pages: 12