MCAFNet: Multiscale cross-modality adaptive fusion network for multispectral object detection

Cited by: 0
Authors
Zheng, Shangpo [1]
Liu, Junfeng [1]
Jun, Zeng [2]
Affiliations
[1] South China Univ Technol Sci & Engn, Sch Automat Sci & Engn, Guangzhou 510641, Peoples R China
[2] South China Univ Technol, Sch Elect Power Engn, Guangzhou 510641, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Attention mechanism; cross-modality; multimodal adaptive feature fusion; multispectral object detection; transformer; PEDESTRIAN DETECTION;
DOI
10.1016/j.dsp.2025.104996
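Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract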
Multispectral object detection techniques integrate data from various spectral modalities, such as combining thermal images with RGB visible light images, to enhance the precision and robustness of object detection under diverse environmental conditions. Although this approach has improved detection capabilities, significant challenges remain in fully leveraging the specific detail information of each single modality and in accurately capturing the feature information shared across modalities. To address these challenges, we propose a Multiscale Cross-modality Adaptive Fusion Network (MCAFNet). This network incorporates a Cross-modality interactive Transformer (CMIT) module, a Multimodal Adaptive Weighted Fusion (MAWF) module, and a 3D-Integrated Attention Feature Enhancement (3D-IAFE) module. These components work together to comprehensively extract complementary features between modalities and specific detailed features within each modality, thereby enhancing the accuracy and robustness of multimodal object detection. Extensive experimental validation and in-depth ablation studies confirm the effectiveness of the proposed method, which achieves state-of-the-art detection performance on multiple public datasets.
Pages: 12
Related papers
50 in total
  • [31] Attention-based Cross-modality Interaction for Multispectral Pedestrian Detection
    Liu, Tianshan
    Zhao, Rui
    Lam, Kin-Man
    INTERNATIONAL WORKSHOP ON ADVANCED IMAGING TECHNOLOGY (IWAIT) 2021, 2021, 11766
  • [32] Background-Aware Cross-Attention Multiscale Fusion for Multispectral Object Detection
    Guo, Runze
    Guo, Xiaojun
    Sun, Xiaoyong
    Zhou, Peida
    Sun, Bei
    Su, Shaojing
    REMOTE SENSING, 2024, 16 (21)
  • [33] Multidimensional Fusion Network for Multispectral Object Detection
    Yang, Fan
    Liang, Binbin
    Li, Wei
    Zhang, Jianwei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (01) : 547 - 560
  • [34] Deep adaptive fusion with cross-modality feature transition and modality quaternion learning for medical image fusion
    Srivastava, Somya
    Bhatia, Shaveta
    Agrawal, Arun Prakash
    Jayswal, Anant Kumar
    Godara, Jyoti
    Dubey, Gaurav
    EVOLVING SYSTEMS, 2025, 16 (01)
  • [35] Double cross-modality progressively guided network for RGB-D salient object detection
    Yao, Cuili
    Feng, Lin
    Kong, Yuqiu
    Li, Shengming
    Li, Hang
    IMAGE AND VISION COMPUTING, 2022, 117
  • [36] Attention-Based Cross-Modality Feature Complementation for Multispectral Pedestrian Detection
    Jiang, Qunyan
    Dai, Juying
    Rui, Ting
    Shao, Faming
    Wang, Jinkang
    Lu, Guanlin
    IEEE ACCESS, 2022, 10 : 53797 - 53809
  • [37] Multi-Task Cross-Modality Attention-Fusion for 2D Object Detection
    Sun, Huawei
    Feng, Hao
    Stettinger, Georg
    Servadei, Lorenzo
    Wille, Robert
    2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC, 2023, : 3619 - 3626
  • [39] Cross-Modality Data Augmentation for Aerial Object Detection with Representation Learning
    Wei, Chiheng
    Bai, Lianfa
    Chen, Xiaoyu
    Han, Jing
    REMOTE SENSING, 2024, 16 (24)
  • [40] MCAFNet: multiscale cross-layer attention fusion network for honeycomb lung lesion segmentation
    Li, Gang
    Xie, Jinjie
    Zhang, Ling
    Sun, Mengxia
    Li, Zhichao
    Sun, Yuanjin
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2024, 62 (04) : 1121 - 1137