Spatial-frequency attention-based optical and scene flow with cross-modal knowledge distillation

Cited by: 1
Authors
Zhou, Youjie [1]
Jiao, Runyu [1]
Tao, Zhonghan [1]
Liang, Xichang [1]
Wan, Yi [1]
Affiliations
[1] Shandong Univ, Sch Mech Engn, Jinan, Peoples R China
Source
Keywords
Optical and scene flow; Multimodal fusion; Spatial-frequency domain transform; Attention; Knowledge distillation
DOI
10.1007/s00371-024-03654-2
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
This paper studies the problem of multimodal fusion for optical and scene flow estimation from RGB and depth images, or point clouds. Previous methods fuse multimodal information using "early-fusion" or "late-fusion" strategies, in which an attention mechanism is employed to estimate optical and scene flow when RGB information is unreliable. Such attentive approaches either suffer from substantial computational and time complexity or lose the inherent characteristics of features due to downsampling. To address this issue, we propose a novel multimodal fusion approach named SFRAFT, which uses the Fourier transform to build spatial-frequency domain transformed self-attention and cross-attention. With this attention mechanism, our approach can extract informative features more efficiently and effectively. We further enhance information exchange between the two modalities by incorporating multi-scale knowledge distillation. Experimental results on FlyingThings3D and KITTI show that SFRAFT achieves the best performance with low computational and time complexity. We also demonstrate the strong flow-estimation ability of our approach on our real-world dataset. We release the code and datasets at https://doi.org/10.5281/zenodo.12697968.
Pages: 4183-4198
Page count: 16