A transformer-CNN parallel network for image guided depth completion

Cited by: 5
Authors
Li, Tao [1]
Dong, Xiucheng [1]
Lin, Jie [2]
Peng, Yonghong [3]
Affiliations
[1] Xihua Univ, Sch Elect Engn & Elect Informat, Chengdu 610039, Peoples R China
[2] Xihua Univ, Sch Aeronaut & Astronaut, Chengdu 610039, Peoples R China
[3] Manchester Metropolitan Univ, Dept Comp & Math, Manchester M1 5GD, England
Funding
National Natural Science Foundation of China
Keywords
Depth completion; Convolutional neural network; Transformer; Token correlation; Conditional random field;
DOI
10.1016/j.patcog.2024.110305
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Image guided depth completion aims to predict a dense depth map from sparse depth measurements and the corresponding single color image. However, most state-of-the-art methods rely on either a convolutional neural network (CNN) or a transformer alone. In this paper, we propose a transformer-CNN parallel network (TCPNet) that combines the strength of the CNN in local detail recovery with the strength of the transformer in long-range semantic modeling. Specifically, our CNN branch adopts dense connections to strengthen feature propagation. A common transformer computes self-attention over all the tokens in the window, whether or not they are relevant, which inevitably introduces interference and noise. To improve self-attention accuracy, we propose a correlation-based transformer that allows only nearest-neighbor tokens to participate in the self-attention computation. We also design a multi-scale conditional random field (CRF) module that implements multi-scale high-dimensional filtering for depth refinement. Comprehensive experimental results on KITTI and NYUv2 demonstrate that our method outperforms state-of-the-art methods.
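The idea of letting only the most correlated tokens take part in self-attention can be illustrated with a minimal sketch. This is not the TCPNet implementation: the abstract does not specify how "nearest neighbor" tokens are chosen, so the sketch assumes they are the k tokens with the highest scaled dot-product score for each query. The function name `topk_attention` and the parameter `k` are hypothetical, and learned projections, multiple heads, and windowing are omitted.

```python
import math


def topk_attention(queries, keys, values, k):
    """Self-attention in which each query attends only to its k best-scoring keys.

    Assumption: "nearest neighbor" is read as "highest scaled dot-product
    score"; the paper may use a different correlation measure.
    """
    dim = len(queries[0])
    k = max(1, min(k, len(keys)))
    outputs, all_weights = [], []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(dim)
            for key in keys
        ]
        # Keep only the indices of the k highest scores.
        keep = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
        # Softmax restricted to the kept tokens; all others get weight 0.
        top = max(scores[j] for j in keep)
        exps = {j: math.exp(scores[j] - top) for j in keep}
        total = sum(exps.values())
        weights = [exps.get(j, 0.0) / total for j in range(len(keys))]
        # Weighted sum of the value vectors.
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights
```

With `k` equal to the number of tokens, the function reduces to ordinary softmax attention over the whole window, which is the baseline behavior the abstract contrasts against.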
Pages: 11