SA-MVSNet: Self-attention-based multi-view stereo network for 3D reconstruction of images with weak texture

Cited by: 3
Authors
Yang, Ronghao [1]
Miao, Wang [1]
Zhang, Zhenxin [2, 3]
Liu, Zhenlong [1]
Li, Mubai [2, 3]
Lin, Bin [1]
Affiliations
[1] Chengdu Univ Technol, Coll Earth Sci, Chengdu 610059, Sichuan, Peoples R China
[2] Capital Normal Univ, Key Lab 3D Informat Acquisit & Applicat, MOE, Beijing 100048, Peoples R China
[3] Capital Normal Univ, Coll Resource Environm & Tourism, Beijing 100048, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Multi-view stereo; Depth estimation; Self-attention; Transformer; Weak texture; Adaptive propagation;
DOI
10.1016/j.engappai.2023.107800
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Multi-view stereo (MVS) reconstruction is a key task in image-based 3D reconstruction, and deep learning-based methods can achieve better results than traditional algorithms. However, most current deep learning-based MVS methods use convolutional neural networks (CNNs) to extract image features, which cannot aggregate long-range context information or capture robust global information. In addition, when depth maps are fused into point clouds, the confidence filters remove low-confidence depth values in weak texture areas. These problems lead to low completeness in the 3D reconstruction of weak texture and texture-less areas. To address them, this paper proposes SA-MVSNet, which builds on PatchmatchNet and adds a self-attention mechanism. First, we design a coarse-to-fine network framework for depth map estimation. In the feature extraction network, a pyramid-structured module based on the Swin Transformer Block replaces the original Feature Pyramid Network (FPN), and a global self-attention mechanism enhances the self-correlation between weak texture areas. Second, we propose a self-attention-based adaptive propagation module (SA-AP), which applies a self-attention calculation within the depth value propagation window to obtain the relative weights between the current pixel and the other pixels in the window, and then adaptively samples the depth values of neighbors on the same surface for propagation. Experiments show that SA-MVSNet significantly improves the completeness of 3D reconstruction for images with weak texture on the DTU (provided by the Technical University of Denmark), BlendedMVS, and Tanks and Temples datasets. Our code is available at https://github.com/miaowang525/SA-MVSNet.
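The adaptive propagation idea described in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' SA-AP implementation: it assumes plain scaled dot-product attention with no learned query/key projections, a flattened propagation window, and a simple top-k selection of neighbors. The function names `attention_weights` and `sample_depth_candidates` and all the numbers are hypothetical.

```python
import numpy as np

def attention_weights(center_feat, window_feats):
    """Scaled dot-product attention of one pixel against its window.

    Assumption: features are used directly as queries and keys
    (no learned projections), unlike a trained network.
    """
    d = center_feat.shape[0]
    scores = window_feats @ center_feat / np.sqrt(d)
    scores -= scores.max()  # subtract the max for numerical stability
    w = np.exp(scores)
    return w / w.sum()      # softmax over the window

def sample_depth_candidates(center_feat, window_feats, window_depths, k=2):
    """Return the depths of the k neighbors with the highest attention weight."""
    w = attention_weights(center_feat, window_feats)
    top = np.argsort(w)[::-1][:k]  # indices of the k largest weights
    return window_depths[top], w

# Toy window of four neighbors with 2-D features (hypothetical values).
# The first two neighbors resemble the center pixel, so they are treated
# as lying on the same surface and their depths are propagated.
center = np.array([1.0, 0.0])
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
depths = np.array([2.0, 2.1, 5.0, 7.0])

candidates, weights = sample_depth_candidates(center, feats, depths, k=2)
print(candidates, weights)
```

In this sketch the neighbors whose features are most similar to the center pixel receive the largest weights, so their depth values become the propagation candidates, while dissimilar neighbors (likely on another surface) are skipped.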
Pages: 15